Kubernetes has become one of the most widely used tools in distributed system infrastructure, but powerful tools can rack up significant expenses without proper configuration or management. Seven members of our Roundtable offered advice this month on the best ways to control those costs.
Featuring:
- Justin Warren, PivotNine
- Liz Fong-Jones, Honeycomb
- Jyoti Bansal, Harness
- Thom McCann, Independent
- Dave McJannet, HashiCorp
- Andreas Grabner, Dynatrace
- Ari Weil, Akamai
Justin Warren
Principal Analyst, PivotNine
Let those who are closest to the platform decide how budget should be spent. If the goals and objectives are clear, they are best placed to figure out how to cost-effectively achieve them. Give developers and engineers the terrifying responsibility of being in charge of their own spending. Let them enjoy both the upsides and downsides of being adults.
Support them with expert advice and assistance on request, but resist the temptation to impose detailed bureaucratic controls. High-level spend controls let different teams adapt to their own circumstances while staying inside broad guidelines about what sensible spending looks like.
Concentrate your efforts on clearly articulating broad goals and objectives: both what to do and what _not_ to do. It’s an iterative process of improvement, which requires having a clear idea of what ‘good’ looks like and being able to explain it to other people.
Liz Fong-Jones
Managing costs in Kubernetes while maximizing its benefits can be challenging, but there are effective strategies to achieve this balance. Using the most sustainable [compute] instances, such as Graviton, Cobalt, or Axion, can significantly reduce both your costs and your carbon footprint. These options can be a great choice for companies that have Kubernetes as a part of their stack, as they are efficient and lower-priced without compromising performance.
Another essential practice for managing Kubernetes costs is regularly stress testing each workload to find the highest utilization it can sustain without degrading performance. This helps teams identify the optimal resource allocation, so they avoid over-provisioning and unnecessary costs while still meeting performance requirements.
Continuously validating bin-packing and cluster autoscaling is also imperative for cost savings: automatically adjusting nodes based on demand minimizes wasted resources and reduces overall expenditure. By integrating these approaches, you can effectively manage Kubernetes costs while optimizing performance and sustainability.
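As an illustration of what validating bin-packing can look like, the sketch below compares the number of nodes a cluster runs with the number a simple first-fit-decreasing heuristic would need for the same CPU requests. The function names, the millicore figures, and the CPU-only view are assumptions made for this example; a real scheduler also weighs memory, affinity rules, and other constraints.

```python
def nodes_needed(pod_requests_m, node_capacity_m):
    """Count nodes used by first-fit-decreasing packing of CPU requests (millicores)."""
    free = []  # remaining capacity on each node opened so far
    for request in sorted(pod_requests_m, reverse=True):
        if request > node_capacity_m:
            raise ValueError(f"request {request}m exceeds node capacity")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request
                break
        else:
            # No open node has room, so open a new one.
            free.append(node_capacity_m - request)
    return len(free)


def packing_efficiency(pod_requests_m, node_capacity_m, node_count):
    """Share of provisioned CPU that pods actually request."""
    return sum(pod_requests_m) / (node_capacity_m * node_count)


# Hypothetical cluster: six pods spread over three 4-vCPU nodes.
requests_m = [1500, 500, 2000, 1000, 2500, 500]
print(nodes_needed(requests_m, 4000))           # node count the heuristic finds
print(packing_efficiency(requests_m, 4000, 3))  # efficiency at the current size
```

A persistent gap between the heuristic's node count and the actual one suggests the autoscaler or scheduling constraints deserve a closer look.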
Jyoti Bansal
Effectively managing Kubernetes costs requires a multi-faceted approach centered on visibility, optimization, and cost governance. First, achieving granular visibility into Kubernetes expenses, including accurate cost attribution across organizational hierarchies like teams and projects, is essential. This transparency highlights high-expenditure areas and informs budget allocations that align with business priorities.
Anomaly detection plays a crucial role by enabling teams to proactively address unexpected cost spikes. Coupled with this, right-sizing workloads and nodes helps prevent over-provisioning, ensuring resources are allocated based on actual usage.
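One simple way to picture cost anomaly detection is a trailing-window check: flag any day whose spend sits well above the recent average. The sketch below is a minimal version under stated assumptions; the seven-day window, the three-standard-deviation threshold, and the sample figures are all hypothetical, and commercial tools typically use more robust models.

```python
from statistics import mean, pstdev


def cost_spikes(daily_costs, window=7, threshold=3.0):
    """Return indexes of days whose cost exceeds the trailing mean by more
    than `threshold` standard deviations of the trailing window."""
    spikes = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        limit = mean(history) + threshold * pstdev(history)
        if daily_costs[i] > limit:
            spikes.append(i)
    return spikes


# Hypothetical daily spend in USD; the eighth day (index 7) jumps sharply.
daily_costs = [100, 102, 98, 101, 99, 100, 103, 180, 101]
print(cost_spikes(daily_costs))
```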
For added efficiency, dynamically detecting and auto-remediating idle workloads, particularly in non-production clusters, prevents wasted expenses from unused resources. Investing in long-term commitments, such as AWS or Azure Reserved Instances, Savings Plans, or GCP Committed Use Discounts for node capacity, and leveraging spot nodes for spot-ready workloads, can also drive substantial savings, making Kubernetes more cost-effective and scalable.
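The trade-off behind long-term commitments can be worked out with simple arithmetic. If a commitment is billed for every hour at a discount `d`, while on-demand capacity is billed only for the hours used, the commitment is cheaper once expected utilization exceeds `1 - d`. The sketch below assumes that simplified billing model; the 40% discount is a made-up figure, not a quoted price from any provider.

```python
def break_even_utilization(discount_pct):
    """Fraction of hours a node must run for a commitment to beat on-demand.

    Assumes the commitment is paid for every hour at (100 - discount_pct)% of
    the on-demand rate, and on-demand is paid only for hours actually used.
    """
    if not 0 <= discount_pct < 100:
        raise ValueError("discount_pct must be in [0, 100)")
    return (100 - discount_pct) / 100


def cheaper_option(expected_utilization, discount_pct):
    """Pick the cheaper purchasing model for a given expected utilization."""
    if expected_utilization > break_even_utilization(discount_pct):
        return "commit"
    return "on-demand"


# Hypothetical 40% commitment discount.
print(break_even_utilization(40))
print(cheaper_option(0.8, 40))
print(cheaper_option(0.5, 40))
```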
Together, these strategies enable organizations to harness the full power of Kubernetes while maintaining tight control over associated costs.
Thom McCann
Technology Platform, Ecosystem and Cost Expert, Independent
Kubernetes has the potential to drive a sea change in cost improvement in large organizations, but it doesn't happen by default. Intentional choices must be made to shape organizational culture to be cost-focused, ecosystem-driven, and coordinated around technology decisions and deployments.
Avoid "cost bombs" and cluster sprawl by focusing on three areas:
- Workgroup Model vs Ecosystem Model: Shift from independent workgroup clusters to an "Ecosystem Model" that consolidates clusters, reducing their number from hundreds to tens. This approach centralizes management, shares infrastructure, and significantly lowers costs.
- Expertise: Invest in hiring or developing in-house expertise to shift to an ecosystem-focused model. Avoid vendors whose incentives drive costs up, and instead prioritize those aligned with cost optimization.
- Cost Visibility: Gain visibility at the unit cost level for CPU, memory, and other resources to identify and reduce waste continuously.
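To make unit-cost visibility concrete, the sketch below splits a node's hourly price into a per-vCPU and a per-GiB cost, then prices a workload from its requests. The even CPU/memory split, the node price, and the function names are assumptions for illustration; organizations choose their own allocation ratios.

```python
def unit_costs(node_hourly_usd, vcpus, memory_gib, cpu_share=0.5):
    """Split a node's hourly price into per-vCPU and per-GiB hourly costs.

    `cpu_share` is the assumed fraction of the node price attributed to CPU.
    """
    cpu_unit = node_hourly_usd * cpu_share / vcpus
    memory_unit = node_hourly_usd * (1 - cpu_share) / memory_gib
    return cpu_unit, memory_unit


def workload_hourly_cost(cpu_request, memory_request_gib, cpu_unit, memory_unit):
    """Hourly cost attributed to a workload from its resource requests."""
    return cpu_request * cpu_unit + memory_request_gib * memory_unit


# Hypothetical node: $0.40 per hour, 4 vCPUs, 16 GiB of memory.
cpu_unit, memory_unit = unit_costs(node_hourly_usd=0.40, vcpus=4, memory_gib=16)
print(cpu_unit, memory_unit)
print(workload_hourly_cost(0.5, 2, cpu_unit, memory_unit))
```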
These intentional shifts can lead to substantial improvements in cost efficiency and operational performance, positioning Kubernetes as a powerful cost-saving tool.
Dave McJannet
This summer, when HashiCorp surveyed more than 1,200 enterprise technology leaders and practitioners, 90% of the respondents reported overspending on cloud resources.
While Kubernetes allows for more flexible infrastructure management, its ephemeral architecture spins up new pods and clusters very dynamically as workloads scale. This can lead to unpredictable costs and to resources that are difficult to track.
To alleviate this complexity and better manage costs, companies need to create platform teams, centralized technology functions that can implement control points in the process of provisioning infrastructure. Once a platform team is managing an organization’s Kubernetes environment and enforcing these centralized controls, costs can be managed through policies without getting in the way of the development teams who build net new applications.
The data show that these platform engineering principles work: The majority of survey respondents with the highest level of cloud maturity — who reported fewer cost overruns and better talent management — had implemented platform teams.
Andreas Grabner
DevOps Activist, Dynatrace
To enhance cost visibility in Kubernetes, use annotations to assign ownership of workloads and namespaces. This enables precise cost reporting by team, fostering cost-aware resource usage. Tagging resources with ownership details makes actual costs more transparent, encouraging mindful resource management.
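Once ownership is recorded in annotations, a per-team cost report is a straightforward aggregation. The sketch below assumes workload records have already been exported as dictionaries; the annotation key `example.com/owner`, the record layout, and the cost figures are hypothetical.

```python
from collections import defaultdict

OWNER_KEY = "example.com/owner"  # hypothetical annotation key


def cost_by_owner(workloads):
    """Sum monthly cost per owning team; unannotated workloads are grouped
    under 'unassigned' so the gap in ownership stays visible."""
    totals = defaultdict(int)
    for workload in workloads:
        owner = workload.get("annotations", {}).get(OWNER_KEY, "unassigned")
        totals[owner] += workload["monthly_cost_usd"]
    return dict(totals)


# Hypothetical exported workload records.
workloads = [
    {"name": "checkout", "annotations": {OWNER_KEY: "payments"}, "monthly_cost_usd": 120},
    {"name": "ledger", "annotations": {OWNER_KEY: "payments"}, "monthly_cost_usd": 80},
    {"name": "search", "annotations": {OWNER_KEY: "discovery"}, "monthly_cost_usd": 200},
    {"name": "nightly-job", "monthly_cost_usd": 30},
]
print(cost_by_owner(workloads))
```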
Kubernetes has multiple layers, and cloud providers charge based on these dimensions. Teams can also manage costs by defining appropriate resource requests and limits for workloads. This improves resource allocation and scheduling. Accurate requests prevent resource starvation, while limits guard against overconsumption. Load-testing is essential to determine the right resource requests and limits for specific workloads, ensuring stability and performance.
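One way to turn load-test results into requests and limits is to take a high percentile of observed usage as the request and the observed peak plus some headroom as the limit. The sketch below follows that rule of thumb; the 90th percentile, the 20% headroom, and the sample values are assumptions chosen for the example, not recommendations from the text above.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_cpu(samples_m, request_pct=90, limit_headroom_pct=20):
    """Suggest a CPU request and limit (millicores) from load-test usage samples."""
    request = nearest_rank_percentile(samples_m, request_pct)
    limit = math.ceil(max(samples_m) * (100 + limit_headroom_pct) / 100)
    return {"request_m": request, "limit_m": limit}


# Hypothetical CPU usage samples (millicores) collected during a load test.
load_test_samples_m = [120, 100, 140, 110, 400, 125, 115, 105, 135, 130]
print(suggest_cpu(load_test_samples_m))
```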
Additionally, implementing resource quotas at the namespace level can prevent a single application from monopolizing cluster resources, promoting fairness and predictability in resource allocation. Lastly, avoid relying solely on default Kubernetes Horizontal Pod Autoscaler (HPA) metrics, which are based only on CPU and memory usage. Instead, consider more relevant metrics like CPU throttling, response times, or business-specific metrics such as requests per second.
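The Kubernetes documentation describes the HPA's core rule as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with a default tolerance of 0.1 around a ratio of 1. The sketch below applies that rule to a requests-per-second metric. It is a simplification: it ignores stabilization windows, pod readiness, and missing metrics, and the traffic figures and replica bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Replica count from the documented HPA ratio formula, clamped to bounds."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical target of 100 requests per second per pod, 4 pods running.
print(desired_replicas(4, current_value=150, target_value=100))  # scale out
print(desired_replicas(4, current_value=105, target_value=100))  # within tolerance
print(desired_replicas(4, current_value=40, target_value=100))   # scale in
```

Scaling on a per-pod request rate ties replica count to the load users actually generate, which is the kind of business-relevant signal the paragraph above recommends.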
Ari Weil
VP, Product Marketing, Akamai
Managing Kubernetes costs requires more than just trimming expenses; [companies need to] strategically engineer [their] infrastructure to work smarter, not harder. Start by right-sizing your workload components. This isn’t just about saving money on unused resources; it’s about designing a system that can adjust in real time. By automating this balance through autoscaling, you keep your infrastructure lean without sacrificing the ability to pivot when demand spikes.
Simplification is the next critical piece, but not in the basic sense of cutting corners. Pre-configured environments and a standardized stack empower your teams to move faster with fewer errors, reducing friction and complexity at every step. This thoughtful streamlining not only lowers costs, but also improves security and frees up bandwidth for innovation, sidestepping the typical chaos of sprawling application ecosystems.
Finally, prioritize cloud-agnostic flexibility. This isn't just a safeguard against vendor lock-in; it’s a deliberate choice to future-proof your infrastructure. By staying nimble and choosing where workloads live based on performance and price, you retain the power to continuously optimize your setup as technology evolves.