When Hardware Triples in Price, Idle Capacity Becomes a Line Item.

by: Yasmin Rajabi / May 12, 2026

A platform leader at a Fortune 50 company recently told his app teams something that I keep thinking about. The message was very direct: adopt rightsizing or fund your own hardware. That’s the new position inside a lot of large enterprises I work with right now, and it is not posturing. Lead times for server hardware are longer than they were during COVID. Customers tell me pricing is literally changing by the minute, and the price on the order is not the price charged when the hardware reaches your doorstep. And that’s if the vendor doesn’t cancel the order after receiving the PO. Yes, this has happened.

Hyperscaler capex is forecast to exceed $600 billion in 2026, a 36% jump from 2025. Server DRAM prices are up 60 to 70% from late 2025. Dell increased server prices by 15 to 20% in mid-December. Memory prices increased more than 100% in the first quarter of 2026. Intel is reallocating wafers from consumer to data center. GPU server prices are running 30 to 50% above early 2025 levels, and lead times for data center GPUs are at 36 to 52 weeks.

This is not a passing supply chain story. It’s the new shape of infrastructure cost for at least the next 24 to 36 months.

If you run a platform team, chances are you already feel it. The capacity-planning conversation that used to be a Q4 ritual is now a monthly fire drill. The exec who manages the budget is staring at a forecast that doesn’t reconcile with the original plan, and the path of least resistance is to walk into finance and ask for the delta. More racks, more nodes, a bigger reservation. The justification writes itself, because you need the capacity to move the business forward, and the vendor’s price list is the price list, whether you like it or not.

I’d argue this is the wrong dialogue.

The local picture is worse than the macro picture

The macro story is real, but most enterprises are not actually short on capacity. They are short on capacity they can see. The clusters are full of requests, not usage.

I see this everywhere. App teams allocate the entire cluster on day one. They request what they think they will need at peak, plus a buffer for the spike they are afraid of, plus another buffer for the OOMKill they got paged on once two years ago. CPU utilization sits in the teens, and memory utilization sits below 40%. The cluster is “out of capacity” only in the sense that the scheduler has nowhere else to put things, not in the sense that any compute is doing useful work.
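To make the gap between "allocated" and "used" concrete, here is a minimal Python sketch. The workload names, core counts, and the `cluster_picture` helper are all hypothetical, chosen only to mirror the pattern described above: a cluster the scheduler sees as nearly full while actual CPU utilization sits in the teens.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    cpu_request_cores: float  # what the app team asked the scheduler to reserve
    cpu_used_cores: float     # average observed usage (hypothetical numbers)


def cluster_picture(workloads, allocatable_cores):
    """Compare how full the cluster looks (requests) with how busy it is (usage)."""
    requested = sum(w.cpu_request_cores for w in workloads)
    used = sum(w.cpu_used_cores for w in workloads)
    return {
        "allocated_pct": round(100 * requested / allocatable_cores, 1),
        "utilized_pct": round(100 * used / allocatable_cores, 1),
    }


# Hypothetical cluster with 100 allocatable cores.
workloads = [
    Workload("checkout", cpu_request_cores=40.0, cpu_used_cores=6.0),
    Workload("search", cpu_request_cores=35.0, cpu_used_cores=5.0),
    Workload("batch", cpu_request_cores=20.0, cpu_used_cores=4.0),
]

picture = cluster_picture(workloads, allocatable_cores=100.0)
print(picture)  # {'allocated_pct': 95.0, 'utilized_pct': 15.0}
```

In this made-up example the scheduler has almost nowhere left to place pods, yet 85% of the CPU is idle. That is the sense in which the cluster is "out of capacity."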

The dynamic is more acute in on-prem environments. EKS, GKE, and AKS bills go back to the app team that caused them, and that team has a budget owner who notices. Kubernetes on shared physical infrastructure has no such feedback loop. To the app team, the on-prem cluster appears to be a free resource. There is no monthly invoice attached to their namespace. Asking them to give back requests feels to them like volunteering to take on risk with no upside.

For example, at one of the larger enterprises I work with, the platform team had run an internal rightsizing initiative on a subset of workloads. The reclaimed capacity was real, and the benefits were visible. But the moment they tried to scale that work across all app teams, every conversation reset. The skeptical question is always the same: what if you rightsize and I don’t have enough capacity, or something breaks? And the answer to that question is not technical. It is a trust question wearing a technical costume.

Two doors, two very different bills

So the platform leader has two doors in front of them.

Door one is finance. Walk in, present the price list and lead time, and ask for an unbudgeted capital allocation to expand the footprint. This door is fast. It is also expensive in ways that compound. You buy hardware at the worst pricing window in years, you wait 36 to 52 weeks for the shipments that matter, and you absorb the next round of CPU price increases, 8 to 10%, that analysts already expect in the back half of 2026. You also reinforce the cultural pattern that capacity is somebody else’s problem, which guarantees you will be in this same room next year.

Door two is the app teams. Sit down with the people whose namespaces account for 60% of requests but only 20% of actual usage. Build a real rightsizing program with them, not at them. Show them the safety mechanisms before you ask them to give anything back. Be specific about what gets reclaimed, where the buffer sits, what the rollback looks like, and who owns the pager if a recommendation is wrong. Door two is slower. It is also where the actual savings live.
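One way to be specific about "what gets reclaimed, where the buffer sits, what the rollback looks like" is to write the recommendation down as data. The sketch below is an assumption-laden illustration, not a description of any particular product's algorithm: the `recommend_request` helper, the 30% headroom default, and the sample values are all hypothetical.

```python
def recommend_request(current_request, usage_samples, headroom=0.30):
    """Propose a conservative CPU request from observed usage.

    Assumptions (hypothetical, for illustration only):
    - the buffer sits on top of the observed peak, as a fractional headroom
    - a recommendation never raises the request above its current value
    - the previous request is recorded so a rollback is a one-step revert
    """
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    peak = max(usage_samples)
    proposed = peak * (1 + headroom)
    recommended = min(current_request, proposed)
    return {
        "recommended": round(recommended, 2),
        "reclaimed": round(current_request - recommended, 2),
        "rollback_to": current_request,
    }


# Hypothetical workload: 8 cores requested, observed peak of 2 cores.
rec = recommend_request(8.0, [1.2, 1.5, 2.0, 1.8])
print(rec)
```

With these made-up numbers the recommendation keeps 30% headroom above the observed peak and still returns more than half of the original request to the cluster. Every field is something an app team can inspect before agreeing to the change.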

Most teams I talk to default to door one because door two requires a kind of organizational work that does not show up on anyone’s roadmap. You have to earn the right to touch someone else’s manifests. That earning happens through conservative defaults, transparent recommendations, observable behavior under load, and clear contracts about who owns what when something gets resized. None of that is glamorous. All of it is the actual job.

The incentives are not aligned, and pretending otherwise is the problem

The reason rightsizing stalls inside large enterprises is not that engineers are afraid of automation. It is that the incentives are not aligned. The platform team owns the bill. The app team owns the SLO. If a request is too high, the platform team pays. If a request is too low and a pod gets killed, the app team gets paged. So the rational move for an app team is to over-request, every time, with a comfort buffer. The cluster fills with that rationality, and then the platform team gets handed a bigger bill.

You can’t fix that with tooling alone. You fix it by building confidence with the app team, giving them visibility into what their requests cost in real terms, and making the safety net explicit enough that they trust the rightsizing loop as much as they trust autoscaling tools. When that trust is in place, the same teams that were blocking a 30% capacity harvest in March will accept it in July. I’ve watched this happen time and time again. It looks like a process problem until it looks like a culture problem, and then it looks like a budget win.
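Visibility into what requests cost "in real terms" can start as a very small showback calculation. The following sketch assumes a flat internal rate per core-hour, which is a hypothetical figure here; a real on-prem rate would have to come from your own hardware, power, and operations costs. The namespace names and numbers are invented for illustration.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def idle_request_cost(cpu_request_cores, cpu_used_cores, cost_per_core_hour):
    """Monthly cost of CPU that is requested but not used."""
    idle_cores = max(cpu_request_cores - cpu_used_cores, 0.0)
    return round(idle_cores * cost_per_core_hour * HOURS_PER_MONTH, 2)


# Hypothetical internal rate and per-namespace (requested, used) cores.
COST_PER_CORE_HOUR = 0.04
namespaces = {
    "checkout": (40.0, 6.0),
    "search": (35.0, 5.0),
}

report = {
    ns: idle_request_cost(requested, used, COST_PER_CORE_HOUR)
    for ns, (requested, used) in namespaces.items()
}

for ns, cost in report.items():
    print(f"{ns}: idle requests cost about {cost:.2f} per month")
```

Even a rough number like this gives an on-prem namespace the monthly invoice it otherwise never sees, which is the starting point for the conversation described above.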

The hardware crunch is a forcing function, not a fate

Most platform teams will respond to this crunch by buying their way through it. They will go to finance, get a partial “yes,” and place orders that arrive nine to twelve months later at prices nobody budgeted for. The capacity gets absorbed within a quarter of landing, because nothing about how app teams request resources has changed. The next planning cycle starts in the same hole.

The other path costs dramatically less but takes more trust. Treat the supply crunch as the opportunity for your organization to finally make utilization a shared responsibility rather than a platform team problem. Same workloads, same SLOs, less idle capacity, more headroom. That is a trust story, not a capex story. The capex story is easier to start and harder to finish. The trust story is harder to start, but it pays for itself the day after you finish it.

If hardware were getting cheaper, this would be a footnote. But the reality is that it is getting more expensive, arriving more slowly, and becoming harder to forecast. That is exactly the moment to stop expanding the footprint and start using the one you already have.


AUTHOR

Yasmin Rajabi

Yasmin Rajabi is the Chief Operating Officer at CloudBolt Software. She is a recognized leader in the FinOps and Kubernetes communities, and her background as an engineer, product leader, and operator gives her a...
