When Hardware Triples in Price, Idle Capacity Becomes a Line Item
A platform leader at a Fortune 50 company recently told his app teams something that I keep thinking about. The message was very direct: adopt rightsizing or fund your own hardware. That’s the new position inside a lot of large enterprises I work with right now, and it is not posturing. Lead times for server hardware are longer than they were during COVID. Customers tell me pricing is changing by the minute, and the price you are quoted when you order is not the price you pay when the hardware lands on your doorstep… and that’s if the vendor doesn’t cancel the order after receiving the PO. Yes, this has happened.
Hyperscaler capex is forecast to exceed $600 billion in 2026, a 36% jump from 2025. Server DRAM prices are up 60 to 70% from late 2025. Dell increased server prices by 15 to 20% in mid-December. Memory prices increased more than 100% in the first quarter of 2026. Intel is reallocating wafers from consumer to data center. GPU server prices are running 30 to 50% above early 2025 levels, and lead times for data center GPUs are at 36 to 52 weeks.
This is not a passing supply chain story. It’s the new shape of infrastructure cost for at least the next 24 to 36 months.
If you run a platform team, chances are you already feel it. The capacity-planning conversation that used to be a Q4 ritual is now a monthly fire drill. The exec who manages the budget is staring at a forecast that doesn’t reconcile with the original plan, and the path of least resistance is to walk into finance and ask for the delta. More racks, more nodes, a bigger reservation. The justification writes itself, because you need the capacity to move the business forward, and the vendor’s price list is the price list, whether you like it or not.
I’d argue this is the wrong dialogue.
The local picture is worse than the macro picture
The macro story is real, but most enterprises are not actually short on capacity. They are short on capacity they can see. The clusters are full of requests, not usage.
I see this everywhere. App teams allocate the entire cluster on day one. They request what they think they will need at peak, plus a buffer for the spike they are afraid of, plus another buffer for the OOMKill they got paged on once two years ago. CPU utilization sits in the teens, and memory utilization sits below 40%. The cluster is “out of capacity” only in the sense that the scheduler has nowhere else to put things, not in the sense that any compute is doing useful work.
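To make the gap between "full" and "busy" concrete, here is a minimal Python sketch of the two numbers that tend to get conflated. The workload names and figures are hypothetical, chosen only to mirror the pattern described above. In practice the inputs would come from whatever metrics pipeline you already run.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    cpu_requested: float  # cores reserved through resource requests
    cpu_used: float       # average cores actually consumed


def cluster_view(workloads, allocatable_cores):
    """Return (allocation, utilization) as fractions of allocatable CPU."""
    requested = sum(w.cpu_requested for w in workloads)
    used = sum(w.cpu_used for w in workloads)
    return requested / allocatable_cores, used / allocatable_cores


# Hypothetical numbers for illustration only.
workloads = [
    Workload("checkout", cpu_requested=40.0, cpu_used=6.0),
    Workload("search", cpu_requested=35.0, cpu_used=5.0),
    Workload("batch-reports", cpu_requested=20.0, cpu_used=4.0),
]

allocation, utilization = cluster_view(workloads, allocatable_cores=100.0)
print(f"allocated: {allocation:.0%}, utilized: {utilization:.0%}")
```

The scheduler only sees the first number, which is why a cluster can refuse new pods while most of its cores sit idle.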
The dynamic is more acute in on-prem environments. EKS, GKE, and AKS bills go back to the app team that caused them, and that team has a budget owner who notices. The cost of Kubernetes on shared physical infrastructure does not flow back to the team that caused it. To the app team, the on-prem cluster appears to be a free resource. There is no monthly invoice attached to their namespace. Asking them to give back requests feels to them like volunteering to take on risk with no upside.
For example, at one of the larger enterprises I work with, the platform team had run an internal rightsizing initiative on a subset of workloads. The reclaimed capacity was real, and the benefits were visible. But the moment they tried to scale that work across all app teams, every conversation reset. The skeptical question is always the same: what if you rightsize and I don’t have enough capacity, or something breaks? And the answer to that question is not technical. It is a trust question wearing a technical costume.
Two doors, two very different bills
So the platform leader has two doors in front of them.
Door one is finance. Walk in, present the price list and lead time, and ask for an unbudgeted capital allocation to expand the footprint. This door is fast. It is also expensive in ways that compound. You buy hardware at the worst pricing window in years, you wait 36 to 52 weeks for the shipments that matter, and you absorb the next round of CPU price increases, 8 to 10%, that analysts already expect in the back half of 2026. You also reinforce the cultural pattern that capacity is somebody else’s problem, which guarantees you will be in this same room next year.
Door two is the app teams. Sit down with the people whose namespaces account for 60% of requests but only 20% of actual usage. Build a real rightsizing program with them, not at them. Show them the safety mechanisms before you ask them to give anything back. Be specific about what gets reclaimed, where the buffer sits, what the rollback looks like, and who owns the pager if a recommendation is wrong. Door two is slower. It is also where the actual savings live.
Most teams I talk to default to door one because door two requires a kind of organizational work that does not show up on anyone’s roadmap. You have to earn the right to touch someone else’s manifests. That earning happens through conservative defaults, transparent recommendations, observable behavior under load, and clear contracts about who owns what when something gets resized. None of that is glamorous. All of it is the actual job.
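As one illustration of what a conservative default can look like, the sketch below sizes a request from a high percentile of observed usage plus headroom, and caps how far a single change may cut. This is not any particular product's algorithm. The function names and the `pct`, `headroom`, and `max_cut` parameters are assumptions made up for this example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_request(current, usage_samples, pct=95, headroom=0.20, max_cut=0.25):
    """Suggest a request: percentile usage plus headroom, with capped reductions."""
    target = percentile(usage_samples, pct) * (1 + headroom)
    if target >= current:
        # Usage has outgrown the request, so recommend the increase in full.
        return target
    # Never cut more than max_cut of the current request in a single step.
    return max(target, current * (1 - max_cut))


# Hypothetical CPU usage samples, in cores, for a workload requesting 4 cores.
samples = [0.4, 0.5, 0.6, 0.5, 0.7, 0.9, 0.6, 0.5, 0.4, 0.8]
print(recommend_request(4.0, samples))
```

With these numbers the usage-based target is far below the current request, so the step cap decides the result and the request shrinks by a quarter rather than all at once. That gives the app team a chance to watch behavior under load before the next reduction.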
The incentives are not aligned, and pretending otherwise is the problem
The reason rightsizing stalls inside large enterprises is not that engineers are afraid of automation. It is that the incentives are not aligned. The platform team owns the bill. The app team owns the SLO. If a request is too high, the platform team pays. If a request is too low and a pod gets killed, the app team gets paged. So the rational move for an app team is to over-request, every time, with a comfort buffer. The cluster fills with that rationality, and then the platform team gets handed a bigger bill.
You can’t fix that with tooling alone. You fix it by building confidence with the app team, giving them visibility into what their requests cost in real terms, and making the safety net explicit enough that they trust the rightsizing loop as much as they trust autoscaling tools. When that trust is in place, the same teams that were blocking a 30% capacity harvest in March will accept it in July. I’ve watched this happen time and time again. It looks like a process problem until it looks like a culture problem, and then it looks like a budget win.
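Visibility into cost can start very simply. The sketch below turns requested-but-unused cores into a monthly figure per namespace. The namespace names, core counts, and the `cost_per_core_month` rate are all hypothetical. An on-prem team would need to derive its own rate from amortized hardware, power, and operations costs.

```python
def idle_cost_by_namespace(namespaces, cost_per_core_month):
    """Map each namespace to the monthly cost of its requested-but-unused cores."""
    report = {}
    for name, (requested, used) in namespaces.items():
        idle = max(requested - used, 0.0)  # ignore namespaces using more than requested
        report[name] = idle * cost_per_core_month
    return report


# Hypothetical (requested cores, average used cores) per namespace.
namespaces = {
    "payments": (120.0, 30.0),
    "search": (80.0, 20.0),
    "internal-tools": (20.0, 15.0),
}

report = idle_cost_by_namespace(namespaces, cost_per_core_month=25.0)
for name, cost in sorted(report.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ${cost:,.0f}/month in idle requests")
```

Even a rough number like this gives the app team something the on-prem cluster never did: an invoice-shaped view of their own namespace.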
The hardware crunch is a forcing function, not a fate
Most platform teams will respond to this crunch by buying their way through it. They will go to finance, get a partial “yes,” and place orders that arrive nine to twelve months later at prices nobody budgeted for. The capacity gets absorbed within a quarter of landing, because nothing about how app teams request resources has changed. The next planning cycle starts in the same hole.
The other path costs dramatically less but takes more trust. Treat the supply crunch as the opportunity to finally make utilization a shared responsibility rather than a platform team problem. Same workloads, same SLOs, less idle capacity, more headroom. That is a trust story, not a capex story. The capex story is easier to start and harder to finish. The trust story is harder to start, but it pays for itself the day after you finish it.
If hardware were getting cheaper, this would be a footnote. But the reality is that it is getting more expensive, arriving more slowly, and becoming harder to forecast. That is exactly the moment to stop expanding the footprint and start using the one you already have.