When Kubernetes 1.33 dropped, the headline that jumped out at me was in-place pod resizing.
“Finally,” I thought, “we can bump CPU or memory on a running container without nuking it.”
As a former engineering manager who once babysat clusters with thousands of pods during trading hours, I thought this was huge. Before now, the only real option was to spin up a new pod with different resources and roll it out, which meant downtime risk, connection drains, and readiness-probe delays. Sometimes it even meant a full replica replacement just to get more memory.
The idea of patching a live pod felt like magic.
But the other part of my brain—the one that knows how deceptively stateful Kubernetes can be—yelled: Pods are disposable. So how the hell do you patch something that might vanish in 30 seconds? And what happens to the next pod? Particularly with HPA in play?
That tension—the power of in-place patching vs. the ephemeral nature of pods—is shaping everything we’re building at CloudBolt.
Pod resizing is like UDP. Deployment updates are like TCP.
To make sense of this, I picture two kinds of changes:
Pod-level patches ≈ UDP
- Fast, lightweight, fire-and-forget. But not guaranteed.
Workload (Deployment) patches ≈ TCP
- Ordered, stateful, and synchronized across replicas.
Both matter. But orchestrating them together—especially in production—is where things get messy fast.
Four face-palm lessons from my first week hacking on in-place pod resizing
So how do you actually make this work without tripping over Kubernetes edge cases? Here’s what bit me right out of the gate.
1. No requests/limits? No patch for you.
If a Deployment ships with zero resource requests, its pods get the BestEffort QoS class. Try to patch one of those to add requests, and Kubernetes rejects the call—it would change the pod’s QoS class mid-flight. Moral of the story: set some requests and limits up front, or forget about in-place tweaks.
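Even modest placeholder values are enough to keep a pod out of BestEffort and leave the door open for later resizes. A minimal sketch—the numbers are illustrative, not recommendations:

```yaml
# Illustrative requests/limits: any non-zero requests move the pod out of
# the BestEffort QoS class, which an in-place resize is not allowed to change.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```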
2. A PATCH isn’t a promise.
A successful PATCH only means “the control plane accepted it.” Whether the node actually has free headroom is another story—remember, pod PATCHes are like UDP: fire-and-forget. You’ll need to check .status.containerStatuses[].resources to confirm whether the pod actually received what it requested after patching.
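Here’s roughly what that looks like with kubectl against a 1.33 cluster, assuming a pod named my-pod with a container named app (both placeholders):

```bash
# Issue the resize through the dedicated subresource...
kubectl patch pod my-pod --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"requests":{"memory":"1Gi"},"limits":{"memory":"1Gi"}}}]}}'

# ...then verify what was actually applied. The spec shows what you asked for;
# containerStatuses shows what the container is really running with.
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[*].resources}'

# Pending or in-progress resizes also surface as pod conditions.
kubectl get pod my-pod -o jsonpath='{.status.conditions}'
```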
3. Want to shrink memory? Add resizePolicy first.
Kubernetes will happily let you patch memory up, but it refuses to patch it down unless every container explicitly defines a resizePolicy. This behavior is baked into 1.33. So if you’re running legacy manifests, you’ll need to add that field before you can right-size memory down.
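In practice that means adding something like this to each container spec. A sketch with placeholder names—note that resizePolicy sits alongside resources, not under it, and that 1.33 only shrinks memory limits when the memory policy allows a container restart:

```yaml
# Per-resource resize policy: CPU changes apply in place; shrinking the
# memory limit restarts the container, which 1.33 requires for memory decreases.
containers:
- name: app                              # placeholder name
  image: registry.example.com/app:1.0    # placeholder image
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired
  - resourceName: memory
    restartPolicy: RestartContainer
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 1Gi
```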
4. The default edit role won’t cut it.
Depending on your distro, the default Kubernetes edit ClusterRole doesn’t include permissions to PATCH the /resize subresource. You’ll have to manually extend it—or create a new role from scratch.
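One way to fill that gap is a small ClusterRole scoped to the resize subresource, bound to whoever (or whatever controller) issues the patches. A sketch—the role name is a placeholder:

```yaml
# Grants just enough to resize pods in place: patch on the pods/resize
# subresource, plus read access to check the result in pod status.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-resizer
rules:
- apiGroups: [""]
  resources: ["pods/resize"]
  verbs: ["patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
```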
Where in-place pod resizing shines
Once I got past the initial gotchas, I started to see where in-place pod resizing could actually be a game-changer.
Think trading hours. Or a Black Friday checkout spike. CPU climbs, memory gets tight, pager goes off. You’ve got two choices:
Option 1: Roll out a new ReplicaSet
- Scheduler looks for a node.
- The autoscaler might spin one up.
- Readiness probes delay traffic.
It’s a slow, multi-step rollout with multiple points of delay—right when you need speed the most.
Option 2: Over-provision all the time
- Allocate far more resources than you need.
- Keep them idle just in case traffic spikes.
- Burn budget—especially for JVM workloads that spike during startup but settle down afterward.
It’s expensive, inefficient, and rarely justifiable in the long run.
But with in-place pod resizing, there’s a third option: Patch the live pod.
No restarts. No churn. No waiting around for readiness gates to pass.
That kind of just-in-time capacity makes all the difference when you’re trying to avoid an outage—not win elegance points.
What about HPA and VPA?
And if you’re like most teams, in-place pod resizing isn’t happening in isolation—it’s bumping elbows with HPA and VPA. That’s where things get weird.
Resizing a pod live changes that pod’s resources—but it doesn’t update the Deployment spec.
So when HPA kicks in and spins up a new pod? It’ll pull from the old config. Unless you sync the patch back, you’re basically forking your workload.
- HPA keeps functioning based on metrics, but if your patched pod and Deployment spec are out of sync, it can lead to unpredictable behavior.
- VPA does not support in-place pod resizing unless you install the latest version (1.4.X) with the feature gate enabled. Hopefully, in-place pod resizing draws more attention to VPA so we can finally have the conversation it deserves.
That’s the catch: in-place patches don’t persist unless you make them persist.
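A quick way to see that fork forming is to compare a patched pod against the template it came from (placeholder names again):

```bash
# What the live pod is actually running with...
kubectl get pod my-pod -o jsonpath='{.spec.containers[0].resources}'

# ...versus what the Deployment will hand to the next replica HPA creates.
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.containers[0].resources}'
```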
How to use it wisely
Now that we’ve covered what it can do, here are some ground rules I’ve learned the hard way.
- If your workload handles restarts cleanly, a redeploy is often simpler.
- If you find yourself patching the same pod again and again? That’s not agility—it’s misconfigured requests.
- If you don’t have decent observability in place, resizing might just mask real issues.
- Shrinking memory mid-day? Not worth it. Resize up when needed, scale down after hours when it’s safe to recycle pods.
To put it simply:
During the day:
- Use in-place resizing to bump memory up when a pod needs more headroom. Machine learning can suggest lowering CPU requests based on steady-state usage—without compromising performance.
At night:
- Sync those resized values back into the Deployment spec so tomorrow’s pods launch with settings that actually reflect what worked in production.
Treat live patches as just-in-time fixes—but always close the loop and prevent config drift.
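Closing the loop can be as simple as folding the values that held up during the day back into the Deployment template during a quiet window (placeholder names; this is a normal template change, so it does trigger a rollout):

```bash
# Strategic merge patch: containers are merged by name, so only the resources
# block changes; tomorrow's pods launch with these values from the start.
kubectl patch deployment my-app --patch \
  '{"spec":{"template":{"spec":{"containers":[{"name":"app","resources":{"requests":{"memory":"1Gi"},"limits":{"memory":"1Gi"}}}]}}}}'
```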
How StormForge is adding support for in-place pod resizing
Once we saw how in-place resizing could help avoid outages—but also how fragile and manual it was—we knew it had to be handled differently if it was going to work in the real world.
So we started building support from the ground up.
Automatic preflight
We began by addressing the most common blockers to in-place patching:
- If an app is missing requests or limits, we inject sane defaults during onboarding.
- If the manifest lacks a resizePolicy, we add it—provided the cluster supports it.
This way, workloads are patchable from day one, without developers having to memorize the 1.33 release notes.
A multi-mode patching engine
Supporting in-place pod resizing isn’t just about issuing a PATCH—it’s about managing the lifecycle around it. We built a patch engine to do just that:
- Hot Fix Patches: Triggered by real-time signals like OOM events or CPU spikes. We patch the pod immediately, without waiting for a redeploy.
- SyncBack Patches: Off-peak background jobs that take what worked in production and sync it back to the Deployment spec. That way, future pods aren’t stuck with outdated configs.
- Startup and Steady-State Patches: Some apps require more resources during startup than in steady-state (JVMs, we’re looking at you). We support separate resource profiles for startup and steady-state—and automatically switch once the readiness probe passes.
As Kubernetes expands support for in-place pod resizing, we’re building right alongside it. Try it out in StormForge, give us feedback, and check our release notes for what’s coming next.
TL;DR for fellow platform engineers
- Upgrade to 1.33+ before you even think about pod-level resizing.
- Bake minimal viable requests/limits into every workload—yes, even cronjobs.
- Add resizePolicy now, or resign yourself to size-ups only.
- Treat in-place pod resizing as a hot-fix, not a config replacement.
- Don’t trust PATCH alone—check .status to confirm.
- Automate the drift—StormForge can help.
Use it right, and you get flexibility without sacrificing the control and predictability ops teams live for.
Start your free 30-day trial of StormForge