Kubernetes Cost Optimization: AKS & EKS Without the Waste
Kubernetes is where cloud cost goes to hide. You don't pay for pods — you pay for nodes (VMs on AKS, EC2 on EKS), and the gap between what your pods request and what they actually use is pure waste you're billed for. Most clusters run at 30–40% real CPU utilization while paying for 100% of the nodes. Here's how to close that gap, in priority order.
First: see requests vs. actual usage
The scheduler packs nodes by resource requests, not real consumption. So the first job is to compare the two:
# actual usage right now (needs metrics-server)
kubectl top pods -A --sum=true
kubectl top nodes
# what each pod RESERVED (requests): the number the scheduler packs by, so it drives your node count
# (the column spec is quoted so the shell does not try to glob-expand [*])
kubectl get pods -A -o custom-columns=\
'NS:.metadata.namespace,POD:.metadata.name,'\
'CPU_REQ:.spec.containers[*].resources.requests.cpu,'\
'MEM_REQ:.spec.containers[*].resources.requests.memory'
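To turn those two listings into a single waste figure, the reserved CPU can be totalled and compared with measured usage. The following is a minimal sketch, not a standard tool: `sum_cpu_requests` and `gap_pct` are hypothetical helper names, and the parsing assumes CPU quantities appear only as millicores (`250m`) or plain cores (`1`, `0.5`).

```shell
# Sum CPU requests (in millicores) from CPU_REQ cells read on stdin.
# Each line is one pod's cell, e.g. "100m,250m" (two containers), "1", or "<none>".
sum_cpu_requests() {
  tr ',' '\n' | awk '
    /^[0-9.]+m$/ { sub(/m$/, ""); total += $0; next }   # "250m" -> 250
    /^[0-9.]+$/  { total += $0 * 1000 }                 # "0.5"  -> 500
    END          { printf "%.0f\n", total }'            # "<none>" lines are ignored
}

# Percentage of requested CPU that is not actually being used.
# usage: gap_pct REQUESTED_MILLICORES USED_MILLICORES
gap_pct() {
  awk -v req="$1" -v used="$2" 'BEGIN {
    if (req <= 0) { print "0.0"; exit }
    printf "%.1f\n", (req - used) / req * 100
  }'
}

# Against a live cluster, the request total could come from:
#   kubectl get pods -A --no-headers \
#     -o custom-columns='CPU:.spec.containers[*].resources.requests.cpu' | sum_cpu_requests
gap_pct 4000 1400   # 4 cores requested, 1.4 cores used -> prints 65.0
```

The used figure would come from `kubectl top`, ideally averaged over days rather than read once, since a single snapshot can land on a quiet or busy moment.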
For allocation by namespace/team, add an open-source cost tool — OpenCost (the CNCF standard) or Kubecost — which maps node cost onto pods by their requests. That turns "the cluster costs $X" into "team A's namespace costs $Y", which is the start of showback.
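The idea behind that mapping can be illustrated with a deliberately simplified split: each namespace pays for its share of total CPU requests. This is only a sketch of the arithmetic, not how OpenCost or Kubecost are implemented (real tools also weigh memory, storage, and idle capacity); `allocate_by_requests` and the team names are hypothetical.

```shell
# Split a node bill across namespaces in proportion to their CPU requests.
# usage: allocate_by_requests TOTAL_NODE_COST  < lines of "namespace millicores"
allocate_by_requests() {
  awk -v cost="$1" '
    { req[$1] += $2; total += $2 }
    END {
      if (total <= 0) exit
      for (ns in req) printf "%s %.2f\n", ns, cost * req[ns] / total
    }' | sort
}

# Hypothetical month: 2000 dollars of nodes shared by two teams.
printf 'team-a 3000\nteam-b 1000\n' | allocate_by_requests 2000
# team-a 1500.00
# team-b 500.00
```

Because the split is by requests rather than usage, a team that over-requests pays for the capacity it strands, which is the incentive showback is meant to create.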
Lever 1 — Right-size pod requests (the biggest win)
Over-requested CPU and memory strand node capacity nobody uses. Lower requests to match reality (with headroom for spikes). The Vertical Pod Autoscaler, an add-on that is not enabled by default, has a recommendation mode (updateMode: "Off") that suggests right-sized requests without changing anything:
kubectl describe vpa <name> # shows Target / Lower Bound / Upper Bound recommendations
Set requests near the VPA "target" (or your p90 usage), keep limits sane to avoid noisy-neighbour issues, and you'll immediately fit more pods per node. This one change routinely reclaims 20–40% of node spend.
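A recommendation-only VPA of the kind described above could be declared roughly as follows. This is a sketch under assumptions: the Deployment name `my-app`, the object name `my-app-vpa`, and the file name are placeholders, and the cluster needs the VPA components installed before the manifest will be accepted.

```shell
# Write a recommendation-only VPA for a hypothetical Deployment named "my-app".
cat > vpa-recommend.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never evict or resize pods
EOF

# Then, against a cluster that has the VPA components installed:
#   kubectl apply -f vpa-recommend.yaml
#   kubectl describe vpa my-app-vpa
```

Recommendations are built from observed usage, so let the workload run through at least one normal traffic cycle before acting on them.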
Lever 2 — Bin-pack with the right autoscaler
Once requests are honest, let the cluster shrink to fit:
- Horizontal Pod Autoscaler scales replicas to demand so you're not running peak capacity 24/7.
- Cluster Autoscaler removes underused nodes. On AKS, enable the cluster autoscaler (and consider Node Autoprovisioning); on EKS, Karpenter is the modern choice — it provisions just-right instance types on demand and consolidates workloads onto fewer, cheaper nodes automatically.
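For the pod-level half of that pairing, a CPU-based Horizontal Pod Autoscaler might look like the sketch below. The Deployment name `my-app`, the replica bounds, and the 70% target are illustrative assumptions, not recommendations.

```shell
# Write an HPA that keeps average CPU at about 70% of each pod's CPU request.
cat > hpa-my-app.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2      # floor kept running off-peak
  maxReplicas: 10     # ceiling at peak
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF

# Then, against a cluster with metrics-server running:
#   kubectl apply -f hpa-my-app.yaml
#   kubectl get hpa my-app
```

The utilization target is measured relative to the pod's CPU request, which is one more reason to get requests right first: an inflated request makes the pod look idle and the HPA will under-scale.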
Lever 3 — Spot for anything interruptible
Stateless, batch, CI, and dev workloads should run on Spot — typically 60–90% off on-demand. Use AKS Spot node pools or EKS Spot (via Karpenter or managed node groups), and keep critical/stateful pods on a small on-demand pool using taints, tolerations and node selectors so only interruptible work lands on Spot.
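As a sketch of the taint-and-toleration side, the patch below steers one interruptible Deployment onto AKS Spot nodes, which carry the `kubernetes.azure.com/scalesetpriority=spot:NoSchedule` taint and a matching label. The Deployment name `batch-worker` and the file name are hypothetical.

```shell
# Patch fragment: tolerate the AKS Spot taint and select Spot nodes only.
cat > spot-patch.yaml <<'EOF'
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.azure.com/scalesetpriority: spot
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
EOF

# Then, against the cluster ("batch-worker" is a placeholder name):
#   kubectl patch deployment batch-worker --patch-file spot-patch.yaml
```

On EKS the same pattern applies with a different label: `karpenter.sh/capacity-type: spot` for Karpenter-managed nodes or `eks.amazonaws.com/capacityType: SPOT` for managed node groups, and any Spot taint there is typically one you define yourself. A hard `nodeSelector` leaves pods pending when no Spot capacity is available, so a preferred node affinity may suit workloads that should fall back to on-demand.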
Lever 4 — Scale non-prod to zero off-hours
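A dev/test cluster that is only used during weekday working hours but runs 24/7 sits idle for roughly 64% of the week (108 of 168 hours). Scale node pools to zero on a schedule, or use KEDA to scale workloads to zero when idle so the cluster autoscaler can remove the emptied nodes. A dev cluster that sleeps 7pm–7am and weekends runs 60 hours a week, so its node cost is roughly a third of a 24/7 one.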
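The schedule arithmetic is easy to check for any working pattern. A small sketch, with `awake_pct` as a hypothetical helper name:

```shell
# Percentage of the 168-hour week a cluster is up.
# usage: awake_pct HOURS_PER_DAY DAYS_PER_WEEK
awake_pct() {
  awk -v h="$1" -v d="$2" 'BEGIN { printf "%.1f\n", h * d / 168 * 100 }'
}

awake_pct 12 5   # up 7am-7pm, Monday to Friday -> prints 35.7
awake_pct 24 7   # always on                    -> prints 100.0
```

So a 12-hour weekday schedule pays for about 36% of the node hours of an always-on cluster.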
Lever 5 — Cover the baseline with commitments
After right-sizing, your cluster has a steady-state floor of nodes that runs all the time — commit to that, not your peak. AKS nodes are Azure VMs (Reserved VM Instances / Azure savings plans for compute); EKS nodes are EC2 (Compute Savings Plans or RIs), and EKS Fargate is covered by Compute Savings Plans. Right-size first so you don't commit to waste.
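One way to find that floor is to look at node counts over time and take the minimum, not the peak. The sketch below assumes you already have one node-count sample per line (for example hourly samples exported from your monitoring); the helper name and the sample values are hypothetical.

```shell
# Report the steady-state floor (minimum) and the peak (maximum) node count.
# stdin: one node count per line, e.g. hourly samples over a few weeks.
node_floor_and_peak() {
  sort -n | awk '
    NR == 1 { floor = $1 }   # smallest value after the numeric sort
            { peak = $1 }    # last value seen is the largest
    END     { if (NR) printf "floor=%d peak=%d\n", floor, peak }'
}

# Hypothetical samples:
printf '6\n4\n9\n4\n12\n5\n' | node_floor_and_peak   # prints floor=4 peak=12
```

In this example the commitment would cover 4 nodes, leaving the other 8 at peak to on-demand or Spot capacity.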
Don't forget the leftovers
Kubernetes sheds orphaned cloud resources: unattached persistent-volume disks left behind by deleted PVCs (typically when the reclaim policy is Retain), idle LoadBalancer services (each spins up a billed cloud load balancer + public IP), old snapshots, and abandoned dev clusters. Once the owning Kubernetes object is gone, these no longer show up in kubectl; they show up on the cloud bill. Sweep them with your normal cloud cost review.
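For the leftovers that still have a Kubernetes object, a quick in-cluster pass can produce a candidate list before the cloud-side sweep. This is a sketch: `released_pvs` and `lb_services` are hypothetical helpers that filter kubectl's table output, and they only list candidates, since deciding whether a load balancer is idle still needs traffic data from the cloud provider.

```shell
# Names of PersistentVolumes whose claim was deleted but whose disk was kept.
# stdin: "NAME PHASE" table with a header row.
released_pvs() { awk 'NR > 1 && $2 == "Released" { print $1 }'; }

# namespace/name of every Service that owns a cloud load balancer.
# stdin: "NS NAME TYPE" table with a header row.
lb_services() { awk 'NR > 1 && $3 == "LoadBalancer" { print $1 "/" $2 }'; }

# Against a live cluster:
#   kubectl get pv -o custom-columns='NAME:.metadata.name,PHASE:.status.phase' | released_pvs
#   kubectl get svc -A \
#     -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,TYPE:.spec.type' | lb_services
```

Disks whose PV object has also been deleted will not appear here at all, which is why the cloud-side review is still needed.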
See your cluster's node waste automatically. The CloudFinOpsKit tool checks AKS node pools and EKS node groups for right-sizing against real utilization, flags the orphaned disks and idle load balancers that clusters leave behind, and shows whether your steady-state nodes are covered by commitments — read-only, priced from your actual bill, for Azure and AWS.
FAQ
Is AKS or EKS cheaper?
The control plane pricing differs (AKS's standard tier and EKS both charge a small per-cluster hourly fee), but that's noise — the real cost is the worker nodes, and it's driven by how well you right-size and pack them, not by the provider. Optimization technique matters far more than the AKS-vs-EKS choice.
How do I allocate Kubernetes cost back to teams?
Use namespaces/labels per team and an allocation tool (OpenCost/Kubecost) that splits node cost by pod requests, then roll it into your wider cost-allocation statement. AI/ML workloads on GPU nodes especially need this.
What about GPU nodes?
GPU nodes are the most expensive thing in most clusters — keep them on their own pool, scale them to zero when no GPU jobs are queued, and never let general workloads schedule onto them (taint them).