The Azure Cost Optimization Checklist for 2026: 47 checks that find real money
Most "cost optimization tips" articles list the same five ideas. This is the working checklist — grouped the way an actual monthly cost review runs, ordered so the zero-risk savings come first. Work it top to bottom and you'll touch every major waste pattern we see in real Azure estates. (2026 context worth knowing: per the State of FinOps survey, 98% of FinOps teams now manage AI spend, and the discipline now covers SaaS, licensing and data centre too — so the checklist ends where most don't: AI workloads.)
Phase 1 — Remove pure waste (zero performance risk)
- Orphaned managed disks — unattached, fully billed. Full guide here (mind the Site Recovery traps).
- Unattached public IPs — Standard static IPs bill (~$3.65/mo each) even pointing at nothing.
- Disks on long-deallocated VMs — stopped VMs don't bill compute, but every disk still does.
- Disconnected snapshots — snapshots whose source disk no longer exists.
- Orphaned restore point collections — VM restore points whose parent VM was deleted keep accruing snapshot storage.
- Backup items for deleted resources — Recovery Services vaults keep charging for protected items whose source VM/share is gone until you stop backup with delete-data.
- Empty App Service plans — paying for compute hosting zero apps.
- Idle load balancers — Standard LBs with no rules and every backend pool empty.
- Idle NAT gateways — not associated to any subnet, still ~$32/mo + data.
- Idle VPN/ExpressRoute gateways — no connections for 30+ days.
- Empty Event Hub / Service Bus namespaces — namespace fees with zero entities (Premium Service Bus is ~$668/mo per messaging unit).
- Idle Application Gateways — no listeners, no rules, empty pools (verify all three — AGIC setups can look idle).
- Disabled/empty Traffic Manager profiles and idle DDoS plans.
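As a rough illustration, the first two checks could be scripted against JSON exported with `az disk list` and `az network public-ip list`. This is a minimal sketch, not a finished scanner: the field names (`managedBy`, `diskState`, `diskSizeGB`, `ipConfiguration`, `natGateway`) are assumptions about that export and should be verified against your own output, and the sample records and helper names are hypothetical.

```python
import json

# Hypothetical sample records, shaped like `az disk list` output (field names assumed).
DISKS_JSON = """
[
  {"name": "vm1-osdisk",
   "managedBy": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.Compute/virtualMachines/vm1",
   "diskState": "Attached", "diskSizeGB": 128},
  {"name": "old-data-01", "managedBy": null, "diskState": "Unattached", "diskSizeGB": 512},
  {"name": "old-data-02", "managedBy": null, "diskState": "Unattached", "diskSizeGB": 1024}
]
"""

# Hypothetical sample records, shaped like `az network public-ip list` output.
PUBLIC_IPS_JSON = """
[
  {"name": "pip-web", "sku": {"name": "Standard"},
   "ipConfiguration": {"id": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.Network/networkInterfaces/nic1/ipConfigurations/ipconfig1"},
   "natGateway": null},
  {"name": "pip-orphan", "sku": {"name": "Standard"},
   "ipConfiguration": null, "natGateway": null}
]
"""

STANDARD_IP_MONTHLY_USD = 3.65  # approximate figure quoted in the checklist


def unattached_disks(disks):
    """Disks with no owning VM that also report the Unattached state."""
    return [d for d in disks
            if not d.get("managedBy") and d.get("diskState") == "Unattached"]


def unattached_public_ips(ips):
    """Public IPs bound to neither an IP configuration nor a NAT gateway."""
    return [ip for ip in ips
            if not ip.get("ipConfiguration") and not ip.get("natGateway")]


orphan_disks = unattached_disks(json.loads(DISKS_JSON))
orphan_ips = unattached_public_ips(json.loads(PUBLIC_IPS_JSON))

orphan_disk_names = [d["name"] for d in orphan_disks]
orphan_disk_gb = sum(d["diskSizeGB"] for d in orphan_disks)
orphan_ip_names = [ip["name"] for ip in orphan_ips]
ip_monthly_usd = len(orphan_ips) * STANDARD_IP_MONTHLY_USD

print("Unattached disks:", orphan_disk_names, f"({orphan_disk_gb} GB)")
print("Unattached public IPs:", orphan_ip_names, f"(~${ip_monthly_usd:.2f}/mo)")
```

Treat the output as a candidate list, not a delete list: an unattached disk may still be a Site Recovery replica, so review before removing anything.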
Phase 2 — Rightsize what's left
- Idle VMs by actual CPU — <5% average over 30 days with low peaks: deallocate, downsize, or move to Spot.
- Advisor rightsizing recommendations — apply or document why not.
- Old VM generations — v3/v4 series to v5/v6 usually means same-or-better performance per dollar.
- Dev/test VMs without auto-shutdown — running 24/7 costs ~2.4× a daily 10-hour schedule (168 vs 70 hours a week), and ~3.4× a weekday-only one.
- Over-provisioned Premium SSD v2 / Ultra disk performance — provisioned IOPS/throughput above what Monitor shows you using is pure headroom tax (tunable independently of size).
- Premium disks on dev workloads — Standard SSD is fine for most non-prod.
- App Service plan tiers — Premium plans hosting one small site; consolidate.
- Container Apps minReplicas — set to 0 so non-prod scales to zero; on the ACI side, long-running jobs under restart policy Always are the expensive pattern.
- AKS — Spot node pools for stateless/batch (60–80% off), cluster autoscaler on, right node SKUs.
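Two of the checks above reduce to simple arithmetic, sketched here. The 5% average threshold comes from the checklist; the 20% peak threshold is an assumption (the checklist only says "low peaks"), and the function names and CPU samples are hypothetical.

```python
from statistics import mean

HOURS_PER_WEEK = 24 * 7  # 168


def schedule_cost_ratio(hours_per_day, days_per_week=7):
    """How many times more an always-on VM costs than a scheduled one."""
    scheduled_hours = hours_per_day * days_per_week
    return HOURS_PER_WEEK / scheduled_hours


def is_idle(cpu_percent_samples, avg_threshold=5.0, peak_threshold=20.0):
    """Flag a VM as idle when both average and peak CPU stay low.

    avg_threshold is the checklist's 5%; peak_threshold is an assumed
    stand-in for "low peaks" and should be tuned per workload.
    """
    return (mean(cpu_percent_samples) < avg_threshold
            and max(cpu_percent_samples) < peak_threshold)


# Daily 10-hour schedule vs weekdays-only 10-hour schedule.
print(round(schedule_cost_ratio(10), 2))
print(round(schedule_cost_ratio(10, days_per_week=5), 2))

# Hypothetical samples: a quiet VM, and one that is quiet except for a batch spike.
quiet_vm = [2.0, 3.5, 1.0, 4.0]
spiky_vm = [1.0] * 29 + [90.0]
print(is_idle(quiet_vm), is_idle(spiky_vm))
```

The spiky VM has a sub-5% average but is not flagged, which is the reason to look at peaks as well as averages before deallocating or downsizing.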
Phase 3 — Storage & data
- Lifecycle management policies — the #1 storage saving: auto-tier blobs Hot → Cool → Archive.
- Hot-tier data nobody reads — Cool is ~45% cheaper per GB; Archive ~90%.
- Legacy v1 storage accounts — upgrade to GPv2 (free, instant, unlocks tiering).
- GRS/RA-GRS by default — geo-redundancy doubles cost; require justification, use LRS/ZRS where RPO allows.
- Full snapshots on a schedule — switch to incremental snapshots (bill changed blocks only).
- Backup retention sprawl — vault policies keeping yearly points 7+ years "because default"; align to actual compliance need.
- Recovery Services vault redundancy — GRS vaults for workloads that don't need geo-restore: ~50% saving moving to LRS.
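For the lifecycle check, a policy document could be generated along these lines. This is a sketch assuming the standard blob lifecycle management JSON schema; the rule name, the `logs/` prefix, and the 30/90-day thresholds are hypothetical and should be set from your own access patterns.

```python
import json


def build_lifecycle_policy(rule_name, prefix, cool_after_days=30, archive_after_days=90):
    """Build a Hot -> Cool -> Archive tiering rule for block blobs under a prefix.

    The thresholds count days since last modification. The default values
    are illustrative assumptions, not recommendations.
    """
    if not 0 <= cool_after_days < archive_after_days:
        raise ValueError("cool_after_days must be >= 0 and below archive_after_days")
    return {
        "rules": [
            {
                "enabled": True,
                "name": rule_name,
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy("tier-down-logs", "logs/")
print(json.dumps(policy, indent=2))
```

The printed JSON could be saved to a file and applied with `az storage account management-policy create`, assuming that command fits your tooling. Archive brings rehydration delay and early-deletion charges, so start with a narrow prefix.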
Phase 4 — Databases
- SQL DTU → vCore migration — unlocks Azure Hybrid Benefit and reserved capacity.
- Serverless auto-pause for intermittent SQL — dev/test databases that sleep nights and weekends.
- Over-provisioned elastic pools — eDTU/vCore far above peak usage.
- Cosmos DB — provisioned RU/s on spiky workloads → autoscale or serverless; review Strong consistency and multi-region writes.
- PostgreSQL/MySQL flexible servers — Burstable tier for dev; HA doubles compute, so confirm the SLA need.
- Redis tiers — Premium/Enterprise features (persistence, geo-rep) actually in use?
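The Cosmos DB check can be tested with a back-of-envelope comparison. The sketch below assumes autoscale is billed per hour on the highest RU/s reached in that hour, at 1.5× the standard provisioned rate, with a floor of 10% of the configured maximum. Those billing rules are assumptions to confirm against current pricing, and the workload shapes are hypothetical. Results are in rate-independent "RU/s-hours", so no price is needed.

```python
AUTOSCALE_MULTIPLIER = 1.5  # assumed premium over the standard provisioned rate
AUTOSCALE_FLOOR = 0.10      # assumed minimum billed share of the configured max


def standard_ru_hours(provisioned_rus, hours):
    """Billable RU/s-hours for fixed provisioned throughput."""
    return provisioned_rus * hours


def autoscale_ru_hours(hourly_peak_rus, max_rus):
    """Billable RU/s-hours for autoscale, normalised to the standard rate."""
    floor = max_rus * AUTOSCALE_FLOOR
    billed = sum(min(max(peak, floor), max_rus) for peak in hourly_peak_rus)
    return billed * AUTOSCALE_MULTIPLIER


MAX_RUS = 10_000

# Hypothetical spiky day: 22 quiet hours, 2 hours at the ceiling.
spiky_day = [500] * 22 + [10_000] * 2
# Hypothetical steady day: 90% of the ceiling around the clock.
steady_day = [9_000] * 24

fixed = standard_ru_hours(MAX_RUS, 24)
spiky_auto = autoscale_ru_hours(spiky_day, MAX_RUS)
steady_auto = autoscale_ru_hours(steady_day, MAX_RUS)

print(f"fixed: {fixed:.0f}, autoscale spiky: {spiky_auto:.0f}, autoscale steady: {steady_auto:.0f}")
```

Under these assumptions autoscale wins clearly on the spiky day and loses on the steady one. The crossover sits at roughly two-thirds average utilisation of the configured maximum.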
Phase 5 — Commitments & licensing (the big multipliers)
- Reserved instances for steady compute — ~40% (1yr) to ~60%+ (3yr) off. RI vs Savings Plans guide.
- Savings plans for flexible compute — shallower discount, survives SKU/region change.
- Reservation utilization ≥ 90% — use-it-or-lose-it; monitor monthly, exchange under-used reservations.
- Azure Hybrid Benefit everywhere it's eligible — and nowhere it isn't. Full eligibility checklist.
- Dev/Test subscription offers for non-prod — base-rate Windows/SQL via the offer itself, not tags.
- Reserved capacity beyond VMs — SQL, Cosmos, storage, App Service all have reservation options.
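Why the utilization target matters can be shown with a short worked calculation. This sketch assumes a reservation is paid for every hour of the term whether used or not, and that the alternative is pay-as-you-go for only the hours actually used. The 40% discount is the checklist's approximate 1-year figure; the function names are hypothetical.

```python
def break_even_utilization(discount):
    """Share of reserved hours that must be used before a reservation beats pay-as-you-go."""
    return 1.0 - discount


def effective_savings(discount, utilization):
    """Net saving vs pay-as-you-go for the hours actually used.

    Reservation cost is (1 - discount) of full-term list price; pay-as-you-go
    cost is utilization of full-term list price. Negative means the
    reservation costs more than pay-as-you-go would have.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 - (1.0 - discount) / utilization


DISCOUNT_1YR = 0.40  # approximate 1-year figure from the checklist

print(f"break-even utilization: {break_even_utilization(DISCOUNT_1YR):.0%}")
for utilization in (1.0, 0.9, 0.6, 0.5):
    saving = effective_savings(DISCOUNT_1YR, utilization)
    print(f"utilization {utilization:.0%}: effective saving {saving:+.1%}")
```

At 90% utilization a nominal 40% discount is worth about 33%. At 60% it is worth nothing, and below that the reservation loses money, which is why under-used reservations are worth exchanging.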
Phase 6 — Monitoring, ops & AI workloads (the 2026 additions)
- Log Analytics ingestion — commitment tiers above ~100 GB/day (15–30% off), daily caps on noisy non-prod workspaces, retention beyond 90 days only where required.
- App Insights sampling — collecting 100% of telemetry (no sampling) on chatty apps is an ingestion firehose.
- Verbose diagnostic settings — every log category forwarded "just in case" is a recurring bill.
- GPU/AI compute utilization — idle GPU VMs are the most expensive idle in Azure; schedule, pool, or move burst training to Spot.
- AI service tier & token spend — review Azure OpenAI/AI Foundry provisioned throughput vs PAYG against actual tokens; batch where latency allows.
- Tag coverage & budgets — owner/cost-centre tags ≥80%, budgets with alerts on every subscription, and a monthly review cadence so all 46 checks above keep happening.
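The tag coverage target can be measured from a resource inventory. This sketch assumes coverage is counted per resource (the checklist does not say whether it means resource count or cost) and that records look like `az resource list` output with a `tags` field that may be null. The tag keys and sample records are hypothetical.

```python
REQUIRED_TAGS = ("owner", "cost-centre")  # hypothetical tag keys
TARGET_COVERAGE = 0.80                    # threshold from the checklist


def tag_coverage(resources, required=REQUIRED_TAGS):
    """Fraction of resources carrying a non-empty value for every required tag."""
    if not resources:
        return 0.0

    def has_all_tags(resource):
        # Tag keys are compared case-insensitively; "tags" may be missing or null.
        tags = {k.lower(): v for k, v in (resource.get("tags") or {}).items()}
        return all(tags.get(key) for key in required)

    covered = sum(1 for r in resources if has_all_tags(r))
    return covered / len(resources)


# Hypothetical inventory records.
inventory = [
    {"name": "vm-web-01", "tags": {"owner": "team-a", "cost-centre": "1001"}},
    {"name": "sql-prod", "tags": {"Owner": "team-b", "Cost-Centre": "1002"}},
    {"name": "st-logs", "tags": {"owner": "team-a", "cost-centre": "1001"}},
    {"name": "vm-test-07", "tags": {"owner": "team-c"}},
    {"name": "pip-orphan", "tags": None},
]

coverage = tag_coverage(inventory)
meets_target = coverage >= TARGET_COVERAGE
print(f"tag coverage: {coverage:.0%} (target {TARGET_COVERAGE:.0%}, met: {meets_target})")
```

Weighting by billed cost instead of resource count would be a reasonable variant, since a few large untagged resources matter more than many small ones.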
Run all of this in one click. The CloudFinOpsKit Tool ($25) automates this checklist — 75+ checks across every subscription you can read, priced from your actual billed cost, with an interactive report, FOCUS-aligned export, and a copy-for-AI hand-off. Read-only, no install, no agents.
Prefer paper first? The free Azure Cost Review Checklist (PDF) is the one-page meeting agenda version.
FAQ
Where do I start?
Phase 1, always. Waste removal needs no performance analysis, no owner debate, no commitment — it's free money and it builds the credibility to do the harder phases.
How much will I save?
Industry estimates put avoidable cloud spend at around 25–30%. Your defensible number comes from pricing findings against your actual bill, which is exactly why our tool reads Cost Management data rather than guessing from list prices.
Monthly or quarterly review?
Monthly. Churn never stops; quarterly reviews let waste run for 90 days. The checklist gets faster every cycle.
Related reading: finding orphaned disks · Azure Hybrid Benefit checklist · RIs vs Savings Plans · what is FOCUS?