An AWS Cost-Cutting Checklist, in the Order We Actually Work It

Your AWS bill crept up again, someone in a meeting said “we should optimize our cloud spend,” and now you’re staring at a Cost Explorer dashboard with forty service line items wondering where to start.

Most cost-cutting advice doesn’t help here. Half of it is obvious (“turn off what you don’t use”), and the other half sends you down a rabbit hole — chasing a $12/month savings on a Lambda’s memory setting while a forgotten NAT Gateway quietly bills $400.

When we review a client’s bill, we work it in a deliberate order: biggest, easiest wins first, and rearchitecting last. Here’s that checklist, and — the part most cost lists leave out — the optimizations we deliberately don’t bother with.

Start with the bill itself, not a blog post

Before changing anything, find out where the money actually goes. Open Cost Explorer, group by service, and look at the top five line items. In the accounts we review, the top five are usually 80%+ of spend. Everything below them is noise until the top is handled.
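
If you export the per-service totals, the "top five" check is a few lines of arithmetic. The sketch below assumes you already have a mapping of service name to cost, however you pulled it out of Cost Explorer; the figures in `example_costs` are invented for illustration, not from a real account.

```python
def top_n_share(cost_by_service, n=5):
    """Return the n largest line items and their share of total spend."""
    ranked = sorted(cost_by_service.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(cost_by_service.values())
    top = ranked[:n]
    share = sum(cost for _, cost in top) / total if total else 0.0
    return top, share


# Hypothetical three-month totals in USD (made up for this example).
example_costs = {
    "EC2": 4200.0,
    "RDS": 1900.0,
    "NAT Gateway": 800.0,
    "S3": 600.0,
    "Data Transfer": 500.0,
    "CloudWatch": 250.0,
    "Lambda": 40.0,
    "Route 53": 10.0,
}

top, share = top_n_share(example_costs)
for service, cost in top:
    print(f"{service:<15} ${cost:>9,.2f}")
print(f"Top {len(top)} share of spend: {share:.0%}")
```

If the share comes out well under 80%, spend is spread thin and grouping by tag will probably tell you more than grouping by service.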

Two things make this faster:

Turn on cost allocation tags (Billing → Cost allocation tags) and group by tag — Environment, Team, or Project. If you’ve never tagged anything, that’s finding number one: you can’t manage what you can’t attribute.
Look at the last 3 months, not last month. A single spiky month sends you chasing the wrong thing. Trends tell the truth.

You’re looking for two patterns: a line item that’s bigger than it should be (right-sizing) and a line item that shouldn’t exist at all (waste). Handle waste first — it’s free money with no performance trade-off.

Kill the quiet money sinks

These are the orphans and idle resources nobody owns. Individually small, collectively often 10–20% of a bill:

Unattached EBS volumes and old snapshots. Non-root volumes — or any volume where DeleteOnTermination was disabled — keep billing after their instance is gone (root volumes delete by default). Snapshots accumulate forever unless you have a lifecycle policy.
gp2 volumes that should be gp3. gp3 is roughly 20% cheaper per GB and lets you set IOPS independently of size. You can change the volume type live — no downtime, no detach — so for most workloads it’s a genuinely no-downside swap.
Idle NAT Gateways and stray public IPv4. A NAT Gateway bills hourly plus per-GB data processing — it’s a classic surprise. And since February 2024, every public IPv4 address bills hourly — attached or not — at $0.005/hr (~$3.60/month each). An account with dozens of public IPs spread across EC2 instances, NAT Gateways, and load balancers is leaking real money; audit with VPC IPAM’s Public IP Insights and move what you can to IPv6 or behind a shared egress.
Data transfer you can’t point at a resource. Internet egress and cross-AZ traffic are among the most common surprise line items — and unlike an idle volume, there’s nothing to delete. Check the data-transfer rows in Cost Explorer; the usual culprits are chatty cross-AZ services and unnecessary internet egress — which, from private subnets, also racks up NAT data-processing charges on top. The deeper egress-architecture story deserves its own post.
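Non-prod running 24/7. Dev and staging rarely need nights and weekends. A scheduler that stops them outside business hours cuts their compute hours by ~70% (you keep maybe 50 of 168 weekly hours). Attached EBS storage keeps billing while an instance is stopped, so the saving is on compute, not the whole environment.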
Forgotten everything — old load balancers, abandoned RDS instances, log groups with no retention set quietly growing in CloudWatch.
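
Two of the figures above are simple enough to check yourself. This is a minimal sketch of the arithmetic, assuming a 30-day (720-hour) month for the IPv4 number and the 50-of-168-hours schedule mentioned above; the function names are ours, not part of any AWS tool.

```python
HOURS_PER_MONTH = 720      # 30-day month, the basis for the ~$3.60 figure
IPV4_HOURLY_USD = 0.005    # per public IPv4 address, as quoted above
HOURS_PER_WEEK = 168


def ipv4_monthly_cost(address_count, hourly=IPV4_HOURLY_USD, hours=HOURS_PER_MONTH):
    """Monthly charge for a number of public IPv4 addresses."""
    return address_count * hourly * hours


def scheduler_savings_fraction(hours_kept_per_week, hours_in_week=HOURS_PER_WEEK):
    """Fraction of compute hours removed by stopping instances off-hours.

    Covers compute hours only; storage attached to stopped instances
    still bills.
    """
    return 1 - hours_kept_per_week / hours_in_week


print(f"1 address:    ${ipv4_monthly_cost(1):.2f}/month")
print(f"40 addresses: ${ipv4_monthly_cost(40):.2f}/month")
print(f"Scheduler:    {scheduler_savings_fraction(50):.0%} of compute hours removed")
```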

None of this touches production performance. It’s the highest return-on-effort work on the list, so do it first.
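
For the EBS items, a first pass can be a filter over the volume list. The sketch below assumes input shaped like the response from boto3's EC2 `describe_volumes` call (a `Volumes` list with `VolumeId`, `State`, `Size`, and `VolumeType` keys); the sample data and helper names are hypothetical, and a real account would need pagination and one pass per region.

```python
def unattached_volumes(describe_volumes_response):
    """Volumes in the 'available' state, i.e. not attached to any instance."""
    return [
        (v["VolumeId"], v["Size"], v["VolumeType"])
        for v in describe_volumes_response.get("Volumes", [])
        if v.get("State") == "available"
    ]


def gp2_in_use(describe_volumes_response):
    """Attached gp2 volumes: candidates for a live change to gp3."""
    return [
        v["VolumeId"]
        for v in describe_volumes_response.get("Volumes", [])
        if v.get("State") == "in-use" and v.get("VolumeType") == "gp2"
    ]


# Invented sample in the same shape as a describe_volumes response.
sample_response = {
    "Volumes": [
        {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100, "VolumeType": "gp3"},
        {"VolumeId": "vol-bbb", "State": "available", "Size": 500, "VolumeType": "gp2"},
        {"VolumeId": "vol-ccc", "State": "in-use", "Size": 200, "VolumeType": "gp2"},
    ]
}

orphans = unattached_volumes(sample_response)
to_convert = gp2_in_use(sample_response)
print("Unattached:", orphans)
print("gp2 in use:", to_convert)
```

Treat the unattached list as a review queue, not a delete list: snapshot or confirm ownership before removing anything.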

Right-size before you reserve

The most common mistake we see is buying a Savings Plan or Reserved Instance on top of oversized infrastructure, which locks in a one- or three-year commitment to pay for waste.

So right-size first. Pull the last few weeks of CloudWatch metrics and look for instances sitting at single-digit CPU. EC2 doesn’t publish memory metrics by default, so install the CloudWatch agent if you want RAM in the picture before you downsize. Drop the idle instances a size, or two. AWS Compute Optimizer will surface the obvious candidates for free.
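
As a rough illustration of the "single-digit CPU" screen, the sketch below assumes you have already collected average CPU percentages per instance (for example, daily averages of the CloudWatch `CPUUtilization` metric). The 10% threshold, the instance names, and the sample data are our assumptions; peak usage is reported alongside the average so a bursty workload isn't downsized on the mean alone.

```python
from statistics import mean


def rightsizing_candidates(cpu_by_instance, avg_threshold=10.0):
    """Return (instance_id, average_cpu, peak_cpu) for low-utilization instances.

    cpu_by_instance maps an instance id to a list of CPU percentages.
    Instances with no samples are skipped rather than flagged.
    """
    candidates = []
    for instance_id, samples in cpu_by_instance.items():
        if not samples:
            continue
        avg = mean(samples)
        if avg < avg_threshold:
            candidates.append((instance_id, round(avg, 1), max(samples)))
    # Lowest average first: the most obviously oversized instances lead.
    return sorted(candidates, key=lambda c: c[1])


# Hypothetical samples, invented for illustration.
cpu_samples = {
    "web-1": [4.0, 6.0, 5.0, 7.0, 3.0],
    "batch-1": [2.0, 3.0, 85.0, 4.0, 2.0],
    "api-1": [45.0, 52.0, 60.0, 48.0, 55.0],
    "cron-1": [1.0, 2.0, 1.0, 1.0, 2.0],
    "new-1": [],
}

candidates = rightsizing_candidates(cpu_samples)
for instance_id, avg, peak in candidates:
    print(f"{instance_id}: avg {avg}% / peak {peak}%")
```

CPU is only one dimension. Check memory and network before acting on any instance this flags.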

Only commit to what survives right-sizing.

Move steady-state compute to Graviton

Once an instance is sized correctly and you know it’s going to run continuously, the next lever is the processor underneath it. AWS’s Graviton (ARM) instances generally deliver meaningfully better price/performance than the equivalent x86 — and the savings compound on anything that runs 24/7.

The migration is usually less painful than teams fear: most interpreted and managed runtimes (Node, Python, Java, most containers) run on ARM with little or no change. The work is in your build pipeline and any native dependencies. We walked through the EC2 side of this in Migrating your x86 EC2 web servers to Graviton2 instances, and in one stack — pairing Graviton with our Kyte framework — we documented a 64% cost reduction. Your mileage depends on the workload, but the price/performance gap is real and worth testing.

Commit only what’s genuinely steady

Now you reserve. With infrastructure right-sized and on the right processor, Compute Savings Plans are the simplest high-value commitment. They apply across instance families, sizes, and regions, and AWS advertises discounts of up to 66% off on-demand rates in exchange for a one- or three-year commitment.

The honest trade-off: a Savings Plan is a financial commitment, not a technical one. Don’t cover 100% of usage. Cover your reliable baseline and leave the variable top on on-demand, or on Spot for stateless, fault-tolerant workloads. Over-committing to lock in a slightly better rate is how teams end up paying for capacity they no longer use.
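
To make the baseline-versus-peak point concrete, here is a simplified sketch. It measures usage and commitment in the same abstract "units per hour" and ignores how Savings Plans are actually denominated and discounted, so it shows the shape of the trade-off, not a billing calculation. The usage series is invented.

```python
def commitment_outcome(hourly_usage, commitment):
    """Split usage into covered, overflow (on-demand), and unused commitment.

    hourly_usage: usage per hour, in any consistent unit.
    commitment:   committed amount per hour, in the same unit.
    """
    covered = sum(min(u, commitment) for u in hourly_usage)
    overflow = sum(max(u - commitment, 0) for u in hourly_usage)
    unused = sum(max(commitment - u, 0) for u in hourly_usage)
    return {"covered": covered, "overflow": overflow, "unused": unused}


# Hypothetical usage over eight hours: a steady floor with a daytime bump.
usage = [10, 10, 12, 18, 25, 14, 10, 10]

baseline_plan = commitment_outcome(usage, commitment=min(usage))
peak_plan = commitment_outcome(usage, commitment=max(usage))

print("Commit to baseline:", baseline_plan)
print("Commit to peak:    ", peak_plan)
```

Committing to the floor wastes nothing and leaves the bump on on-demand. Committing to the peak covers everything but pays for unused commitment in every off-peak hour.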

Then — and only then — rearchitect

Architecture changes are the biggest potential savings and the biggest effort and risk, which is why they come last, not first:

Does it need a server at all? A surprising amount of “web hosting” spend is a static site behind an over-provisioned EC2 box. Moving it to S3 + CloudFront can drop the cost to near-nothing — we run sites for less than a dollar a month this way.
Serverless vs. always-on. For spiky or low-baseline workloads, Lambda + managed services can beat an idle fleet — but for steady, high-throughput traffic, always-on compute is often cheaper. It’s a genuine trade-off, not a default; we broke it down in Serverless vs. Traditional Servers.

Rearchitecting pays the most, but it spends engineering time and adds risk. Bank the free wins above first; reach for this when the line item is big enough to justify the project.

What we usually don’t bother with

Knowing what to skip keeps you out of the rabbit hole:

Penny-tuning Lambda memory unless the function is genuinely high-volume. The engineering hour costs more than the saving.
Spot for stateful production without a real interruption-handling plan. The savings are tempting; a two-minute interruption notice in the middle of a transaction is not.
Chasing every last service. If it’s not in the top five line items, it’s usually not worth a meeting.
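
The skip-or-do decision is mostly a payback calculation. This sketch compares the $12/month Lambda tweak and the $400/month NAT Gateway from the introduction; the hours and the $120 hourly rate are hypothetical.

```python
def payback_months(engineering_hours, hourly_rate, monthly_saving):
    """Months until an optimization pays back the engineering time spent on it."""
    if monthly_saving <= 0:
        return float("inf")
    return engineering_hours * hourly_rate / monthly_saving


HOURLY_RATE = 120  # assumed loaded engineering cost, USD per hour

lambda_tuning = payback_months(3, HOURLY_RATE, monthly_saving=12)
nat_cleanup = payback_months(2, HOURLY_RATE, monthly_saving=400)

print(f"Lambda memory tuning: {lambda_tuning:.1f} months to pay back")
print(f"NAT Gateway cleanup:  {nat_cleanup:.1f} months to pay back")
```

Anything that takes years to pay back belongs on the skip list; anything that pays back inside a month belongs at the top of it.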

The order is the point

The reason this works isn’t any single trick — it’s the sequence. Free waste removal, then right-sizing, then the right processor, then commitments on what’s left, then architecture. Each step makes the next one cheaper and safer, and you stop long before the effort outweighs the savings.

If your bill has crept past where it should be and you’d rather not spend a sprint spelunking through Cost Explorer, that’s the kind of review we do — the first pass typically finds enough waste to cover the engagement.