Serverless: What Finally Broke at Scale
I've been quietly rooting for a pure-serverless stack to carry a client's production traffic on the AWS free tier. Five months in, most of it works. A few things broke in ways worth documenting so nobody else learns them the hard way.
What Held Up
Lambda for stateless HTTP handlers: fine. DynamoDB for key-based lookups: fine. SES for transactional email: fine. API Gateway for the public edge: fine. The boring path through serverless is genuinely boring, which is what you want.
What Broke
Three things broke under real load:
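- Cold starts on a Python Lambda with heavy imports. The first request after an idle period took 3 seconds. Fix: moved the expensive import out of module scope and into the code path that uses it, so a cold start only pays for it on requests that actually need it, and added a scheduled ping to keep an instance warm.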
- DynamoDB hot partition on a popular item. A single post's metrics counter got all the traffic. Fix: sharded the counter across N shard items and summed them at read time.
- SES sandbox limits. I forgot to request production access, and the email queue filled up silently until I noticed. Fix: requested production access and set up CloudWatch alarms on bounce and queue counts.
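The first two fixes are easier to picture in code. What follows is a minimal sketch under stated assumptions, not the production code: the `/report` path, the `handler` and `read_counter` names, and the stdlib `statistics` module standing in for the real heavy dependency are all hypothetical. The read side assumes the key layout of the write snippet further down (`pk` of `counter#<post_id>`, one item per shard, a numeric `views` attribute) and a boto3 `Table` resource passed in as `table`.

```python
def handler(event, context):
    """Lambda-style handler where only the /report path needs the heavy module."""
    if event.get('path') == '/report':
        # Function-level import: runs on the first /report request in this
        # execution environment, then is served from sys.modules.
        # `statistics` is a stand-in for the real heavy dependency (assumption).
        import statistics

        return {'statusCode': 200, 'body': str(statistics.mean(event['values']))}
    # Other paths, including a scheduled warm-up ping, never load it.
    return {'statusCode': 200, 'body': 'ok'}


def read_counter(table, post_id):
    """Sum the `views` attribute across every shard item for one post."""
    resp = table.query(
        KeyConditionExpression='pk = :pk',
        ExpressionAttributeValues={':pk': f'counter#{post_id}'},
    )
    # One query page is assumed to cover all shards (fine for small N).
    # boto3 returns numbers as Decimal; shards never written are simply absent.
    return int(sum(item.get('views', 0) for item in resp['Items']))
```

The trade-off with the lazy import is that the first request on the heavy path still pays the load cost; it just stops taxing every other path. The read side costs one query per counter read, which is the price of spreading the writes.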
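# Sharded counter write: each increment lands on one randomly chosen shard item
import random
import boto3

SHARDS = 10  # assumed shard count
table = boto3.resource('dynamodb').Table('metrics')  # assumed table name
shard = random.randint(0, SHARDS - 1)
table.update_item(
    Key={'pk': f'counter#{post_id}', 'sk': f'shard#{shard}'},
    # '#v' aliases the attribute name so it cannot clash with a DynamoDB reserved word
    UpdateExpression='ADD #v :one',
    ExpressionAttributeNames={'#v': 'views'},
    ExpressionAttributeValues={':one': 1},
)
The Verdict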
Serverless earns its keep for workloads that are spiky, mostly idle, and bounded in complexity. It is not a panacea. Every problem above was fixable, but each one required going back to fundamentals. The tools hide complexity; they don't remove it.
See the Lambda best practices docs.