Agent Guardrails: Why Boundaries Matter in Production
Early agent workflows gave the agent everything. Full filesystem access. All shell commands. Every API key in reach. It felt productive until the agent deleted a file it should not have. Now I run every production-adjacent agent with explicit guardrails.
Three Layers
Tool allowlist: the agent only has the tools it needs. Not every tool I own. A content agent gets read-post and create-post. It does not get delete-user.
Path allowlist: the filesystem tools are restricted to specific directories. An agent working on a blog post cannot touch /etc or my SSH keys. The sandboxing is enforced in code, outside the prompt.
Dry-run default: destructive operations log what they would do but do not execute. The agent reports the plan. A human (or an orchestrator with more context) approves. Then the real call fires.
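Before the dry-run example, here is a minimal sketch of how the first two layers might look in code. It is not my exact setup: the registry, the `tools_for` and `check_path` helpers, and the `/srv/blog/posts` directory are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Hypothetical stand-ins for the tools the operator owns.
def read_post(slug: str) -> dict:
    return {"slug": slug}

def create_post(slug: str, body: str) -> dict:
    return {"created": slug}

def delete_user(user_id: str) -> dict:
    return {"deleted_user": user_id}

ALL_TOOLS = {
    "read-post": read_post,
    "create-post": create_post,
    "delete-user": delete_user,
}

# Layer 1: each agent is handed only the tools on its allowlist.
AGENT_TOOLS = {"content": {"read-post", "create-post"}}

def tools_for(agent: str) -> dict:
    """Return only the tools this agent may call; unknown agents get none."""
    allowed = AGENT_TOOLS.get(agent, set())
    return {name: fn for name, fn in ALL_TOOLS.items() if name in allowed}

# Layer 2: filesystem tools resolve every path and check it against allowed roots.
ALLOWED_ROOTS = [Path("/srv/blog/posts")]  # assumed directory, for illustration

def check_path(requested: str) -> Path:
    """Resolve '..' and symlinks, then reject anything outside the allowed roots."""
    resolved = Path(requested).resolve()
    for root in ALLOWED_ROOTS:
        try:
            resolved.relative_to(root.resolve())
            return resolved
        except ValueError:
            continue
    raise PermissionError(f"path outside allowlist: {requested}")
```

The point of resolving before checking is that a request like `/srv/blog/posts/../../../etc/passwd` is judged by where it actually lands, not by how it starts. Both checks live in the tool layer, so nothing the agent writes in its output can widen them.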
```python
# Production guardrail example
# (`tool` and `db` come from the agent framework and app code, not shown)
@tool
def delete_post(slug: str, dry_run: bool = True) -> dict:
    if dry_run:
        return {'would_delete': slug, 'dry_run': True}
    # real deletion only with dry_run=False
    db.delete(slug)
    return {'deleted': slug}
```

The agent calls delete_post('x') and gets a dry-run result. It has to call delete_post('x', dry_run=False) to actually delete. The second call requires explicit intent that my orchestrator logs.
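The approval step between the dry-run and the real call can be sketched roughly as follows. This is an illustration under assumptions, not my actual orchestrator: `run_destructive` and the `approve` callback are hypothetical names, and `delete_post` here is a stand-in that skips the real database call.

```python
import logging

logger = logging.getLogger("orchestrator")

def delete_post(slug: str, dry_run: bool = True) -> dict:
    """Stand-in for the tool above; the real one would call db.delete(slug)."""
    if dry_run:
        return {"would_delete": slug, "dry_run": True}
    return {"deleted": slug}

def run_destructive(tool, approve, **kwargs) -> dict:
    """Run the dry-run first; execute for real only if `approve` accepts the plan."""
    plan = tool(**kwargs, dry_run=True)
    logger.info("dry-run plan: %s", plan)
    if not approve(plan):
        logger.info("plan rejected: %s", plan)
        return plan
    # The explicit dry_run=False call is logged before it fires.
    logger.warning("executing with dry_run=False: %s", kwargs)
    return tool(**kwargs, dry_run=False)

# Usage: `approve` could be a human prompt or an orchestrator policy check.
rejected = run_destructive(delete_post, lambda plan: False, slug="x")
approved = run_destructive(delete_post, lambda plan: True, slug="x")
```

With this shape, the agent never passes `dry_run=False` itself. Only the wrapper does, and only after the approver has seen the plan.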
The Real Cost Of Skipping This
One bad deletion in production costs more than a thousand dry-runs. Insurance you do not buy until you need it is not insurance. I put guardrails in from day one on any agent that touches production data.
The Prompt Side
Prompts also have guardrails, but prompts are requests, not enforcement. 'Do not delete files' in a prompt is a polite suggestion. Code-level restrictions are what actually prevent damage.
See the Anthropic agent safety guide for patterns. Production agents need production guardrails, not production hopes.