Python Dataclasses vs Pydantic: Picking the Right Tool
New Python developers ask me which one to use. The honest answer is both, for different jobs. Using the wrong one adds latency or leaks bad data, and the two mistakes look identical in a stack trace.
The Split
Dataclasses are for internal, trusted data. You already validated it. You are passing it between functions. You need structure, not safety. Pydantic is for external, untrusted data. HTTP bodies, queue messages, config files. You need the validation.

    from dataclasses import dataclass

    from pydantic import BaseModel, EmailStr


    class SubscriberRequest(BaseModel):  # from HTTP, validate it
        email: EmailStr
        name: str


    @dataclass
    class Subscriber:  # internal, already clean
        id: str
        email: str
        name: str
        created_at: str

The HTTP handler receives a SubscriberRequest. Pydantic rejects bad email formats before my code runs. I then construct a Subscriber dataclass for internal use. No repeated validation on the hot path.
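The hand-off from request model to internal object could look roughly like the sketch below. The names `handle_signup` and `to_subscriber`, the UUID-based id, and the ISO timestamp are assumptions for illustration, not part of the original service. The email field is typed as plain `str` here so the sketch runs without the optional email-validator package that `EmailStr` depends on.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4

from pydantic import BaseModel, ValidationError


class SubscriberRequest(BaseModel):
    # Stand-in for the request model above; plain str instead of EmailStr
    # so the sketch has no dependency on email-validator.
    email: str
    name: str


@dataclass
class Subscriber:
    id: str
    email: str
    name: str
    created_at: str


def to_subscriber(request: SubscriberRequest) -> Subscriber:
    """Build the internal object from an already validated request."""
    return Subscriber(
        id=str(uuid4()),  # assumption: ids are generated at this point
        email=request.email,
        name=request.name,
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def handle_signup(payload: dict) -> Optional[Subscriber]:
    """Hypothetical handler: validate once, then pass on a plain dataclass."""
    try:
        request = SubscriberRequest(**payload)
    except ValidationError:
        return None  # a real handler would likely return a 4xx response
    return to_subscriber(request)
```

Everything downstream of `handle_signup` only ever sees a `Subscriber`, so Pydantic stays at the edge.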
Why The Split Matters
Pydantic validation is not free. On a high-throughput service I measured a 15% CPU drop by moving internal objects to dataclasses and keeping Pydantic only at the boundaries. The boundary is where errors originate. The interior is where you need speed.
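To get a feel for the construction cost on your own machine, a micro-benchmark along these lines is one option. It is not the measurement behind the 15% figure, which came from a running service. `EventModel`, `EventData`, and `compare` are hypothetical names, and the numbers will vary with the Pydantic version and hardware.

```python
import timeit
from dataclasses import dataclass

from pydantic import BaseModel


class EventModel(BaseModel):
    # Hypothetical internal object that is validated on every construction.
    id: str
    email: str
    name: str


@dataclass
class EventData:
    # Same fields, no runtime validation.
    id: str
    email: str
    name: str


FIELDS = {"id": "42", "email": "a@example.com", "name": "Ada"}


def time_construction(cls, number: int = 20_000) -> float:
    """Return the seconds taken to build `number` instances of cls."""
    return timeit.timeit(lambda: cls(**FIELDS), number=number)


def compare(number: int = 20_000) -> dict:
    """Time both classes with the same input and iteration count."""
    return {
        "pydantic": time_construction(EventModel, number),
        "dataclass": time_construction(EventData, number),
    }
```

Calling `print(compare())` prints both timings in seconds. A micro-benchmark only isolates object construction, so treat it as a hint and profile the real service before moving code around.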
This is the 'parse, don't validate' principle. Validate once at the edge, then trust the type system inside. It is also how I reason about security boundaries in the same codebase.
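One caveat worth seeing in code: dataclass annotations are not checked at runtime, so the trust inside the boundary rests on construction sites and a static type checker. The sketch below shows a badly typed `Subscriber` being accepted. The `mismatched_fields` helper is hypothetical, only handles plain class annotations like `str`, and is meant for tests or debugging, not the hot path.

```python
from dataclasses import dataclass, fields


@dataclass
class Subscriber:
    id: str
    email: str
    name: str
    created_at: str


# No error is raised here: dataclasses store whatever they are given.
unchecked = Subscriber(
    id=123,
    email=None,
    name="Ada",
    created_at="2024-01-01T00:00:00+00:00",
)


def mismatched_fields(obj) -> list:
    """List field names whose value is not an instance of the annotated class.

    Assumes every annotation is a plain class (no Optional, no generics,
    no string annotations).
    """
    return [
        f.name
        for f in fields(obj)
        if not isinstance(getattr(obj, f.name), f.type)
    ]
```

Running a static checker such as mypy over the code that constructs these objects is the usual way to keep the interior honest without paying for runtime validation.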
See the Pydantic docs for validation patterns. Use dataclasses wherever Pydantic would be overkill.