Python Dataclasses: Less Boilerplate, More Readable Code
Imagine filling out a form where you have to write your name, address, and phone number. Now imagine filling out 50 forms where every single one asks for the same information in a slightly different format. That's what writing Python classes feels like sometimes.
Every time you create a class to hold data, you end up writing the same boring __init__, __repr__, and __eq__ methods. Dataclasses are Python's answer to this problem. They auto-generate all that repetitive code, so you can focus on what your class actually does.
In this tutorial, you'll learn how to use the @dataclass decorator, set default values, run validation with __post_init__, create immutable objects with frozen, and compare dataclass instances.
What Problem Do Dataclasses Solve?
Let's see the pain first. Here's a typical class that just stores some data about a book. Look at how much code it takes to do something simple.

class Book:
    def __init__(self, title, author, pages, price):
        self.title = title
        self.author = author
        self.pages = pages
        self.price = price

    def __repr__(self):
        # Field values elided in the original
        return f'Book(...)'

    def __eq__(self, other):
        # Remaining field comparisons elided in the original
        return (self.title == other.title and
                ...)

That's 16 lines of code just to store four values, display them nicely, and compare two books. Every time you add a new field, you have to update three different methods. It's tedious and error-prone.

Here is the same class written as a dataclass:

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    pages: int
    price: float

How Do You Create a Dataclass?
Import dataclass from the dataclasses module, put @dataclass above your class, and list your fields with type annotations. Python generates __init__, __repr__, and __eq__ for you automatically.
With just four lines, you get a class with an __init__ that accepts name, grade, and gpa, a __repr__ that shows all field values, and an __eq__ that compares all fields.
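As a rough sketch of such a class, assuming the three fields named above have the types str, int, and float (the sample values are made up for illustration):

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    grade: int
    gpa: float


# The generated __init__ accepts the fields in definition order.
alice = Student('Alice', 10, 3.9)

# The generated __repr__ shows every field name and value.
print(alice)  # Student(name='Alice', grade=10, gpa=3.9)
```

Keyword arguments work too, so Student(name='Alice', grade=10, gpa=3.9) builds an equal object.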
With a regular class, p1 == p2 would be False because Python compares memory addresses by default. Dataclasses compare the actual values of every field, which is almost always what you want.
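To make the difference concrete, here is a small side-by-side sketch. The PlainPoint and Point classes are hypothetical names chosen for this illustration; only the p1 == p2 comparison comes from the text above.

```python
from dataclasses import dataclass


class PlainPoint:
    """Regular class with no __eq__, so == falls back to identity."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


@dataclass
class Point:
    x: int
    y: int


# Two separate objects holding the same values.
plain_equal = PlainPoint(1, 2) == PlainPoint(1, 2)

p1 = Point(1, 2)
p2 = Point(1, 2)
data_equal = p1 == p2

print(plain_equal)  # False: different objects in memory
print(data_equal)   # True: every field matches
```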
How Do Default Values Work in Dataclasses?
You can give fields default values just like function arguments. Fields with defaults must come after fields without defaults — same rule as function parameters.
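A minimal sketch of both rules, using a hypothetical User class (the field names and defaults are invented for the example). The second class definition is wrapped in try/except because the decorator rejects it as soon as the class is created.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str              # required: no default
    role: str = 'member'   # optional
    active: bool = True    # optional


print(User('Ana'))                # User(name='Ana', role='member', active=True)
print(User('Ben', role='admin'))  # User(name='Ben', role='admin', active=True)

# Putting a field without a default after one with a default fails
# at class-definition time, before any instance is created.
ordering_error = None
try:
    @dataclass
    class Broken:
        role: str = 'member'
        name: str
except TypeError as exc:
    ordering_error = exc
    print('TypeError:', exc)
```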
But what if you want a default value that's a mutable object, like a list? You can't use tags: list = [] directly. With an ordinary class attribute, Python would share the same list between all instances, which is a classic bug, so @dataclass refuses such a definition and raises a ValueError.
The solution is field(default_factory=...). It calls a function to create a fresh default value for each new instance.
Each cart gets its own separate list. default_factory=list calls list() each time a new ShoppingCart is created, giving each instance a brand new empty list.
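A sketch of what such a ShoppingCart might look like. The owner and items field names are assumptions made for this example; the ShoppingCart name and default_factory=list come from the text above. The last part shows the decorator rejecting a bare [] default.

```python
from dataclasses import dataclass, field


@dataclass
class ShoppingCart:
    owner: str
    # list() is called once per new instance, so nothing is shared.
    items: list = field(default_factory=list)


cart_a = ShoppingCart('Ana')
cart_b = ShoppingCart('Ben')
cart_a.items.append('apple')

print(cart_a.items)  # ['apple']
print(cart_b.items)  # []

# A plain mutable default is rejected when the class is defined.
rejected = False
try:
    @dataclass
    class BadCart:
        items: list = []
except ValueError as exc:
    rejected = True
    print('ValueError:', exc)
```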
How Do You Validate Data with __post_init__?
Sometimes you need to run code right after an object is created — validate inputs, compute derived values, or format data. The special __post_init__ method runs automatically after the generated __init__ finishes.
The fahrenheit field has a default of 0.0, but __post_init__ immediately recalculates it from celsius. You never need to pass fahrenheit — it's always computed.
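One way this could look, as a sketch: the celsius and fahrenheit fields follow the description above, while the Temperature class name and the absolute-zero check are assumptions added to show validation alongside the computed value.

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    celsius: float
    fahrenheit: float = 0.0  # placeholder, overwritten in __post_init__

    def __post_init__(self):
        # Validation: reject values below absolute zero.
        if self.celsius < -273.15:
            raise ValueError('celsius is below absolute zero')
        # Derived value: always recomputed from celsius.
        self.fahrenheit = self.celsius * 9 / 5 + 32


boiling = Temperature(100.0)
print(boiling)  # Temperature(celsius=100.0, fahrenheit=212.0)

invalid_rejected = False
try:
    Temperature(-300.0)
except ValueError as exc:
    invalid_rejected = True
    print('ValueError:', exc)
```

Because the error is raised inside __post_init__, an invalid object is never handed back to the caller.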
What Are Frozen (Immutable) Dataclasses?
Sometimes you want an object that cannot be changed after creation. Think of it like a printed receipt — once it's printed, you can't edit the numbers. Setting frozen=True makes your dataclass immutable.
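A short sketch of the receipt analogy. The Receipt class and its fields are hypothetical. Assigning to a field of a frozen instance raises dataclasses.FrozenInstanceError, which is a subclass of AttributeError.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Receipt:
    item: str
    total: float


receipt = Receipt('coffee', 3.5)
print(receipt)  # Receipt(item='coffee', total=3.5)

blocked = False
try:
    receipt.total = 0.0  # any assignment after creation is refused
except FrozenInstanceError:
    blocked = True
    print('Receipt cannot be changed')

print(receipt.total)  # 3.5, the original value is untouched
```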
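Frozen dataclasses are also hashable as long as every field value is itself hashable, which means you can use them as dictionary keys or put them in sets. Regular dataclasses (which are mutable) are not hashable by default, because generating __eq__ without frozen=True sets __hash__ to None.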
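The following sketch contrasts the two cases. Coordinate and MutableCoordinate are made-up classes for illustration, and the coordinates are approximate sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


@dataclass
class MutableCoordinate:
    lat: float
    lon: float


# Frozen instances work as dictionary keys: equal fields give equal hashes.
names = {Coordinate(35.68, 139.69): 'Tokyo'}
lookup = names[Coordinate(35.68, 139.69)]
print(lookup)  # Tokyo

# Duplicates collapse in a set.
unique = {Coordinate(0.0, 0.0), Coordinate(0.0, 0.0)}
print(len(unique))  # 1

# A regular (mutable) dataclass cannot be hashed.
try:
    hash(MutableCoordinate(0.0, 0.0))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False
print(mutable_hashable)  # False
```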
How Do You Sort and Compare Dataclasses?
By default, dataclasses generate __eq__ for equality checks. If you want to use <, >, and sorting, add order=True to the decorator. Python will compare fields in the order they're defined, like comparing tuples.
Python compares major first, then minor, then patch — just like version numbers should be compared. This works because fields are compared in definition order, top to bottom.
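A sketch of the version-number ordering described above. The Version class name and the sample releases are assumptions; the major, minor, and patch fields come from the text.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int


releases = [
    Version(1, 10, 0),
    Version(1, 2, 5),
    Version(2, 0, 0),
    Version(1, 2, 4),
]

# Fields are compared as the tuple (major, minor, patch).
print(Version(1, 2, 5) < Version(1, 10, 0))  # True: minor 2 < minor 10

ordered = sorted(releases)
for version in ordered:
    print(version)
# Version(major=1, minor=2, patch=4)
# Version(major=1, minor=2, patch=5)
# Version(major=1, minor=10, patch=0)
# Version(major=2, minor=0, patch=0)
```

Comparing integers field by field avoids the string-sorting trap where '1.10.0' would sort before '1.2.5'.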
Practice Exercises
Create a dataclass called Product with three fields: name (str), price (float), and quantity (int). Create a product with name 'Laptop', price 999.99, and quantity 5, then print it.
Create a dataclass called Playlist with:
name (str): no default
songs (list): default to an empty list using field(default_factory=list)
rating (float): default to 0.0
Create a playlist called 'Road Trip', add 'Bohemian Rhapsody' and 'Hotel California' to its songs list, set its rating to 4.5, and print the playlist.
This dataclass has a bug — fields with defaults come before fields without defaults. Fix the field order so the output is:
Student(name='Alice', grade=10, gpa=3.9)
Create a dataclass called Rectangle with width (float), height (float), and area (float, default 0.0). Use __post_init__ to compute area as width * height automatically. Create a rectangle with width 5.0 and height 3.0, then print its area.
What does this code print? Think about what frozen=True does.

from dataclasses import dataclass

@dataclass(frozen=True)
class City:
    name: str
    population: int

c = City('Tokyo', 14000000)
print(c.name)
try:
    c.population = 0
except AttributeError:
    print('Cannot modify frozen dataclass')

Create a dataclass Student with order=True and two fields: gpa (float) and name (str). Note: put gpa first so sorting uses GPA as the primary key. Create three students: (3.5, 'Charlie'), (3.9, 'Alice'), (3.2, 'Bob'). Sort them and print each student's name and GPA.
Refactor this regular class into a dataclass. The output must remain the same:
Movie(title='Inception', director='Nolan', year=2010)
True
Replace the manual __init__, __repr__, and __eq__ with a @dataclass decorator.