
Python Generators: Process Millions of Items Without Running Out of Memory

Advanced · 30 min · 7 exercises · 125 XP

Picture an assembly line in a factory. Workers don't build every product at once and pile them up — they build one item, pass it down the line, then build the next. The factory floor stays clean, and resources stay manageable.

That's what generators do for your code. In the previous tutorial, you learned how to build iterators with classes and __iter__/__next__. Generators give you the same power with a fraction of the code — just use yield instead of return.

What Makes yield Different from return?

When a function hits return, it's done. The function finishes, its local variables are destroyed, and it hands back a single value. A yield is different — it pauses the function, saves its state, and hands back a value. When you ask for the next value, the function resumes right where it left off.

Regular function with return
def get_squares(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result  # All at once

print(get_squares(5))  # [0, 1, 4, 9, 16]

Generator function with yield
def gen_squares(n):
    for i in range(n):
        yield i ** 2  # One at a time

for sq in gen_squares(5):
    print(sq)

The regular function builds the entire list in memory, then returns it. The generator produces one value, pauses, waits for the next request, and continues. For 5 items this doesn't matter much. For 5 million items, the list version has to hold every value in memory at once, while the generator only ever holds the current one.

How Do You Create and Use a Generator?

Step-by-step generator execution

Notice that "Starting countdown!" doesn't print until the first next() call. The function body is completely frozen until you ask for a value. After yielding 1, the next next() call would print "Blastoff!" and then raise StopIteration.
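The behavior described above can be reproduced with a sketch like the following. The interactive example itself is not shown here, so the function name countdown and the starting value 3 are assumptions chosen to match the description.

```python
def countdown(n):
    # Hypothetical reconstruction: the name and start value are assumptions.
    print("Starting countdown!")
    while n > 0:
        yield n          # pause here and hand n back to the caller
        n -= 1
    print("Blastoff!")   # runs only when the caller asks past the last value


gen = countdown(3)       # nothing is printed yet: the body has not started
print(next(gen))         # prints "Starting countdown!" and then 3
print(next(gen))         # 2
print(next(gen))         # 1

try:
    next(gen)            # prints "Blastoff!" and then raises StopIteration
except StopIteration:
    print("Generator exhausted")
```

A for loop handles StopIteration for you, which is why you rarely see it when iterating normally.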

Your First Generator
Write Code

Write a generator function called evens_up_to(n) that yields all even numbers from 2 up to and including n.

for e in evens_up_to(10):
    print(e)
# Output (one value per line): 2 4 6 8 10

What Are Generator Expressions?

Just like list comprehensions create lists, generator expressions create generators. The syntax is identical — just swap the square brackets for parentheses.

List comprehension (eager)
squares = [x**2 for x in range(10)]
print(type(squares))  # <class 'list'>

Generator expression (lazy)
squares = (x**2 for x in range(10))
print(type(squares))  # <class 'generator'>

Generator expressions are perfect for one-shot transformations where you don't need to keep the results. You can pass them directly into functions like sum(), max(), and min().

Generator expressions with built-in functions
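Here is a minimal sketch of passing generator expressions straight into built-ins. The variable names are illustrative only. When a generator expression is the sole argument to a function, you can drop the extra pair of parentheses.

```python
numbers = range(1, 11)

total = sum(x ** 2 for x in numbers)          # 385
largest = max(x * 3 for x in numbers)         # 30
smallest = min(abs(x - 5) for x in numbers)   # 0
has_big = any(x > 9 for x in numbers)         # True

print(total, largest, smallest, has_big)

# A generator expression is one-shot: once consumed, it is empty.
squares = (x ** 2 for x in numbers)
first_pass = sum(squares)    # 385
second_pass = sum(squares)   # 0, because nothing is left to yield
print(first_pass, second_pass)
```

If you need to go over the values more than once, build a list instead or create a fresh generator each time.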
Predict the Generator Output
Predict Output

What will this code print? Think carefully — generator expressions are lazy.

def make_gen():
    return (x * 10 for x in range(4))

g = make_gen()
next(g)
print(next(g))
print(sum(g))

Write two print() statements that produce the exact same output.


How Much Memory Do Generators Actually Save?

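The memory difference is dramatic. In CPython, a list of 10 million integers needs roughly 80 MB for its array of references alone, and the integer objects themselves add more on top of that. A generator that yields the same values is a small, fixed-size object, on the order of a couple hundred bytes (the exact figure varies by Python version), regardless of how many items it produces.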

Memory comparison: list vs generator
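One way to see the gap yourself is with sys.getsizeof, sketched below. Two caveats: getsizeof reports only the container object (for a list, its array of references, not the integers it points to), and the exact byte counts depend on your Python version and platform, so treat the printed numbers as indicative. The size of one million items is an arbitrary choice to keep the example fast.

```python
import sys

n = 1_000_000  # arbitrary size, chosen so the example runs quickly

as_list = [i for i in range(n)]   # every value exists in memory right now
as_gen = (i for i in range(n))    # nothing has been computed yet

list_bytes = sys.getsizeof(as_list)   # size of the list object itself
gen_bytes = sys.getsizeof(as_gen)     # size of the generator object

print(f"list:      {list_bytes:,} bytes")
print(f"generator: {gen_bytes:,} bytes")
```

The list's size grows with n, while the generator's size stays the same no matter how large n becomes.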
Refactor to a Generator
Refactor

Refactor this function to use yield instead of building and returning a list. The output should remain the same.

def get_powers_of_two(n):
    result = []
    power = 1
    for _ in range(n):
        result.append(power)
        power *= 2
    return result

for p in get_powers_of_two(6):
    print(p)

What Does yield from Do?

yield from delegates to another iterable, yielding each of its values as if you wrote yield for each one individually. It's especially useful for flattening nested structures or composing generators.

Without yield from
def flatten(nested):
    for sublist in nested:
        for item in sublist:
            yield item

With yield from
def flatten(nested):
    for sublist in nested:
        yield from sublist
Flattening nested lists with yield from
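A short usage sketch for the flatten generator defined above. The sample data is made up for illustration. Because yield from accepts any iterable, the sublists can also be tuples, strings, or other generators.

```python
def flatten(nested):
    for sublist in nested:
        yield from sublist   # yield every item of sublist, one at a time


flat = list(flatten([[1, 2], [3], [4, 5]]))
print(flat)   # [1, 2, 3, 4, 5]

# Any iterable works as a "sublist", not only lists.
mixed = list(flatten(["ab", (1, 2), range(3)]))
print(mixed)  # ['a', 'b', 1, 2, 0, 1, 2]
```

Note that this version flattens exactly one level of nesting.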
Flatten with yield from
Write Code

Write a generator function flatten_deep(data) that flattens a nested list of arbitrary depth. If an element is a list, recurse into it. Otherwise, yield the element.

for item in flatten_deep([1, [2, [3, 4], 5], [6]]):
    print(item)
# Output (one value per line): 1 2 3 4 5 6

How Do You Build a Generator Pipeline?

One of the most powerful patterns with generators is pipelining — chaining multiple generators together, where each one processes data from the previous one. Think of it as an assembly line: each station does one small transformation.

A three-stage generator pipeline

This pipeline processes numbers 1-8, filters for odd numbers (1, 3, 5, 7), and squares each one (1, 9, 25, 49). Each generator does exactly one job. No intermediate lists are created — data flows through lazily.
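A pipeline matching that description might look like the sketch below. The stage names numbers, only_odd, and squared are assumptions, since the interactive example is not reproduced here.

```python
def numbers(start, stop):
    # Stage 1: produce the raw values, start through stop inclusive.
    for n in range(start, stop + 1):
        yield n


def only_odd(seq):
    # Stage 2: pass along odd values, drop the rest.
    for n in seq:
        if n % 2 == 1:
            yield n


def squared(seq):
    # Stage 3: transform each value that reaches this stage.
    for n in seq:
        yield n * n


pipeline = squared(only_odd(numbers(1, 8)))   # no work has happened yet
result = list(pipeline)                        # values are pulled through here
print(result)   # [1, 9, 25, 49]
```

Building the pipeline does no work. Each value is pulled through all three stages only when the final consumer, here list(), asks for it.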

Build a Data Pipeline
Write Code

Build a three-stage generator pipeline:

1. words(sentence) — yields each word from the sentence (split by spaces)

2. long_words(seq, min_len) — yields only words with length >= min_len

3. uppercased(seq) — yields each word in uppercase

Then chain them: uppercased(long_words(words(sentence), 4))

Use the sentence: 'the quick brown fox jumps over the lazy dog'


Can You Communicate with a Running Generator?

Generators aren't just one-way streets. You can send values back into a paused generator, throw exceptions into it, or close it early.

Using send() to push values into a generator

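The send(value) method resumes the generator and makes the paused yield expression evaluate to value. The generator then runs until the next yield, and send() returns the value yielded there. Before sending a real value, you must advance the generator to its first yield with next() (or the equivalent send(None)); this is called "priming" the generator. Sending a non-None value to a generator that has not started yet raises TypeError.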
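To make the priming rule concrete, here is a minimal sketch of a generator that keeps a running total of the values sent into it. The name running_total is hypothetical and not taken from the interactive example.

```python
def running_total():
    # Hypothetical example: accumulates whatever the caller sends in.
    total = 0
    while True:
        value = yield total   # hand back the total, then wait for send()
        total += value


acc = running_total()
primed = next(acc)        # prime: runs to the first yield, returns 0
after_10 = acc.send(10)   # resumes with value = 10, returns 10
after_5 = acc.send(5)     # resumes with value = 5, returns 15
print(primed, after_10, after_5)

acc.close()               # stop the generator early

# Sending a real value before priming is an error.
unprimed = running_total()
try:
    unprimed.send(1)
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)   # TypeError
```

After close(), any further next() or send() call on acc raises StopIteration.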

Fix the Generator Bug
Fix the Bug

This generator should count words, yielding the running count after each word is sent. But it has a bug. Fix it.

def word_counter():
    count = 0
    while True:
        word = yield
        count += 1
        print(f'Word {count}: {word}')

Expected output when used:

Word 1: hello
Word 2: world
Word 3: python
Infinite Fibonacci Generator
Write Code

Write a generator function fibonacci() that yields an infinite sequence of Fibonacci numbers starting with 0, 1, 1, 2, 3, 5, 8, ...

Then use it to print the first 10 Fibonacci numbers.


What Should You Remember About Generators?

Generators are one of Python's most elegant features. They let you write lazy, memory-efficient iterators using simple functions with yield.

Key takeaways:

  • yield pauses the function and saves its state; return ends it
  • Generator expressions (x for x in seq) are the lazy counterpart of list comprehensions
  • A generator's memory footprint stays small and fixed no matter how many items it yields
  • yield from delegates to sub-iterators cleanly
  • Generator pipelines chain multiple transforms without intermediate lists
  • send() pushes values back into a paused generator

Next up: decorators, which let you wrap and enhance functions without modifying them.