How Python Works Under the Hood: Memory, GIL, and Bytecode

Expert · 30 min · 5 exercises · 100 XP

Every time you type x = 42, a lot happens behind the scenes. Python obtains an integer object, binds the name x to it, and manages the object's lifecycle automatically. Understanding these internals doesn't just satisfy curiosity -- it helps you write faster code, debug memory issues, and nail technical interviews.

What Is CPython?

When people say "Python," they usually mean CPython -- the reference implementation written in C. It's the version you download from python.org. Other implementations exist (PyPy, Jython, IronPython), but CPython is by far the most common.

CPython works in two stages: first it compiles your source code into bytecode (cached on disk as .pyc files for imported modules), then it executes that bytecode on a virtual machine. Let's see this in action.
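
The two stages can be observed separately with the built-ins compile() and exec(). This is a minimal sketch; the source string, the "<example>" filename, and the variable names are made up for illustration.

```python
# Stage 1: compile source text into a code object. Nothing runs yet.
source = "x = 40 + 2"
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is a bytes object stored on the code object.
print(type(code_obj.co_code))  # <class 'bytes'>

# Stage 2: the interpreter executes the compiled bytecode.
namespace = {}
exec(code_obj, namespace)
print(namespace["x"])  # 42
```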


Bytecode: Python's Secret Language

The dis module lets you peek at the bytecode instructions that CPython generates. Each instruction is a simple operation like "load a value" or "call a function." Let's disassemble a simple function:
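
As a minimal sketch, here is a two-argument function (the name add is only an illustration) passed to dis.dis(). The exact listing depends on your CPython version, so no expected output is shown.

```python
import dis

def add(a, b):
    return a + b

# Print one line per bytecode instruction: offset, opcode name, argument.
dis.dis(add)
```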

You'll see instructions like LOAD_FAST, BINARY_ADD (BINARY_OP in Python 3.11 and later), and RETURN_VALUE; the exact names vary between CPython versions. Each one is handled by a block of C code in CPython's evaluation loop. The bytecode is stack-based: values are pushed onto a stack, and operations pop them off and push results back.
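
To make the stack-based model concrete, here is a toy interpreter. It is a teaching model only, not CPython's real evaluation loop; the opcode names LOAD, ADD, and RETURN and the run function are invented for this sketch.

```python
def run(program):
    """Execute a list of (opname, arg) pairs on a toy value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD":
            stack.append(arg)           # push a value
        elif opname == "ADD":
            right = stack.pop()         # pop both operands...
            left = stack.pop()
            stack.append(left + right)  # ...and push the result
        elif opname == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opname}")
    raise ValueError("program ended without RETURN")

# Roughly the shape of "return 2 + 3" in this toy instruction set.
toy_program = [("LOAD", 2), ("LOAD", 3), ("ADD", None), ("RETURN", None)]
print(run(toy_program))  # 5
```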

Object Identity: id() and is

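Every Python object has an identity that is unique for as long as the object is alive, accessible via id(). In CPython, id() returns the object's memory address. The is operator checks whether two variables point to the same object in memory, while == checks whether they have the same value.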
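
A short sketch of the difference, using three illustrative names bound to lists:

```python
a = [1, 2, 3]
b = a          # b is a second name for the same list object
c = [1, 2, 3]  # c is a new list with equal contents

print(a == c)          # True: same value
print(a is c)          # False: two distinct objects
print(a is b)          # True: one object, two names
print(id(a) == id(b))  # True: identical identity

# Mutating through one name is visible through the other.
b.append(4)
print(a)  # [1, 2, 3, 4]
```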

Integer Caching and String Interning

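CPython caches small integers from -5 to 256 at startup. Every variable assigned one of these values points to the same pre-created object. This saves memory and speeds up common operations. CPython also interns some strings, such as identifier-like literals, so that equal strings can share a single object. Both behaviors are implementation details, so use == rather than is to compare values.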
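
The sketch below probes both behaviors on CPython. The values are built at runtime because the compiler may merge equal literals within one file, which would hide the effect. The results of is here are CPython implementation details and can differ on other implementations.

```python
import sys

# Build the numbers at runtime so the compiler cannot merge equal literals.
small_a = int("256")
small_b = int("256")
big_a = int("257")
big_b = int("257")

print(small_a is small_b)  # True on CPython: 256 is inside the cache range
print(big_a == big_b)      # True: equal values
print(big_a is big_b)      # typically False on CPython: two separate objects

# sys.intern returns one shared copy for equal strings.
s1 = sys.intern("".join(["hello", " ", "world"]))
s2 = sys.intern("".join(["hello", " ", "world"]))
print(s1 is s2)            # True
```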

Reference Counting and Garbage Collection

CPython uses reference counting as its primary memory management strategy. Every object has a counter tracking how many references point to it, whether from variables, containers, or attributes. When the count drops to zero, the object is deallocated immediately.
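
One way to observe immediate deallocation is a weak reference, which watches an object without adding to its count. This is a sketch that assumes CPython; the Resource class is a made-up placeholder.

```python
import weakref

class Resource:
    """Placeholder class; instances of plain classes support weak references."""

obj = Resource()
probe = weakref.ref(obj)  # does not increase the reference count

print(probe() is obj)     # True: the object is still alive

del obj                   # drop the only strong reference
print(probe() is None)    # True on CPython: the object was freed at once
```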

Reference counting can't handle circular references -- for example, two objects that reference each other, whose counts never drop to zero. That's where Python's cyclic garbage collector (the gc module) steps in. It periodically scans container objects for unreachable reference cycles and cleans them up.
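
The following sketch builds a two-object cycle and shows that it survives del until the collector runs. Automatic collection is switched off briefly so the timing is predictable; the Node class and the partner attribute are illustrative names.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.partner = None

gc.disable()  # keep the automatic collector out of the way for the demo

a = Node()
b = Node()
a.partner = b
b.partner = a            # a and b now form a reference cycle
probe = weakref.ref(a)   # lets us check whether a is still alive

del a, b                 # no names left, but the cycle keeps both counts above zero
alive_before = probe() is not None

gc.collect()             # the cycle detector finds and frees the pair
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)  # True False
```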

The Global Interpreter Lock (GIL)

The GIL is a mutex that allows only one thread to execute Python bytecode at a time. Even on a 16-core machine, only one thread runs Python code at any given moment. This simplifies CPython's memory management but limits true parallelism for CPU-bound tasks.
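
A rough way to see the effect is to time a CPU-bound loop run twice in sequence and then split across two threads. This is a sketch, not a benchmark: count_down and N are arbitrary choices, and timings vary by machine. On a standard GIL build the threaded run is usually no faster.

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU-bound work: no I/O, so the GIL is held while it runs.
    while n > 0:
        n -= 1

N = 1_000_000

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential:  {sequential:.3f}s")
print(f"two threads: {threaded:.3f}s")
```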

Note: Python 3.13 introduces an experimental free-threaded build (PEP 703) in which the GIL can be disabled. It is optional and not the default, and it marks a major ongoing change in the Python ecosystem.
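
To check which kind of interpreter you are running, the sketch below combines two probes. sys._is_gil_enabled() exists only on Python 3.13 and later, so it is looked up defensively; on older versions the code assumes the GIL is active.

```python
import sys
import sysconfig

# Truthy only when the interpreter was compiled as a free-threaded build.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() was added in 3.13; older versions always have a GIL.
check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = check() if check is not None else True

print(f"free-threaded build: {free_threaded_build}")
print(f"GIL currently enabled: {gil_enabled}")
```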


__dict__ vs __slots__: Memory Layout

By default, each instance of a user-defined class stores its attributes in a dictionary (__dict__). This is flexible but uses more memory. Declaring __slots__ replaces the per-instance dictionary with a fixed set of attribute slots, saving memory when you have many instances.

Default __dict__:

class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointDict(1, 2)
print(p.__dict__)  # {'x': 1, 'y': 2}
p.z = 3  # Allowed! Dynamic attribute

With __slots__:

class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointSlots(1, 2)
# p.__dict__  # AttributeError!
# p.z = 3     # AttributeError!
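
The memory difference can be estimated with sys.getsizeof(). This is a rough sketch: getsizeof is shallow, so the instance dictionary is added by hand, and the exact byte counts depend on the CPython version and platform.

```python
import sys

class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p_dict = PointDict(1, 2)
p_slots = PointSlots(1, 2)

# getsizeof does not follow references, so count the __dict__ separately.
dict_total = sys.getsizeof(p_dict) + sys.getsizeof(p_dict.__dict__)
slots_size = sys.getsizeof(p_slots)

print(f"PointDict instance + __dict__: {dict_total} bytes")
print(f"PointSlots instance:           {slots_size} bytes")

# A slotted instance rejects attributes that were not declared.
try:
    p_slots.z = 3
except AttributeError as exc:
    print("AttributeError:", exc)
```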

Practice Exercises

Disassemble a Function
Write Code

Write a function called multiply(a, b) that returns a * b. Then use dis.dis() to disassemble it. After disassembling, print Done on a new line.

Predict Integer Caching
Predict Output

What does the following code print?

a = 100
b = 100
print(a is b)

x = [1, 2]
y = [1, 2]
print(x is y)
Count References
Write Code

Write code that:

1. Creates a list data = [10, 20, 30]

2. Prints the reference count of data using sys.getrefcount()

3. Creates alias = data (a second reference)

4. Prints the reference count of data again

5. Deletes alias with del alias

6. Prints the reference count of data one more time

Note: sys.getrefcount() itself adds a temporary reference, so the count is always 1 higher than you might expect.

Slots vs Dict Memory
Write Code

Create two classes:

1. PersonDict with __init__(self, name, age) that sets self.name and self.age (uses default __dict__)

2. PersonSlots with __slots__ = ['name', 'age'] and the same __init__

Create one instance of each. Print whether each has a __dict__ attribute using hasattr(). Expected output:

PersonDict has __dict__: True
PersonSlots has __dict__: False
Fix the Identity Bug
Fix the Bug

The code below uses is to compare values, which is unreliable. Fix it to use proper value comparison so that all three comparisons print True.
