How Python Works Under the Hood: Memory, GIL, and Bytecode
Every time you type x = 42, a lot happens behind the scenes. Python creates an integer object in memory, assigns a reference to it, and manages its lifecycle automatically. Understanding these internals doesn't just satisfy curiosity -- it helps you write faster code, debug memory issues, and nail technical interviews.
What Is CPython?
When people say "Python," they usually mean CPython -- the reference implementation written in C. It's the version you download from python.org. Other implementations exist (PyPy, Jython, IronPython), but CPython is by far the most common.
CPython works in two stages: first it compiles your source code into bytecode (cached in .pyc files for imported modules), then it executes that bytecode on its virtual machine. Let's see this in action.
Bytecode: Python's Secret Language
The dis module lets you peek at the bytecode instructions that CPython generates. Each instruction is a simple operation like "load a value" or "call a function." Let's disassemble a simple function:
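A minimal sketch of that disassembly, assuming a hypothetical two-argument add function (the name is illustrative, not from the original text). Exact opcode names depend on the CPython version, so the printed listing will differ between releases.

```python
import dis


def add(a, b):
    """Hypothetical example function to disassemble."""
    return a + b


# Print a human-readable listing of the bytecode for add().
dis.dis(add)

# Collect the instruction names programmatically.
# The exact names vary by CPython version.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```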
You'll see instructions like LOAD_FAST, BINARY_ADD (BINARY_OP on Python 3.11 and later), and RETURN_VALUE. Each one is handled by C code inside the CPython interpreter's evaluation loop. The bytecode is stack-based: values are pushed onto a stack, and operations pop them off and push results back.
Object Identity: id() and is
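Every Python object has an identity that is unique for as long as the object exists, accessible via id(). In CPython, id() returns the object's memory address. The is operator checks whether two variables point to the same object in memory, while == checks whether they have the same value.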
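The difference between identity and equality can be sketched with a list, an alias, and a copy. The variable names here are illustrative only.

```python
a = [1, 2, 3]
b = a          # a second name bound to the same list object
c = list(a)    # a new list object with equal contents

print(a == c)          # True: same value
print(a is c)          # False: different objects
print(a is b)          # True: same object
print(id(a) == id(b))  # True: identical identity

# Mutating through one name is visible through the alias, not the copy.
b.append(4)
print(a)  # [1, 2, 3, 4]
print(c)  # [1, 2, 3]
```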
Integer Caching and String Interning
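CPython caches small integers from -5 to 256 at startup. Every variable assigned one of these values points to the same pre-created object. This saves memory and speeds up common operations. CPython also interns some strings, such as identifier-like literals, so equal strings may share a single object. Both behaviors are implementation details, so compare values with == rather than is.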
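A small sketch of both behaviors on CPython. The values are built at runtime with int() and str.join() so that the compiler cannot merge equal constants, and sys.intern() is used to request interning explicitly. Results for uncached values are implementation details and may differ on other interpreters.

```python
import sys

# Integers inside the cached range share one object on CPython.
small_a = int("256")
small_b = int("256")
print(small_a is small_b)  # True on CPython

# Integers outside the range are usually separate objects.
big_a = int("257")
big_b = int("257")
print(big_a == big_b)  # True: equal values
print(big_a is big_b)  # typically False on CPython

# Strings built at runtime are equal but usually distinct objects.
s1 = "".join(["python", " internals"])
s2 = "".join(["python", " internals"])
print(s1 == s2)  # True

# sys.intern() returns the single shared copy for equal strings.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)  # True
```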
Reference Counting and Garbage Collection
CPython uses reference counting as its primary memory management strategy. Every object has a counter tracking how many references point to it, whether from variables, container elements, or attributes. When the count drops to zero, the object's memory is freed immediately.
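The counter can be observed with sys.getrefcount(), and the immediate cleanup can be observed with a weak reference. This is a sketch for CPython; Payload is a hypothetical class introduced only for the demonstration, and the code compares counts relative to a baseline because absolute values vary by version.

```python
import sys
import weakref


class Payload:
    """Hypothetical class, used because its instances support weak references."""


data = Payload()
base = sys.getrefcount(data)

alias = data  # one more reference to the same object
with_alias = sys.getrefcount(data)

del alias  # drop the extra reference again
after_del = sys.getrefcount(data)

print(with_alias - base)  # 1
print(after_del - base)   # 0

# A weak reference does not keep the object alive.
probe = weakref.ref(data)
del data                  # the last strong reference goes away
print(probe() is None)    # True on CPython: freed immediately
```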
Reference counting can't handle circular references -- two objects that reference each other. That's where Python's garbage collector steps in. It periodically scans for reference cycles and cleans them up.
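A sketch of a reference cycle that only the garbage collector can reclaim. Node is a hypothetical class; automatic collection is switched off temporarily so the before and after states are deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical class whose instances can point at each other."""

    def __init__(self):
        self.partner = None


gc.disable()  # keep the automatic collector out of the way for the demo

a = Node()
b = Node()
a.partner = b
b.partner = a  # a and b now form a reference cycle

probe = weakref.ref(a)  # lets us observe a without keeping it alive
del a, b                # no names left, but the cycle keeps both counts above zero

alive_before = probe() is not None
gc.collect()            # an explicit collection still works while disabled
alive_after = probe() is not None

gc.enable()

print(alive_before)  # True: reference counting alone could not free the cycle
print(alive_after)   # False: the cycle detector reclaimed it
```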
The Global Interpreter Lock (GIL)
The GIL is a mutex that allows only one thread to execute Python bytecode at a time. Even on a 16-core machine, only one thread runs Python code at any given moment. This simplifies CPython's memory management but limits true parallelism for CPU-bound tasks.
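One way to see the effect is to time a CPU-bound loop run twice sequentially and then in two threads. This is a rough sketch: count_down and N are illustrative choices, and timings depend on the machine and interpreter build. On a standard CPython build with the GIL, the threaded run is usually no faster than the sequential one.

```python
import threading
import time


def count_down(n):
    """Pure-Python, CPU-bound busy loop."""
    while n > 0:
        n -= 1


N = 2_000_000  # illustrative workload size

# Run the work twice, one after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two pieces of work in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded = time.perf_counter() - start

print(f"sequential:  {sequential:.3f}s")
print(f"two threads: {threaded:.3f}s")
```

For CPU-bound work on a GIL build, the multiprocessing module is the usual way to use several cores, while threads remain useful for I/O-bound tasks.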
Note: Python 3.13 introduced an experimental free-threaded build (PEP 703) that can run with the GIL disabled. This is a major ongoing change in the Python ecosystem.
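To check which kind of interpreter is running, the following sketch may help. It relies on sys._is_gil_enabled(), which exists only on Python 3.13 and later, and on the Py_GIL_DISABLED build variable; on older versions the code assumes the GIL is present.

```python
import sys
import sysconfig

# Truthy only on free-threaded builds; missing or 0 elsewhere.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() was added in Python 3.13.
# Assumption: on older versions, the GIL is always enabled.
gil_check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = gil_check() if gil_check is not None else True

print(f"free-threaded build: {free_threaded_build}")
print(f"GIL enabled now:     {gil_enabled}")
```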
__dict__ vs __slots__: Memory Layout
By default, each instance of a user-defined class stores its attributes in a dictionary (__dict__). This is flexible but uses more memory. __slots__ replaces the dictionary with a fixed set of attribute slots, saving memory when you have many instances.
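To put rough numbers on the memory claim, this sketch uses tracemalloc to compare many instances of two hypothetical classes, WithDict and WithSlots. The helper name measure and the instance count are illustrative choices, and the exact byte counts vary by CPython version.

```python
import tracemalloc


class WithDict:
    """Hypothetical class using the default per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class WithSlots:
    """Hypothetical class using fixed attribute slots."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def measure(cls, n=50_000):
    """Return the bytes allocated while creating n instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    instances = [cls(1, 2) for _ in range(n)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


dict_bytes = measure(WithDict)
slots_bytes = measure(WithSlots)

print(f"with __dict__:  {dict_bytes:,} bytes")
print(f"with __slots__: {slots_bytes:,} bytes")
```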
class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointDict(1, 2)
print(p.__dict__)  # {'x': 1, 'y': 2}
p.z = 3  # Allowed! Dynamic attribute

class PointSlots:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointSlots(1, 2)
# p.__dict__  # AttributeError!
# p.z = 3  # AttributeError!

Practice Exercises
Write a function called multiply(a, b) that returns a * b. Then use dis.dis() to disassemble it. After disassembling, print Done on a new line.
What does the following code print?
a = 100
b = 100
print(a is b)
x = [1, 2]
y = [1, 2]
print(x is y)
Write code that:
1. Creates a list data = [10, 20, 30]
2. Prints the reference count of data using sys.getrefcount()
3. Creates alias = data (a second reference)
4. Prints the reference count of data again
5. Deletes alias with del alias
6. Prints the reference count of data one more time
Note: sys.getrefcount() itself holds a temporary reference to its argument, so the count is typically 1 higher than you might expect.
Create two classes:
1. PersonDict with __init__(self, name, age) that sets self.name and self.age (uses default __dict__)
2. PersonSlots with __slots__ = ['name', 'age'] and the same __init__
Create one instance of each. Print whether each has a __dict__ attribute using hasattr(). Expected output:
PersonDict has __dict__: True
PersonSlots has __dict__: False
The code below uses is to compare values, which is unreliable. Fix it to use proper value comparison so that all three comparisons print True.