Python Copy vs Deepcopy: Why Your "Copy" Isn't Really a Copy
Imagine you have a folder full of photos. You duplicate the folder and give the duplicate to a friend. But here's the catch: the photos inside the new folder aren't copies. They're the same physical photos. If your friend draws a mustache on one, your original photo is ruined too.
This is how Python's shallow copying works. You think you made an independent copy, but the inner objects are still shared. It's a common source of bugs in Python, and it trips up beginners and experienced developers alike.
In this tutorial, you'll learn the difference between assignment, shallow copy, and deep copy. By the end, you'll know exactly when to use each one and never get bitten by a "shared reference" bug again.
Why Does Changing a "Copy" Change the Original?
Let's start with the most common surprise. When you write b = a, you are NOT making a copy. You're creating a second name for the exact same object. Both a and b point to the same data in memory.
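A minimal sketch of this aliasing behavior, using the names a and b from the paragraph above (the list contents are just an illustrative assumption):

```python
# Assignment binds a second name to the same list; no new list is created.
a = [1, 2, 3]
b = a

b.append(4)  # mutate the list through the name b

print(a)  # [1, 2, 3, 4]
print(b)  # [1, 2, 3, 4]
```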
Think of a and b as two labels stuck on the same box. When you put something new in the box through label b, you see it when you look through label a too.
You can verify that two variables point to the same object using the is operator or by checking their id().
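As a quick check, the sketch below compares an alias with a separately built list. The variable c is an illustrative assumption, added only to show the contrast:

```python
a = [1, 2, 3]
b = a        # alias: second name for the same list
c = list(a)  # separate list with equal contents

print(a is b)          # True: same object
print(id(a) == id(b))  # True: same identity
print(a is c)          # False: different objects
print(a == c)          # True: equal contents
```

Note that == compares contents while is compares identity, so two lists can be equal without being the same object.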
What Is a Shallow Copy?
A shallow copy creates a new outer container but does NOT copy the objects inside it. The new list is independent at the top level, but any nested objects are still shared with the original.
There are several ways to make a shallow copy of a list.
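The common options can be sketched as follows. The variable names are illustrative assumptions; all four calls are standard Python:

```python
import copy

original = [1, 2, 3]

via_method = original.copy()       # list.copy() method
via_constructor = list(original)   # list() constructor
via_slice = original[:]            # full slice
via_module = copy.copy(original)   # copy.copy() from the standard library

via_method.append(4)  # only the copy grows

print(original)    # [1, 2, 3]
print(via_method)  # [1, 2, 3, 4]
```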
For dictionaries, use .copy() or dict().
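A small sketch with a flat dictionary, where the settings dictionary and its keys are assumed purely for illustration:

```python
settings = {"theme": "dark", "font_size": 12}

via_method = settings.copy()      # dict.copy() method
via_constructor = dict(settings)  # dict() constructor

via_method["theme"] = "light"  # rebinding a key affects only the copy

print(settings)    # {'theme': 'dark', 'font_size': 12}
print(via_method)  # {'theme': 'light', 'font_size': 12}
```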
For flat data, this works perfectly. But here's the trap: a shallow copy only copies one level deep. If your list contains other lists (or dicts), those inner objects are still shared.
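original = [[1, 2], [3, 4]]
shallow = original.copy()  # new outer list, same inner lists
shallow[0].append(99)      # mutates the inner list shared with original
print(original)  # [[1, 2, 99], [3, 4]]

import copy

original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)  # inner lists are copied too
deep[0].append(99)              # only the copy's inner list changes
print(original)  # [[1, 2], [3, 4]]

What Is a Deep Copy and When Do You Need It?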
A deep copy creates a fully independent clone. It copies the outer object and recursively copies every object inside it, no matter how deeply nested, so a change made through the copy never shows up in the original.
To make a deep copy, you need the copy module from Python's standard library.
Deep copy works with dictionaries too, including deeply nested structures.
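A minimal sketch with a nested dictionary. The config structure and its keys are assumptions chosen only to show several levels of nesting:

```python
import copy

config = {"users": {"alice": {"roles": ["admin"]}}}
clone = copy.deepcopy(config)

# Modify a list three levels down in the clone.
clone["users"]["alice"]["roles"].append("editor")

print(config["users"]["alice"]["roles"])  # ['admin']
print(clone["users"]["alice"]["roles"])   # ['admin', 'editor']
```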
Under the Hood: How Python References Work
To truly understand copying, you need to understand how Python stores data. Every piece of data lives somewhere in memory as an object. Variables are just names that reference (point to) those objects.
Assignment copies the reference. Shallow copy creates a new container with the same references inside. Deep copy creates a new container with new references pointing to new copies of everything inside.
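One way to see these three behaviors side by side is to compare object identities at the outer and inner level. This is a sketch; the variable names are illustrative assumptions:

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                # assignment: same outer object
shallow = copy.copy(original)   # new outer list, shared inner lists
deep = copy.deepcopy(original)  # new outer list, new inner lists

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[0] is original[0])  # True
print(deep is original)           # False
print(deep[0] is original[0])     # False
```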
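Here's a summary of the three approaches.
Assignment (b = a): no new container; both names refer to the same object.
Shallow copy (a.copy() or copy.copy(a)): new outer container; the objects inside are shared with the original.
Deep copy (copy.deepcopy(a)): new outer container; the objects inside are copied recursively.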
When Should You Use Each Type of Copy?
Use assignment (b = a) when you intentionally want two names for the same data. The same thing happens when you pass data to a function: the parameter becomes another name for the caller's object, which is usually what you want when the function should work with the original.
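The function case can be sketched as follows. Here add_item and my_cart are hypothetical names invented for this illustration:

```python
def add_item(cart, item):
    """Append item to the caller's list in place (hypothetical helper)."""
    cart.append(item)  # cart is another name for the caller's list


my_cart = ["apple"]
add_item(my_cart, "bread")

print(my_cart)  # ['apple', 'bread']
```

If the caller's list should stay untouched, one option is for the function to work on a copy instead, for example by starting with cart = list(cart).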
Use shallow copy when you have a flat list or dictionary (no nested mutable objects) and need an independent copy. This covers most everyday cases.
Use deep copy when you have nested mutable objects (lists of lists, dicts with list values, etc.) and need a fully independent clone. This is essential when you want to modify the copy without any risk of affecting the original.
Practice Exercises
Read the code carefully. What will it print? Remember that = with lists creates an alias, not a copy.
Type the exact output.
The code below tries to create a backup of the original list before modifying it. But the backup changes too! Fix the code so the backup stays unchanged.
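The exercise's starting code is not reproduced here. A plausible version of the buggy code, assuming the variable names original and backup and inferring the values from the expected output, might look like this:

```python
# Assumed reconstruction of the buggy starting point, not the original exercise code.
original = [1, 2, 3]
backup = original  # bug: this creates an alias, not a backup

original.append(4)

print("Original:", original)  # Original: [1, 2, 3, 4]
print("Backup:", backup)      # Backup: [1, 2, 3, 4]  (the backup changed too)
```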
Expected output:
Original: [1, 2, 3, 4]
Backup: [1, 2, 3]
This code uses a shallow copy on nested data. Read it carefully and predict what will be printed.
Type the exact output.
Create a fully independent copy of the original data using deep copy. Then add the score 100 to the copy's scores list. Print the original scores and the copy's scores on separate lines to prove they are independent.
This game saves a checkpoint of the player's state, then the player picks up an item. But when we load the checkpoint, the item is already there! Fix the code so the checkpoint is a truly independent snapshot.
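The exercise's starting code is not reproduced here. As a sketch of the kind of bug being described, the version below assumes the player state is a dictionary with an inventory list and that the checkpoint is taken with a shallow copy. The names player and checkpoint and the "name" key are assumptions:

```python
# Assumed reconstruction of the buggy starting point, not the original exercise code.
player = {"name": "Hero", "inventory": ["sword", "shield"]}
checkpoint = player.copy()  # bug: shallow copy shares the inventory list

player["inventory"].append("potion")  # player picks up an item

# Both lines show the potion, because both dicts hold the same list.
print("Checkpoint inventory:", checkpoint["inventory"])
print("Current inventory:", player["inventory"])
```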
Expected output:
Checkpoint inventory: ['sword', 'shield']
Current inventory: ['sword', 'shield', 'potion']