Python Threading vs Multiprocessing vs asyncio: When to Use What

Advanced · 30 min · 5 exercises · 100 XP

Most programs you've written so far do one thing at a time. But real applications need to handle multiple tasks: downloading files while updating a progress bar, serving web requests from many users, or crunching numbers across all CPU cores.

Python gives you three tools for this: threading (concurrent tasks sharing memory), multiprocessing (parallel tasks in separate processes), and asyncio (cooperative multitasking for I/O). Each solves different problems, and choosing wrong means either bugs or wasted performance.

What Is the Difference Between Concurrency and Parallelism?

These two terms are often confused, but they mean different things. Concurrency is about dealing with multiple tasks at once — they might take turns on a single CPU core. Parallelism is about doing multiple tasks at once — they actually run simultaneously on different cores.

Analogy: A single chef juggling three pots on the stove is concurrency. Three chefs each working on their own dish is parallelism.

Sequential vs concurrent
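The contrast can be sketched in runnable code. This is a minimal sketch: the wait_for_io helper and the 0.1-second waits are invented for illustration, and it borrows threading.Thread, which the next section covers, purely to show the timing difference:

```python
import threading
import time

def wait_for_io(seconds):
    # time.sleep stands in for any I/O wait (network, disk)
    time.sleep(seconds)

# Sequential: total time is the SUM of the waits (~0.3s)
start = time.perf_counter()
for _ in range(3):
    wait_for_io(0.1)
sequential = time.perf_counter() - start

# Concurrent: three threads wait at the same time (~0.1s total)
start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io, args=(0.1,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
concurrent = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

Note that this runs on a single core: the threads overlap their waiting, not their computation. That is concurrency, not parallelism.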

How Does Threading Work in Python?

A thread is a lightweight unit of execution within a process. Multiple threads share the same memory space, which makes communication easy but introduces the risk of data races.

Thread basics (simulated)

In real Python, threading.Thread creates an actual OS thread that runs concurrently. The start() method begins execution, and join() waits for the thread to finish. In our browser environment, we simulate this pattern.
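Outside the simulated editor, the pattern looks like this. A minimal sketch, with the download function and file names invented for illustration:

```python
import threading
import time

finished = []

def download(name):
    # Simulated I/O-bound work; a real version might call requests.get
    time.sleep(0.1)
    finished.append(name)
    print(f"{name}: done")

# target is the function to run, args its positional arguments
t1 = threading.Thread(target=download, args=("file-1",))
t2 = threading.Thread(target=download, args=("file-2",))

t1.start()  # both threads now run concurrently
t2.start()
t1.join()   # wait for each thread to finish before moving on
t2.join()
print("all downloads finished")
```

Because both threads append to the same finished list, this also shows the shared-memory side of threading that the paragraph above describes.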

Why Do You Need Locks?

When multiple threads access shared data, things can go wrong. Imagine two threads both reading a counter at 5, both adding 1, and both writing 6. The counter should be 7 but it's 6 — this is a race condition.

Locks prevent race conditions

A Lock (mutex) ensures only one thread can access a block of code at a time. The with self.lock: statement acquires the lock, executes the code, and releases it automatically — even if an exception occurs.
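A minimal sketch of the same pattern at module level (counter and increment_many are invented for illustration):

```python
import threading

counter = 0
lock = threading.Lock()

def increment_many(times):
    global counter
    for _ in range(times):
        # The read-modify-write must happen as one atomic step;
        # the lock is released automatically when the block exits
        with lock:
            counter += 1

threads = [threading.Thread(target=increment_many, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 40000; without the lock, updates could be lost
```

Without the lock, two threads can interleave between the read and the write of `counter += 1`, losing increments exactly as the race condition above describes.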

Threading vs Multiprocessing vs asyncio: Decision Guide

Choosing the right concurrency tool depends on your workload type. Here's the decision tree professional Python developers use:

Concurrency decision guide
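The CPU-bound branch of the guide points to multiprocessing. A minimal runnable sketch, with cpu_task and its inputs invented for illustration:

```python
import multiprocessing

def cpu_task(n):
    # CPU-bound work: no I/O, just arithmetic
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Each worker is a separate process with its own interpreter and
    # memory, so the tasks can run in parallel on different cores
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(cpu_task, [100, 200, 300, 400])
    print(results)  # in input order, like the built-in map()
```

The `if __name__ == "__main__":` guard matters here: on platforms that spawn child processes by re-importing the module, omitting it can fork workers recursively.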

What Is asyncio and How Does It Work?

asyncio uses cooperative multitasking: instead of the OS switching between threads, your code explicitly says "I'm waiting, let someone else run" using the await keyword. This is more efficient than threading for I/O-heavy workloads because there's no OS thread-switching overhead, and a single event loop can juggle thousands of coroutines that would be too expensive as threads.

asyncio concepts (simulated)

In real Python, you'd write async def fetch_users() and use await for I/O operations. asyncio.gather() runs multiple coroutines concurrently. The event loop switches between them whenever one hits an await.
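Here is a runnable sketch of that pattern, using asyncio.sleep in place of real network I/O (fetch_user and its return value are invented for illustration):

```python
import asyncio

async def fetch_user(user_id):
    # asyncio.sleep stands in for a real network call (e.g., via aiohttp)
    await asyncio.sleep(0.1)
    return {"id": user_id, "name": f"user-{user_id}"}

async def main():
    # All three coroutines wait at the same time: ~0.1s total, not 0.3s
    return await asyncio.gather(
        fetch_user(1), fetch_user(2), fetch_user(3)
    )

users = asyncio.run(main())
print([u["id"] for u in users])  # gather preserves argument order
```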

Synchronous code
import requests

def get_pages(urls):
    results = []
    for url in urls:
        r = requests.get(url)  # Blocks!
        results.append(r.text)
    return results
# Total time: sum of all requests
Async code (conceptual)
import aiohttp, asyncio

async def fetch(s, url):
    async with s.get(url) as r:
        return await r.text()

async def get_pages(urls):
    async with aiohttp.ClientSession() as s:
        return await asyncio.gather(*(fetch(s, u) for u in urls))
# Total time: longest single request

What Is ThreadPoolExecutor?

Creating and managing threads manually is error-prone. ThreadPoolExecutor from concurrent.futures provides a high-level API: submit tasks, get results, and the pool manages the threads for you.

ThreadPoolExecutor for clean concurrent code

The with statement ensures threads are properly cleaned up. executor.map() works just like the built-in map() but runs tasks across a pool of threads. Results come back in the same order as the inputs.
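A minimal sketch of that API, with fetch_page and the URLs invented for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_page(url):
    # Simulated network call; a real version might use requests.get(url)
    time.sleep(0.1)
    return f"<html from {url}>"

urls = ["a.example", "b.example", "c.example"]

# The pool creates the threads, hands out tasks, and cleans up for us
with ThreadPoolExecutor(max_workers=3) as executor:
    pages = list(executor.map(fetch_page, urls))

print(pages[0])  # results come back in input order, whatever finishes first
```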


Practice Exercises

Concurrency Decision Maker
Write Code

Write a function classify_task(description) that takes a task description string and returns the recommended concurrency tool.

Rules:

  • If the description contains "download", "request", "api", or "file" (case-insensitive), return "threading"
  • If the description contains "calculate", "process", "compute", or "analyze" (case-insensitive), return "multiprocessing"
  • If the description contains "server", "websocket", or "connections" (case-insensitive), return "asyncio"
  • Otherwise, return "sequential"

Test with these descriptions:

  • "Download 100 files from S3"
  • "Calculate prime numbers"
  • "Handle 5000 WebSocket connections"
  • "Print hello world"
Build a Thread-Safe Counter
Write Code

Create a SafeCounter class with:

  • An __init__ that sets self.value = 0 and creates a threading.Lock()
  • An increment() method that uses the lock to safely add 1 to value
  • A get() method that returns the current value

Create a counter, call increment() 100 times in a loop, and print the final value.
Predict the Output: ThreadPoolExecutor
Predict Output

What does this code print? Note that executor.map preserves the order of inputs.

from concurrent.futures import ThreadPoolExecutor

def double(x):
    return x * 2

numbers = [5, 10, 15, 20]

with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(double, numbers))

print(results)
print(sum(results))
Fix the Bug: Missing Lock
Fix the Bug

This SharedList class is meant to be thread-safe, but the add() method doesn't use the lock. Fix the code so that add() and get_all() both use the lock properly.

After the fix, add three items and print the list.
Build a Task Queue
Write Code

Build a TaskQueue class that simulates a concurrent task processor:

1. __init__ creates an empty list self.tasks and an empty list self.results
2. add_task(name, func, arg) appends a tuple (name, func, arg) to tasks
3. run_all() processes each task by calling func(arg), storing (name, result) in results
4. get_results() returns the results list

Create a queue, add three tasks:

  • ("square", lambda x: x**2, 5)
  • ("double", lambda x: x*2, 7)
  • ("negate", lambda x: -x, 3)

Run all tasks and print each result as "{name}: {result}".