Multiprocessing vs Multithreading in Python: Complete Guide
Concurrency and parallelism are key techniques for improving application performance. Python's standard library offers two main approaches: multithreading (the threading module) and multiprocessing (the multiprocessing module). In this tutorial, we'll look at how they differ, when to use each, and how to implement them effectively.
Core Concepts
Threading
A thread is the smallest unit of execution within a process. Multiple threads in one process share the same memory space.
Threading Characteristics:
- Lightweight
- Shared memory space
- Faster context switching
- Suitable for I/O-bound tasks
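To make the shared-memory point concrete, here is a minimal sketch in which several threads append to one list. The names `results`, `results_lock`, and `record` are illustrative and not part of any library; the lock is included because shared state generally needs explicit synchronization.

```python
import threading

results = []                      # one list, visible to every thread
results_lock = threading.Lock()   # guards access to the shared list


def record(worker_id):
    """Append this worker's id to the shared list."""
    with results_lock:
        results.append(worker_id)


threads = [threading.Thread(target=record, args=(i,)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# All four writes landed in the same list object.
print(sorted(results))
```

Because every thread sees the same `results` object, no data has to be copied or sent between them, which is part of what makes threads lightweight.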
Multiprocessing
A process is an independent instance of a running program. Each process has its own separate memory space.
Multiprocessing Characteristics:
- Heavyweight
- Separate memory space
- True parallelism
- Suitable for CPU-bound tasks
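The separate memory space can be illustrated with a small sketch. It assumes a module-level dictionary named `counter` and a helper named `increment_and_report` (both hypothetical names chosen for this example): the child process changes its own copy and reports the value back through a queue, while the parent's copy stays untouched.

```python
import multiprocessing

# Module-level state; each process works on its own copy of it.
counter = {"value": 0}


def increment_and_report(queue):
    """Runs in the child: change the child's copy and send the result back."""
    counter["value"] += 1
    queue.put(counter["value"])


def main():
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=increment_and_report, args=(queue,))
    child.start()
    child_value = queue.get()   # value as seen inside the child
    child.join()
    print(f"Child saw:  {child_value}")
    print(f"Parent has: {counter['value']}")


if __name__ == "__main__":
    main()
```

Run as a script, the child should report 1 while the parent still holds 0. The `if __name__ == "__main__":` guard matters on platforms that start child processes by re-importing the main module.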
Global Interpreter Lock (GIL)
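What is the GIL?
The GIL is a mutex (mutual exclusion lock) in CPython, the reference Python interpreter, that protects access to Python objects. Only the thread holding the GIL can execute Python bytecode, so the threads of one process take turns instead of running bytecode simultaneously.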
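As a small aside, the interpreter exposes how often it asks a running thread to hand over control. The sketch below only reads that setting through `sys.getswitchinterval()`; the variable name `interval` is an arbitrary choice for this example.

```python
import sys

# Interval (in seconds) after which a running thread is asked to
# give other threads a chance to run.
interval = sys.getswitchinterval()
print(f"Switch interval: {interval} s")
```

A short interval means threads alternate frequently, but alternating is not the same as running in parallel.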
GIL Implications:

import threading
import time

# CPU-bound task
def cpuintensive():
    count = 0
    for i in range(50000000):
        count += 1
    return count

# Single thread: run the task twice, one call after the other
start = time.time()
cpuintensive()
cpuintensive()
end = time.time()
print(f"Single thread: {end - start:.2f}s")

# Multi-threading (NO speedup due to GIL!)
start = time.time()
t1 = threading.Thread(target=cpuintensive)
t2 = threading.Thread(target=cpuintensive)
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()
print(f"Multi-threading: {end - start:.2f}s")
Output (sample run; exact timings vary by machine):
Single thread: 3.45s
Multi-threading: 3.52s # No speedup!
Conclusion: In CPython, threading does NOT speed up CPU-bound pure-Python code, because the GIL lets only one thread execute bytecode at a time.
When to Use Threading
Threading is effective for I/O-bound tasks because a thread releases the GIL while it waits for I/O, letting other threads run in the meantime.
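The effect can be seen without any network access. The following sketch uses `time.sleep` as a stand-in for a blocking I/O call (sleeping also releases the GIL); `fake_io` and the 0.2-second delay are assumptions made for illustration only.

```python
import threading
import time


def fake_io(delay=0.2):
    """Stand-in for a blocking I/O call; sleeping releases the GIL."""
    time.sleep(delay)


# Four waits, one after another.
start = time.perf_counter()
for _ in range(4):
    fake_io()
sequential = time.perf_counter() - start

# The same four waits, each in its own thread.
start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded = time.perf_counter() - start

print(f"Sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The threaded run should take roughly as long as a single wait, since the four waits overlap instead of adding up.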
Example 1: Download Multiple Files
import threading
import requests
import time

urls = [
    'https://jsonplaceholder.typicode.com/posts/1',
    'https://jsonplaceholder.typicode.com/posts/2',
    'https://jsonplaceholder.typicode.com/posts/3',
    'https://jsonplaceholder.typicode.com/posts/4',
    'https://jsonplaceholder.typicode.com/posts/5',
]

def download(url):
    """Download content from URL"""
    response = requests.get(url)
    print(f"Downloaded {url}: {len(response.content)} bytes")

# Sequential (without threading)
start = time.time()
for url in urls:
    download(url)
sequentialtime = time.time() - start
print(f"\nSequential: {sequentialtime:.2f}s")

# Multi-threading: start one thread per URL, then wait for all of them
start = time.time()
threads = []
for url in urls:
    thread = threading.Thread(target=download, args=(url,))
    threads.append(thread)
    thread.start()
for thread in threads:
    thread.join()
threadedtime = time.time() - start
print(f"Multi-threading: {threadedtime:.2f}s")
print(f"Speedup: {sequentialtime/threadedtime:.2f}x")
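Managing threads by hand works, but it gives no direct way to collect return values. One possible alternative is `concurrent.futures.ThreadPoolExecutor` from the standard library. The sketch below is network-free: `fetch_length` is a hypothetical stand-in for a real download, and the `example.com` URLs are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch_length(url):
    """Hypothetical stand-in for a download: wait briefly, return a size."""
    time.sleep(0.1)   # simulates network latency
    return len(url)


urls = [f"https://example.com/item/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=5) as executor:
    # map() yields results in the same order as the input URLs.
    sizes = list(executor.map(fetch_length, urls))

for url, size in zip(urls, sizes):
    print(f"{url}: {size}")
```

The pool starts and joins the worker threads itself, and `executor.map` hands back each function's return value, so no shared list or manual `join()` loop is needed.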
When to Use Multiprocessing
Multiprocessing is effective for CPU-bound tasks because each process runs its own interpreter with its own GIL, so the processes can execute on separate CPU cores in parallel.
Example: Heavy Computation
import multiprocessing
import time
import math

def cpuintensivetask(n):
    """CPU-intensive calculation"""
    count = 0
    for i in range(n):
        # Multiply the two terms; any arithmetic-heavy expression would do here
        count += math.sqrt(i) * math.sin(i)
    return count

# Parameters
numbers = [10000000, 10000000, 10000000, 10000000]

# Sequential
start = time.time()
results = [cpuintensivetask(n) for n in numbers]
sequentialtime = time.time() - start
print(f"Sequential: {sequentialtime:.2f}s")

# Multiprocessing
start = time.time()
with multiprocessing.Pool(processes=4) as pool: