Multiprocessing vs Multithreading in Python: Complete Guide

By Ruby Abdullah · tutorial
Tags: Python, Multithreading, Multiprocessing, Concurrency, GIL, Performance

Concurrency and parallelism are important concepts in modern programming for improving application performance. Python provides two main approaches: multithreading and multiprocessing. In this tutorial, we'll learn the differences between them, when to use each, and how to implement them effectively.

Core Concepts

Threading

A thread is the smallest unit of execution within a process. Multiple threads in one process share the same memory space.

Threading Characteristics:
  • Lightweight
  • Shared memory space
  • Faster context switching than between processes
  • Suitable for I/O-bound tasks
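
To make the shared-memory point concrete, here is a minimal sketch in which several threads update one counter that lives in the process's memory. The names `counter`, `lock`, and `add_many` are illustrative and not from the original article. The lock is there because `counter += 1` is a read-modify-write and is not guaranteed to be atomic across threads.

```python
import threading

counter = 0                # shared by every thread in this process
lock = threading.Lock()    # guards the read-modify-write on counter

def add_many(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: all four threads wrote to the same variable
```

No data had to be copied or sent anywhere: every thread sees the same `counter`, which is convenient but also the reason synchronization is needed.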

Multiprocessing

A process is an independent instance of a running program. Each process has its own separate memory space.

Multiprocessing Characteristics:
  • Heavyweight (higher startup and memory cost than a thread)
  • Separate memory space
  • True parallelism across CPU cores
  • Suitable for CPU-bound tasks
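
The separate-memory point can be seen in a small sketch, assuming a hypothetical module-level list `data` and a worker function `child` (neither is from the original article). The child changes its own copy of the list and sends it back through a `multiprocessing.Queue`, while the parent's list stays untouched.

```python
import multiprocessing

data = [1, 2, 3]  # each process works on its own copy of this list

def child(queue):
    """Modify the child's copy and report it to the parent."""
    data.append(99)
    queue.put(list(data))

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=child, args=(queue,))
    proc.start()
    child_view = queue.get()   # read the result before joining the child
    proc.join()

    print("child sees: ", child_view)
    print("parent sees:", data)
```

The parent still sees `[1, 2, 3]` because the append happened in another process's memory. Anything the parent needs back has to be passed explicitly, here through the queue.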

Global Interpreter Lock (GIL)

What is the GIL?

The GIL is a mutex (mutual exclusion lock) in CPython that protects access to Python objects, preventing multiple threads from executing Python bytecode simultaneously.
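
CPython exposes one setting related to this lock: the switch interval, which is how long a running thread may hold the GIL before the interpreter asks it to yield to other waiting threads. The sketch below only inspects that value with `sys.getswitchinterval()` from the standard library; the variable name `interval` is illustrative.

```python
import sys

# How long a running thread may hold the GIL before the interpreter
# requests that it hand the lock to another waiting thread (in seconds).
interval = sys.getswitchinterval()
print(f"GIL switch interval: {interval * 1000:.1f} ms")
```

The value is typically a few milliseconds, so threads take turns quickly, but at any instant only one of them is running Python bytecode.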

GIL Implications:
    import threading
    import time

    # CPU-bound task
    def cpuintensive():
        count = 0
        for i in range(50000000):
            count += 1
        return count

    # Single thread
    start = time.time()
    cpuintensive()
    cpuintensive()
    end = time.time()
    print(f"Single thread: {end - start:.2f}s")

    # Multi-threading (NO speedup due to GIL!)
    start = time.time()
    t1 = threading.Thread(target=cpuintensive)
    t2 = threading.Thread(target=cpuintensive)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    end = time.time()
    print(f"Multi-threading: {end - start:.2f}s")

Output:

    Single thread: 3.45s
    Multi-threading: 3.52s  # No speedup!

Conclusion: In standard CPython, threading does NOT speed up CPU-bound tasks, because the GIL lets only one thread execute Python bytecode at a time.

When to Use Threading

Threading is effective for I/O-bound tasks because the GIL is released while a thread waits for I/O, so other threads can run in the meantime.

Example 1: Download Multiple Files

    import threading
    import requests
    import time

    urls = [
        'https://jsonplaceholder.typicode.com/posts/1',
        'https://jsonplaceholder.typicode.com/posts/2',
        'https://jsonplaceholder.typicode.com/posts/3',
        'https://jsonplaceholder.typicode.com/posts/4',
        'https://jsonplaceholder.typicode.com/posts/5',
    ]

    def download(url):
        """Download content from URL"""
        response = requests.get(url)
        print(f"Downloaded {url}: {len(response.content)} bytes")

    # Sequential (without threading)
    start = time.time()
    for url in urls:
        download(url)
    sequentialtime = time.time() - start
    print(f"\nSequential: {sequentialtime:.2f}s")

    # Multi-threading
    start = time.time()
    threads = []
    for url in urls:
        thread = threading.Thread(target=download, args=(url,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

    threadedtime = time.time() - start
    print(f"Multi-threading: {threadedtime:.2f}s")
    print(f"Speedup: {sequentialtime/threadedtime:.2f}x")
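
The same pattern can also be written with a thread pool, which caps the number of threads and collects return values. The sketch below is not from the original article: it uses `concurrent.futures.ThreadPoolExecutor` and replaces the network call with `time.sleep` in a hypothetical `fake_download` function, so it runs without internet access.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(item_id):
    """Stand-in for an I/O-bound call; sleeping releases the GIL."""
    time.sleep(0.2)
    return f"item-{item_id}"

item_ids = [1, 2, 3, 4, 5]

start = time.time()
with ThreadPoolExecutor(max_workers=5) as executor:
    # map() yields results in input order, even if tasks finish out of order
    results = list(executor.map(fake_download, item_ids))
elapsed = time.time() - start

print(results)
print(f"Pooled threads: {elapsed:.2f}s")
```

Run one after another, the five simulated downloads would take about one second. With five worker threads the waits overlap, so the total should be close to a single 0.2 s wait.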

When to Use Multiprocessing

Multiprocessing is effective for CPU-bound tasks because it sidesteps the GIL: each process runs its own interpreter with its own GIL, so the work can proceed on several CPU cores at once.

Example: Heavy Computation

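    import multiprocessing
    import time
    import math

    def cpuintensivetask(n):
        """CPU-intensive calculation"""
        count = 0
        for i in range(n):
            count += math.sqrt(i) * math.sin(i)
        return count

    # The guard keeps child processes from re-running this block
    # when they import the module (needed on Windows and macOS).
    if __name__ == "__main__":
        # Parameters
        numbers = [10000000, 10000000, 10000000, 10000000]

        # Sequential
        start = time.time()
        results = [cpuintensivetask(n) for n in numbers]
        sequentialtime = time.time() - start
        print(f"Sequential: {sequentialtime:.2f}s")

        # Multiprocessing
        start = time.time()
        with multiprocessing.Pool(processes=4) as pool:
            # The source listing ends at the line above; pool.map and the
            # timing below are an assumed continuation that mirrors the
            # sequential version.
            results = pool.map(cpuintensivetask, numbers)
        multiprocessingtime = time.time() - start
        print(f"Multiprocessing: {multiprocessingtime:.2f}s")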
