Multiprocessing is a technology that runs multiple processes (independent execution units) simultaneously. In Python, you can easily implement multiprocessing using the multiprocessing module.
Features of Multiprocessing
Each process has its own independent memory space
Can fully utilize CPU cores
Inter-process communication is required (using Queue or Pipe)
Tasks that fully utilize CPU (image processing, data analysis)
1.2 Difference from Multithreading
Python also has a parallel processing mechanism called “multithreading”. How do multiprocessing and multithreading differ?
Item
Multiprocessing
Multithreading
Memory sharing
No (independent processes)
Yes (within the same process)
Effect of GIL
Not affected
Affected
CPU-bound suitability
◎
✕
I/O-bound suitability
△
◎
Data exchange
Requires Queue or Pipe
Can use shared memory
What is GIL (Global Interpreter Lock)?
The standard Python interpreter (CPython) has a mechanism called the GIL, which means that even when using multithreading, only one thread can execute at a time. Therefore, if you want to fully utilize the CPU, multiprocessing is appropriate.
1.3 Simple Multiprocessing Example in Python
import multiprocessing
import time
def worker(n):
print(f"Process {n} start")
time.sleep(2)
print(f"Process {n} finished")
if __name__ == "__main__":
process_list = []
# Create 3 processes
for i in range(3):
p = multiprocessing.Process(target=worker, args=(i,))
process_list.append(p)
p.start()
# Wait until all processes finish
for p in process_list:
p.join()
print("All processes finished")
1.4 Precautions When Using Multiprocessing
1. The if __name__ == "__main__": guard is required on Windows
On Windows, if you use multiprocessing.Process() without writing if __name__ == "__main__":, an error will occur.
Incorrect code (causes error)
import multiprocessing
def worker():
print("Hello from process")
p = multiprocessing.Process(target=worker)
p.start()
This code will raise an error on Windows.
Correct code
import multiprocessing
def worker():
print("Hello from process")
if __name__ == "__main__":
p = multiprocessing.Process(target=worker)
p.start()
p.join()
Adding if __name__ == "__main__": allows it to run correctly on Windows.
1.5 Summary
What is multiprocessing? → A method to run multiple processes in parallel
Difference from multithreading → Not affected by the GIL and suitable for CPU-bound tasks
Simple example in Python → Use multiprocessing.Process()
Precautions on Windows → if __name__ == "__main__": is required
2. Practical Guide: Using the multiprocessing Module
2.1 Overview of the multiprocessing Module
multiprocessing module is the standard library in Python for process-based parallel processing. By using this module, you can fully utilize CPU cores and bypass the GIL’s limitations.
Key Features of multiprocessing
Feature
Description
Process
Create and run individual processes
Queue
Send and receive data between processes
Pipe
Exchange data between two processes
Value & Array
Use shared memory between processes
Pool
Create a pool of processes to perform parallel processing efficiently
2.2 Basic Usage of the Process Class
To create a new process in Python, use the multiprocessing.Process class.
import multiprocessing
def worker(q):
q.put("Hello from child process")
if __name__ == "__main__":
q = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(q,))
p.start()
p.join()
# Retrieve data from child process
print(q.get())
2.4 Shared Memory Using Value and Array
import multiprocessing
def worker(val, arr):
val.value = 3.14 # Change the value in shared memory
arr[0] = 42 # Change the array value
if __name__ == "__main__":
val = multiprocessing.Value('d', 0.0) # 'd' is double type
arr = multiprocessing.Array('i', [0, 1, 2]) # 'i' is integer type
p = multiprocessing.Process(target=worker, args=(val, arr))
p.start()
p.join()
print(f"val: {val.value}, arr: {arr[:]}")
2.5 Process Management Using the Pool Class
Parallel Processing Using Pool
import multiprocessing
def square(n):
return n * n
if __name__ == "__main__":
with multiprocessing.Pool(4) as pool:
results = pool.map(square, range(10))
print(results)
2.6 Summary
multiprocessing module makes parallel processing easy to implement
Create individual processes with the Process class
Using Queue or Pipe enables data sharing between processes
Value and Array provide shared memory
Using the Pool class allows efficient processing of large amounts of data
3. Advanced: Error Handling and Performance Optimization
3.1 Common Errors in multiprocessing and Solutions
Error 1: Missing if __name__ == "__main__": error on Windows
Error Message
RuntimeError: freeze_support() must be called if program is run in frozen mode
Solution
import multiprocessing
def worker():
print("Hello from process")
if __name__ == "__main__": # This is required
p = multiprocessing.Process(target=worker)
p.start()
p.join()
Error 2: PicklingError (cannot pass functions between processes)
Error Message
AttributeError: Can't pickle local object 'main..'
Solution
import multiprocessing
def square(x): # make it a global function
return x * x
if __name__ == "__main__":
with multiprocessing.Pool(4) as pool:
results = pool.map(square, range(10)) # avoid lambda
print(results)
Error 3: Deadlock (process remains stopped)
Solution
import multiprocessing
def worker(q):
q.put("data")
if __name__ == "__main__":
q = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(q,))
p.start()
print(q.get()) # receive data
p.join() # normal termination here
3.2 Performance Optimization Techniques
Optimization 1: Set the number of processes appropriately
import multiprocessing
def worker(n):
return n * n
if __name__ == "__main__":
num_workers = multiprocessing.cpu_count() # get number of CPU cores
with multiprocessing.Pool(num_workers) as pool:
results = pool.map(worker, range(100))
print(results)
Optimization 2: Use Pool.starmap()
import multiprocessing
def multiply(a, b):
return a * b
if __name__ == "__main__":
with multiprocessing.Pool(4) as pool:
results = pool.starmap(multiply, [(1, 2), (3, 4), (5, 6)])
print(results)
Optimization 3: Leverage shared memory
import multiprocessing
import ctypes
def worker(shared_array):
shared_array[0] = 99 # modify the value in shared memory
if __name__ == "__main__":
shared_array = multiprocessing.Array(ctypes.c_int, [1, 2, 3]) # create shared memory
p = multiprocessing.Process(target=worker, args=(shared_array,))
p.start()
p.join()
print(shared_array[:]) # [99, 2, 3]
3.3 Summary
multiprocessing common error avoidance methods explained
Performance optimization points:
Set the number of processes appropriately
Leverage starmap()
Speed up with shared memory
4. FAQ: Common Questions and Solutions
4.1 Which should you use in Python, multiprocessing or multithreading?
Too much data copying → Use shared memory (Value, Array)
Processing many small tasks → Try concurrent.futures.ThreadPoolExecutor
import multiprocessing
def worker(n):
return n * n
if __name__ == "__main__":
with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
results = pool.map(worker, range(100))
print(results)
4.3 How to share dictionaries and lists in multiprocessing?
Answer
Use multiprocessing.Manager().
import multiprocessing
def worker(shared_list):
shared_list.append(100) # Update shared list
if __name__ == "__main__":
with multiprocessing.Manager() as manager:
shared_list = manager.list([1, 2, 3])
p = multiprocessing.Process(target=worker, args=(shared_list,))
p.start()
p.join()
print(shared_list) # [1, 2, 3, 100]
4.4 Common errors with multiprocessing.Pool and how to address them?
concurrent.futures → Unified management of threads and processes
5.4 Conclusion
In this article, we explained “Python multiprocessing”in detail from basics to practice and advanced applications. Apply the knowledge you learned in this article to real projects and give it a try! 🚀