Table of Contents
- 1. Basics: What is Python Multiprocessing?
- 2. Practical Guide: Using the multiprocessing Module
- 3. Advanced: Error Handling and Performance Optimization
- 4. FAQ: Common Questions and Solutions
- 5. Summary and Additional Learning Resources
- 5.4 Conclusion
1. Basics: What is Python Multiprocessing?
1.1 What is Multiprocessing?
Multiprocessing is a technique for running multiple processes (independent execution units) at the same time. In Python, you can implement it with the standard multiprocessing module.
Features of Multiprocessing
- Each process has its own independent memory space
- Can make full use of multiple CPU cores
- Inter-process communication is required to exchange data (using Queue or Pipe)
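The first and third features can be seen together in a small sketch. It assumes nothing beyond the standard library; the names counter, increment_and_report, and run_demo are hypothetical and chosen only for illustration. The child changes a module-level variable, and the parent checks that its own copy is untouched.

```python
import multiprocessing

counter = 0  # module-level state; every process works on its own copy


def increment_and_report(q):
    """Runs in the child: change the global and send the child's view back."""
    global counter
    counter += 1
    q.put(counter)


def run_demo():
    """Start one child and return (value seen by child, value seen by parent)."""
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=increment_and_report, args=(q,))
    p.start()
    child_value = q.get()  # read the result before joining the child
    p.join()
    return child_value, counter


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print(f"child saw {child_value}, parent still has {parent_value}")
```

The only way the parent learns about the child's value here is through the Queue, which is why inter-process communication appears in the feature list.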
Typical Use Cases
- Compute-intensive tasks (machine learning, numerical simulation)
- Tasks that fully utilize CPU (image processing, data analysis)
1.2 Difference from Multithreading
Python also has a concurrency mechanism called “multithreading”. How do multiprocessing and multithreading differ?
| Item | Multiprocessing | Multithreading |
|---|---|---|
| Memory sharing | No (independent processes) | Yes (within the same process) |
| Effect of GIL | Not affected | Affected |
| CPU-bound suitability | Excellent | Poor |
| I/O-bound suitability | Fair | Excellent |
| Data exchange | Requires Queue or Pipe | Can use shared memory |
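The CPU-bound row of the table can be checked with a rough timing sketch. This is an illustration under assumptions, not a benchmark: it uses concurrent.futures from the standard library so the same code can drive threads and processes, and count_primes and timed_run are hypothetical helper names. On a standard CPython build with the GIL, the process version is expected to finish faster on a multi-core machine, but the actual numbers depend on the hardware.

```python
import math
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Deliberately CPU-heavy pure-Python loop: count primes below limit."""
    total = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            total += 1
    return total


def timed_run(executor_cls, jobs, workers=4):
    """Run count_primes over jobs with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as executor:
        results = list(executor.map(count_primes, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [100_000] * 4
    thread_results, thread_time = timed_run(ThreadPoolExecutor, jobs)
    process_results, process_time = timed_run(ProcessPoolExecutor, jobs)
    print(f"threads:   {thread_time:.2f}s")
    print(f"processes: {process_time:.2f}s")
    print("same results:", thread_results == process_results)
```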
What is GIL (Global Interpreter Lock)?
The standard Python interpreter (CPython) has a mechanism called the GIL, which allows only one thread to execute Python bytecode at a time, even in a multithreaded program. Therefore, if you want CPU-bound code to use multiple cores, multiprocessing is the appropriate choice.
1.3 Simple Multiprocessing Example in Python
import multiprocessing
import time

def worker(n):
    print(f"Process {n} start")
    time.sleep(2)
    print(f"Process {n} finished")

if __name__ == "__main__":
    process_list = []
    # Create 3 processes
    for i in range(3):
        p = multiprocessing.Process(target=worker, args=(i,))
        process_list.append(p)
        p.start()
    # Wait until all processes finish
    for p in process_list:
        p.join()
    print("All processes finished")
1.4 Precautions When Using Multiprocessing
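1. The if __name__ == "__main__": guard is required on Windows
On Windows, each new process starts by re-importing the main module, so calling multiprocessing.Process() outside an if __name__ == "__main__": block raises a RuntimeError. The same applies on macOS, where spawn has been the default start method since Python 3.8.
Incorrect code (causes error)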
import multiprocessing

def worker():
    print("Hello from process")

p = multiprocessing.Process(target=worker)
p.start()
This code will raise an error on Windows.
Correct code
import multiprocessing

def worker():
    print("Hello from process")

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
Adding if __name__ == "__main__": allows it to run correctly on Windows.
1.5 Summary
- What is multiprocessing? → A method to run multiple processes in parallel
- Difference from multithreading → Not affected by the GIL and suitable for CPU-bound tasks
- Simple example in Python → Use multiprocessing.Process()
- Precautions on Windows → if __name__ == "__main__": is required
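Whether the guard is strictly needed depends on the start method the platform uses. A minimal sketch for checking this, assuming only the standard multiprocessing.get_start_method() call; describe_start_method is a hypothetical helper introduced here for illustration:

```python
import multiprocessing


def describe_start_method():
    """Return the default start method and whether children re-import the main module."""
    method = multiprocessing.get_start_method()
    # With "spawn" and "forkserver", a child does not inherit the parent's
    # already-loaded main module and imports it again, so unguarded
    # top-level code would run in every child.
    reimports_main = method in ("spawn", "forkserver")
    return method, reimports_main


if __name__ == "__main__":
    method, reimports_main = describe_start_method()
    print(f"start method: {method}")
    print(f"main module re-imported in children: {reimports_main}")
```

Writing the guard unconditionally keeps the same script working regardless of which value is reported.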
2. Practical Guide: Using the multiprocessing Module
2.1 Overview of the multiprocessing Module
The multiprocessing module is Python's standard library for process-based parallelism. By using it, you can make full use of multiple CPU cores and work around the limitations of the GIL.
Key Features of multiprocessing
| Feature | Description |
|---|---|
| Process | Create and run individual processes |
| Queue | Send and receive data between processes |
| Pipe | Exchange data between two processes |
| Value & Array | Use shared memory between processes |
| Pool | Create a pool of processes to perform parallel processing efficiently |
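Of the features in the table, Pipe connects exactly two endpoints. A minimal sketch of a round trip over a Pipe, assuming the default two-way (duplex) connection; child_end and pipe_round_trip are hypothetical names used only for this illustration:

```python
import multiprocessing


def child_end(conn):
    """Receive one message, reply with its upper-case form, then close the connection."""
    request = conn.recv()
    conn.send(request.upper())
    conn.close()


def pipe_round_trip(message):
    """Send message to a child process over a Pipe and return the child's reply."""
    parent_conn, child_conn = multiprocessing.Pipe()  # duplex by default
    p = multiprocessing.Process(target=child_end, args=(child_conn,))
    p.start()
    parent_conn.send(message)
    reply = parent_conn.recv()
    p.join()
    return reply


if __name__ == "__main__":
    print(pipe_round_trip("hello"))  # HELLO
```

A Queue is usually the simpler choice when several producers or consumers are involved; a Pipe fits a fixed pair of processes.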
2.2 Basic Usage of the Process Class
To create a new process in Python, use the multiprocessing.Process class.
Creating a Basic Process
import multiprocessing
import time

def worker(n):
    print(f"Process {n} start")
    time.sleep(2)
    print(f"Process {n} finished")

if __name__ == "__main__":
    p1 = multiprocessing.Process(target=worker, args=(1,))
    p2 = multiprocessing.Process(target=worker, args=(2,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("All processes finished")
2.3 Interprocess Communication (Queue & Pipe)
Sending and Receiving Data Using Queue
import multiprocessing

def worker(q):
    q.put("Hello from child process")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    # Retrieve data from the child process before joining it
    print(q.get())
    p.join()
2.4 Shared Memory Using Value and Array
import multiprocessing

def worker(val, arr):
    val.value = 3.14  # Change the value in shared memory
    arr[0] = 42       # Change the array value

if __name__ == "__main__":
    val = multiprocessing.Value('d', 0.0)        # 'd' is double type
    arr = multiprocessing.Array('i', [0, 1, 2])  # 'i' is integer type
    p = multiprocessing.Process(target=worker, args=(val, arr))
    p.start()
    p.join()
    print(f"val: {val.value}, arr: {arr[:]}")
2.5 Process Management Using the Pool Class
Parallel Processing Using Pool
import multiprocessing

def square(n):
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = pool.map(square, range(10))
    print(results)
2.6 Summary
- The multiprocessing module makes parallel processing easy to implement
- Create individual processes with the Process class
- Queue and Pipe let processes exchange data
- Value and Array provide shared memory
- The Pool class allows efficient processing of large amounts of data
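For the last point, one option when the input is large is Pool.imap() with a chunksize, which hands items to workers in batches and yields results lazily instead of building the full result list at once. The following is a sketch under assumptions: sum_of_squares is a hypothetical helper, and the chunk size of 1,000 is an arbitrary starting value that would need tuning for a real workload.

```python
import multiprocessing


def square(n):
    return n * n


def sum_of_squares(count, chunksize=1_000):
    """Sum n*n for n in range(count), sending work to the pool in chunks."""
    with multiprocessing.Pool() as pool:
        # imap yields results in input order as they become available;
        # chunksize controls how many items are sent to a worker per task.
        return sum(pool.imap(square, range(count), chunksize=chunksize))


if __name__ == "__main__":
    print(sum_of_squares(100_000))
```

Larger chunks reduce per-task communication overhead, at the cost of coarser load balancing between workers.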

3. Advanced: Error Handling and Performance Optimization
3.1 Common Errors in multiprocessing and Solutions
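Error 1: Missing if __name__ == "__main__": guard on Windows
Error Message
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. (The full message goes on to mention freeze_support().)
Solution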
import multiprocessing

def worker():
    print("Hello from process")

if __name__ == "__main__":  # This is required
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
Error 2: PicklingError (cannot pass functions between processes)
Error Message
AttributeError: Can't pickle local object 'main.<locals>.<lambda>'
Solution
import multiprocessing

def square(x):  # make it a global function
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = pool.map(square, range(10))  # avoid lambda
    print(results)
Error 3: Deadlock (process remains stopped)
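A child process that has put data on a Queue may not exit until that data has been read, so calling p.join() before q.get() can leave the parent and child waiting on each other.
Solution
import multiprocessing

def worker(q):
    q.put("data")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # receive data first
    p.join()        # the child can now terminate normally
3.2 Performance Optimization Techniques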
Optimization 1: Set the number of processes appropriately
import multiprocessing

def worker(n):
    return n * n

if __name__ == "__main__":
    num_workers = multiprocessing.cpu_count()  # get number of CPU cores
    with multiprocessing.Pool(num_workers) as pool:
        results = pool.map(worker, range(100))
    print(results)
Optimization 2: Use Pool.starmap()
import multiprocessing

def multiply(a, b):
    return a * b

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = pool.starmap(multiply, [(1, 2), (3, 4), (5, 6)])
    print(results)
Optimization 3: Leverage shared memory
import multiprocessing
import ctypes

def worker(shared_array):
    shared_array[0] = 99  # modify the value in shared memory

if __name__ == "__main__":
    shared_array = multiprocessing.Array(ctypes.c_int, [1, 2, 3])  # create shared memory
    p = multiprocessing.Process(target=worker, args=(shared_array,))
    p.start()
    p.join()
    print(shared_array[:])  # [99, 2, 3]
3.3 Summary
- Covered how to avoid common multiprocessing errors
- Performance optimization points:
  - Set the number of processes appropriately
  - Leverage starmap()
  - Speed up with shared memory
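One caveat with shared memory is that an update such as value += 1 is a read followed by a write, so several processes updating the same Value can overwrite each other. A minimal sketch of guarding the update with the lock that Value provides through get_lock(); add_many and run_counter are hypothetical names for this illustration:

```python
import multiprocessing


def add_many(counter, times):
    """Increment the shared counter `times` times, holding its lock for each update."""
    for _ in range(times):
        with counter.get_lock():
            counter.value += 1


def run_counter(num_processes=4, times=1_000):
    """Let several processes increment one shared integer and return the final value."""
    counter = multiprocessing.Value("i", 0)
    processes = [
        multiprocessing.Process(target=add_many, args=(counter, times))
        for _ in range(num_processes)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(run_counter())  # 4000
```

Taking the lock on every increment is slow; in practice each worker would usually accumulate a local total and add it to the shared value once.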
4. FAQ: Common Questions and Solutions
4.1 Which should you use in Python, multiprocessing or multithreading?
Answer
- CPU-bound (computationally intensive tasks) → Multiprocessing (multiprocessing)
- I/O-bound (file/network operations) → Multithreading (threading)
| Type of Task | Suitable Parallelism |
|---|---|
| CPU-bound (numerical calculations, image processing, etc.) | Multiprocessing (multiprocessing) |
| I/O-bound (file or API requests, etc.) | Multithreading (threading) |
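For the I/O-bound row, a thread pool is usually enough because threads spend most of their time waiting rather than holding the GIL. The sketch below simulates the waiting with time.sleep() instead of real network calls; fake_download and download_all are hypothetical names, and the 0.2 second delay is an arbitrary stand-in.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item_id, delay=0.2):
    """Stand-in for an I/O call: wait, then return a label for the item."""
    time.sleep(delay)
    return f"item-{item_id}"


def download_all(item_ids, workers=8):
    """Run fake_download for every id in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fake_download, item_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    results = download_all(range(8))
    print(results)
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```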
4.2 Why does multiprocessing feel “slow”?
Answer
- High cost of creating processes → Use Pool
- Too much data copying → Use shared memory (Value, Array)
- Processing many small tasks → Try concurrent.futures.ThreadPoolExecutor
import multiprocessing

def worker(n):
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        results = pool.map(worker, range(100))
    print(results)
4.3 How to share dictionaries and lists in multiprocessing?
Answer
Use multiprocessing.Manager().
import multiprocessing

def worker(shared_list):
    shared_list.append(100)  # Update shared list

if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        shared_list = manager.list([1, 2, 3])
        p = multiprocessing.Process(target=worker, args=(shared_list,))
        p.start()
        p.join()
        print(shared_list)  # [1, 2, 3, 100]
4.4 Common errors with multiprocessing.Pool and how to address them?
| Error | Cause | Solution |
|---|---|---|
| AttributeError: Can't pickle local object | Passing a lambda or local function | Use a global function |
| RuntimeError (the message mentions freeze_support()) | Missing if __name__ == "__main__": on Windows | Add if __name__ == "__main__": |
| EOFError: Ran out of input | Processes did not terminate properly in Pool | Call pool.close() and pool.join() appropriately |
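The close() and join() calls from the last row, together with handling an exception raised inside a worker, can be sketched as follows. This is one possible pattern, not the only one: risky_inverse and collect_results are hypothetical names, and the 30 second timeout is an arbitrary safety margin. An exception raised in a worker is re-raised in the parent when get() is called on the corresponding result.

```python
import multiprocessing


def risky_inverse(x):
    """Raises ZeroDivisionError for x == 0, which travels back to the parent."""
    return 1 / x


def collect_results(values):
    """Submit each value to a pool and record either the result or the failure."""
    pool = multiprocessing.Pool(2)
    pending = [pool.apply_async(risky_inverse, (v,)) for v in values]
    pool.close()  # no more tasks will be submitted
    outcomes = []
    for value, job in zip(values, pending):
        try:
            outcomes.append((value, job.get(timeout=30)))
        except ZeroDivisionError as exc:
            outcomes.append((value, f"failed: {exc}"))
    pool.join()  # wait for the worker processes to exit
    return outcomes


if __name__ == "__main__":
    for value, outcome in collect_results([1, 0, 4]):
        print(value, "->", outcome)
```

Handling failures per task this way keeps one bad input from discarding the results of the others, which is what happens when pool.map() raises.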
4.5 How to debug Python multiprocessing?
Answer
Use multiprocessing.log_to_stderr().
import multiprocessing
import logging

def worker(n):
    logger = multiprocessing.get_logger()
    logger.info(f"Process {n} running")

if __name__ == "__main__":
    multiprocessing.log_to_stderr(logging.INFO)  # Enable logging
    p = multiprocessing.Process(target=worker, args=(1,))
    p.start()
    p.join()
5. Summary and Additional Learning Resources
5.1 Summary of This Article
Fundamentals of Multiprocessing
- What is multiprocessing? → A technique that runs multiple processes in parallel to maximize CPU utilization
- Difference from multithreading
  - Multiprocessing → Suited for CPU-bound tasks (numerical calculations, image processing, etc.)
  - Multithreading → Suited for I/O-bound tasks (file handling, network communication, etc.)
How to Use the multiprocessing Module
- Use the Process class to create individual processes
- Use Queue and Pipe to send and receive data between processes
- Use Value and Array for shared memory
- Use the Pool class to execute parallel processing efficiently
Error Handling and Performance Optimization
- Common errors
  - If you omit if __name__ == "__main__":, you get errors on Windows
  - lambda functions and local functions cause PicklingError
  - Forgetting to call Queue.get() leads to deadlocks
- Performance optimization
  - Use Pool to reduce the overhead of creating processes
  - Use starmap() to pass multiple arguments
  - Utilize multiprocessing.shared_memory to reduce data copy overhead
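The multiprocessing.shared_memory module named in the last item (available since Python 3.8) is separate from the Value and Array classes used earlier. A minimal sketch, assuming small byte values so that doubling stays within one byte; double_in_place and run_shared_memory_demo are hypothetical names. The child attaches to the block by name, so only the name and length are passed between processes, not the data itself.

```python
from multiprocessing import Process, shared_memory


def double_in_place(name, size):
    """Attach to an existing shared block by name and double its first `size` bytes."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        for i in range(size):
            shm.buf[i] = shm.buf[i] * 2  # values must stay within 0..255
    finally:
        shm.close()  # detach this handle; the block itself stays alive


def run_shared_memory_demo(values):
    """Create a shared block, let a child modify it, and return the modified bytes."""
    shm = shared_memory.SharedMemory(create=True, size=len(values))
    try:
        shm.buf[:len(values)] = bytes(values)
        p = Process(target=double_in_place, args=(shm.name, len(values)))
        p.start()
        p.join()
        return list(shm.buf[:len(values)])
    finally:
        shm.close()
        shm.unlink()  # the creator releases the block once everyone is done


if __name__ == "__main__":
    print(run_shared_memory_demo([1, 2, 3]))  # [2, 4, 6]
```

The creating process is responsible for calling unlink(); forgetting it can leave the block allocated after the program exits.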
5.2 Additional Learning Resources
1. Python Official Documentation
2. Online Tutorials
5.3 Looking Ahead to Future Applications
By properly leveraging Python’s multiprocessing module, you can use the CPU efficiently and create high-performance programs.
Technologies to Learn Next
- Asynchronous processing (asyncio) → Run I/O-bound tasks concurrently in a single thread
- concurrent.futures → Unified management of threads and processes
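As a first taste of asyncio, the sketch below runs two simulated I/O waits concurrently in one thread. The waits use asyncio.sleep() rather than real network calls, and fake_request and fetch_all are hypothetical names for illustration.

```python
import asyncio


async def fake_request(name, delay):
    """Stand-in for an I/O call: yield control while waiting, then return a label."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all():
    """Run both requests concurrently; gather returns results in argument order."""
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.1),
    )


if __name__ == "__main__":
    print(asyncio.run(fetch_all()))  # ['a done', 'b done']
```

Unlike multiprocessing, this uses a single process and a single thread, so it helps with waiting but not with CPU-bound work.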