Python is loved by many developers for its simple syntax and powerful libraries. Among them, “asynchronous processing” is one of the key techniques for handling tasks efficiently. This article clearly explains Python’s asynchronous processing from basics to advanced topics. By understanding asynchronous processing, you can learn how to dramatically improve the speed of web scraping and API requests.
2. Basics of Asynchronous Processing
What Is Asynchronous Processing?
Asynchronous processing is a technique that allows a program to execute other tasks concurrently while waiting for a single task to complete. For example, when scraping multiple web pages, a typical synchronous approach sends requests sequentially for each page. In contrast, using asynchronous processing lets you issue multiple requests simultaneously.
Differences Between Synchronous and Asynchronous Processing
Feature
Synchronous Processing
Asynchronous Processing
Task Execution Order
Execute tasks one at a time in order
Execute multiple tasks concurrently
Processing Wait Time
Wait time occurs
Other processing can run during that time
Use Cases
Small‑scale task processing
Scenarios requiring large amounts of I/O operations
Benefits of Asynchronous Processing
Improved Efficiency: By handling multiple tasks simultaneously, you can make effective use of wait times.
Scalability: Ideal for efficiently handling large volumes of I/O operations.
Resource Savings: Saves system resources compared to creating threads or processes.
3. Basics of Asynchronous Programming in Python
How to Implement Asynchronous Programming in Python
Python uses the async and await keywords to perform asynchronous processing. Using these two allows you to write asynchronous tasks concisely.
import asyncio
async def say_hello():
print("Hello, asynchronous processing!")
await asyncio.sleep(1)
print("1 second has passed!")
asyncio.run(say_hello())
async: Defines a function as asynchronous.
await: Pauses an asynchronous task to allow other tasks to run.
How Coroutines, Tasks, and Event Loops Work
Coroutines: The unit of execution for asynchronous tasks. Functions defined with async become coroutines.
Tasks: Wrappers that allow coroutines to be managed by the event loop.
Event loop: The Python engine that executes and schedules tasks.
4. Practical Examples of Asynchronous Processing
There are many scenarios where you can leverage asynchronous processing in Python. In this section, we will explain the following examples in detail as real-world use cases.
Web Scraping
Parallel API Request Processing
Asynchronous Database Operations
Web Scraping (using aiohttp)
In web scraping, you often send requests to many web pages to collect data. Using asynchronous processing allows you to send multiple requests concurrently, improving throughput. Below is an example of asynchronous web scraping using aiohttp.
import aiohttp
import asyncio
async def fetch_page(session, url):
async with session.get(url) as response:
print(f"Fetching: {url}")
return await response.text()
async def main():
urls = [
"https://example.com/page1",
"https://example.com/page2",
"https://example.com/page3"
]
async with aiohttp.ClientSession() as session:
tasks = [fetch_page(session, url) for url in urls]
results = await asyncio.gather(*tasks)
print("All pages fetched!")
asyncio.run(main())
Key Points:
Use aiohttp.ClientSession to achieve efficient requests.
Execute multiple tasks in parallel with asyncio.gather.
Parallel API Request Processing
Asynchronous processing is also effective when making API requests. Below is an example that sends requests to multiple API endpoints in parallel and retrieves the results.
import aiohttp
import asyncio
async def fetch_data(session, endpoint):
async with session.get(endpoint) as response:
print(f"Requesting data from: {endpoint}")
return await response.json()
async def main():
api_endpoints = [
"https://api.example.com/data1",
"https://api.example.com/data2",
"https://api.example.com/data3"
]
async with aiohttp.ClientSession() as session:
tasks = [fetch_data(session, endpoint) for endpoint in api_endpoints]
results = await asyncio.gather(*tasks)
for i, result in enumerate(results):
print(f"Data from endpoint {i + 1}: {result}")
asyncio.run(main())
Key Points:
Streamline data retrieval from multiple API endpoints.
Process response data in JSON format.
Asynchronous Database Operations (example with aiomysql)
Implementing asynchronous database operations enables fast data reads and writes. Below is an example of an asynchronous database query using aiomysql.
import aiomysql
import asyncio
async def fetch_from_db():
conn = await aiomysql.connect(
host="localhost",
port=3306,
user="root",
password="password",
db="test_db"
)
async with conn.cursor() as cursor:
await cursor.execute("SELECT * FROM users")
result = await cursor.fetchall()
print("Data from database:", result)
conn.close()
asyncio.run(fetch_from_db())
Key Points:
Execute asynchronous queries to retrieve data efficiently.
Effective also when handling multiple queries simultaneously.
5. Considerations When Using Asynchronous Programming
Asynchronous programming is a very powerful tool, but if not used properly, unexpected problems can arise. This section explains the points to watch out for when using asynchronous programming and how to avoid them.
Avoiding Deadlocks
A deadlock occurs when multiple tasks wait on each other for resources. When using asynchronous programming, you need to manage task ordering and the timing of resource acquisition properly. Example: A case where a deadlock occurs
import asyncio
lock = asyncio.Lock()
async def task1():
async with lock:
print("Task1 acquired the lock")
await asyncio.sleep(1)
print("Task1 released the lock")
async def task2():
async with lock:
print("Task2 acquired the lock")
await asyncio.sleep(1)
print("Task2 released the lock")
async def main():
await asyncio.gather(task1(), task2())
asyncio.run(main())
Deadlock avoidance strategies
Clearly identify the resources each task needs and acquire them in the same order.
Use asyncio.TimeoutError to set a timeout for resource acquisition.
Preventing Race Conditions
In asynchronous programming, when multiple tasks access the same resource, a “race condition” can occur that compromises data integrity. Example: A case where a race condition occurs
import asyncio
counter = 0
async def increment():
global counter
for _ in range(1000):
counter += 1
async def main():
await asyncio.gather(increment(), increment())
print(f"Final counter value: {counter}")
asyncio.run(main())
In the example above, the value of counter may not be as expected. How to prevent race conditions
Use locks: Use asyncio.Lock to control concurrent access to resources.
import asyncio
counter = 0
lock = asyncio.Lock()
async def increment():
global counter
async with lock:
for _ in range(1000):
counter += 1
async def main():
await asyncio.gather(increment(), increment())
print(f"Final counter value: {counter}")
asyncio.run(main())
The Importance of Error Handling
Asynchronous programming can encounter network errors, timeout errors, and other issues. Failing to handle these errors properly can cause the entire program to behave unexpectedly. Example: Implementing error handling
import asyncio
import aiohttp
async def fetch_url(session, url):
try:
async with session.get(url, timeout=5) as response:
return await response.text()
except asyncio.TimeoutError:
print(f"Timeout error while accessing {url}")
except aiohttp.ClientError as e:
print(f"HTTP error: {e}")
async def main():
urls = ["https://example.com", "https://invalid-url"]
async with aiohttp.ClientSession() as session:
tasks = [fetch_url(session, url) for url in urls]
await asyncio.gather(*tasks)
asyncio.run(main())
Error handling tips
Identify anticipated errors and write handling code accordingly.
Log exceptions to aid troubleshooting.
When Asynchronous Programming Is Inappropriate
Asynchronous programming is not effective in every scenario. In particular, it is unsuitable for the following cases.
CPU-bound tasks:
CPU-intensive operations such as image processing or training machine‑learning models are better suited to concurrent.futures or multiprocessing rather than asynchronous programming.
Small tasks:
If the overhead of initializing asynchronous processing exceeds the execution time of the task, synchronous processing is more efficient.
Resource Management and Optimization
Because asynchronous programming runs many tasks concurrently, memory and CPU usage can increase sharply. Manage resources with the following considerations in mind.
Limit the number of concurrent tasks: Use asyncio.Semaphore to restrict how many tasks run at the same time.
import asyncio
semaphore = asyncio.Semaphore(5)
async def limited_task(task_id):
async with semaphore:
print(f"Running task {task_id}")
await asyncio.sleep(1)
async def main():
tasks = [limited_task(i) for i in range(20)]
await asyncio.gather(*tasks)
asyncio.run(main())
Monitoring: Implement a system to regularly monitor the number of running tasks and memory usage.
6. Advanced Topics in Asynchronous Programming
After mastering the basics of asynchronous programming, learning its applications and comparing it with other technologies enables you to leverage async more deeply. This section explains comparisons with asynchronous technologies beyond Python and real-world use cases.
Comparing Asynchronous Technologies Outside of Python
Asynchronous programming is widely used in languages other than Python as well. Here we compare popular technologies with Python and examine their characteristics.
Node.js
Node.js is a JavaScript runtime environment that excels at asynchronous processing, handling async I/O operations efficiently.
Feature
Python
Node.js
Use Cases
Data analysis, AI, web development
Web servers, real-time applications
How Asynchrony Is Implemented
asyncio module, async/await
Callbacks, Promise, async/await
Performance (I/O)
High, but slightly lower than Node.js
Optimized for asynchronous I/O
Learning Curve
Moderately high
Relatively low
Go
Go (Golang) implements asynchronous processing using lightweight threads called goroutines.
Feature
Python
Go
Use Cases
General-purpose programming
Servers, cloud development
How Asynchrony Is Implemented
asyncio module, async/await
Goroutines, channels
Performance (parallelism)
High, but async is not suitable for CPU-bound tasks
Excels in parallel processing performance
Learning Curve
Moderate
Relatively low
Python’s Advantages and Application Scope
Versatility: Python can be used for a wide range of purposes, not only web development but also data analysis, machine learning, and more.
Application Scenarios for Asynchronous Programming
By leveraging asynchronous programming, you can build efficient programs in scenarios such as the following.
Server-Side Development
Using async programming allows you to efficiently build high-load server applications. For example, FastAPI is a Python web framework designed around asynchronous I/O, offering the following benefits.
Fast API responses: Achieves high concurrency, handling many requests efficiently.
Concise async code: Can be written simply using async/await.
In a microservice architecture, multiple small services work together. Using async processing provides the following advantages.
Efficient inter-service communication: Achieves low latency using async HTTP requests and message queues.
Improved scalability: Resource management per service becomes more flexible.
Real-Time Systems
In real-time systems such as chat apps or online games, async processing enables smooth data updates. For example, you can build asynchronous WebSocket communication using the websockets library.
To deepen your understanding of async programming, consider studying the following resources and topics.
Advanced async patterns:
Implementing task cancellation and timeouts.
Low-level asyncio APIs (e.g., Future and custom event loops).
Utilizing libraries:
Libraries for async I/O (e.g., aiohttp, aiomysql, asyncpg).
Async web frameworks (e.g., FastAPI, Sanic).
Combining with distributed processing:
Combining async processing with distributed computing enables building even more scalable systems.
7. Summary
We have covered Python asynchronous processing extensively, from fundamentals to advanced topics. In this section, we recap the material and summarize key points for effectively leveraging async processing. We also suggest the next steps to learn.
Overview of Asynchronous Processing
Asynchronous processing is a technique for efficiently executing multiple tasks concurrently. It is especially useful in I/O‑bound scenarios and has the following characteristics.
Efficient task handling: Utilize waiting time for other tasks.
Improved scalability: Handle a large number of requests efficiently.
Key Points Covered in This Article
Fundamentals of Asynchronous Processing
Differences between synchronous and asynchronous processing.
Basic syntax for async tasks using async and await.
Practical Examples of Asynchronous Processing
Parallelizing web scraping and API requests with async for greater efficiency.
Accelerated data handling by async-ifying database operations.
Considerations and Challenges
Design strategies to avoid deadlocks and race conditions.
Proper error handling and resource management.
Advanced Use Cases
Comparison with other async technologies (Node.js, Go, etc.).
Examples of applying async on the server side and in real‑time applications.
Next Steps for Learning Asynchronous Processing
To deepen your understanding of async processing, we recommend the following additional studies.
Leveraging Libraries
Hands‑on work with async libraries such as aiohttp, aiomysql, and asyncpg.
Web application development using async web frameworks (e.g., FastAPI, Sanic).
Advanced Design Patterns
Task cancellation, exception handling, and use of async queues.
Low‑level designs that leverage custom event loops in asyncio.
Building Practical Projects
Create small async programs to verify behavior.
Tackle projects that solve real‑world problems (e.g., speeding up APIs, real‑time communication).
8. FAQ
Finally, we summarize the frequently asked questions about Python asynchronous programming and their answers.
Q1: What is the difference between asynchronous programming and multithreading?
Answer:
Asynchronous programming runs multiple tasks efficiently by switching between them within a single thread. In contrast, multithreading uses multiple threads to execute tasks concurrently. Asynchronous programming is suited for I/O‑bound tasks, while multithreading is appropriate for CPU‑bound tasks.
Q2: Are there any resources suitable for learning asynchronous programming?
Answer:
The following resources are recommended.
Python official documentation: the asyncio section.
Books focused on asynchronous programming (e.g., “Python Concurrency with Asyncio”).
Online tutorials (e.g., Real Python, practical videos on YouTube).
Q3: In what situations should asynchronous programming be used?
Answer:
Asynchronous programming is effective in the following scenarios.
When handling a large number of web requests (e.g., web scraping).
Applications that require real‑time communication (e.g., chat apps).
Tasks with many I/O waits to databases or external APIs.
Q4: Is asynchronous programming suitable for CPU‑bound tasks?
Answer:
No, asynchronous programming is not suitable for CPU‑bound tasks. For such tasks, using the concurrent.futures or multiprocessing modules is more effective.