Table of Contents
- 1. Introduction
- 2. Fundamentals of Python Memory Management
- 3. How to Check Memory Usage
- 4. How to Optimize Memory Usage
- 5. Troubleshooting
- 6. Practical Example: Measuring Memory Usage in Python Scripts
- 7. Summary and Next Steps
1. Introduction
Target Audience
This article is primarily aimed at beginner to intermediate users who use Python on a daily basis. It is especially useful for those who want to check and optimize their programs’ memory usage.
Purpose of the Article
The purpose of this article is as follows:
- Understand how Python’s memory management works.
- Learn concrete methods for measuring memory usage.
- Acquire optimization techniques to reduce memory consumption.
2. Fundamentals of Python Memory Management
How Memory Management Works
In Python, memory management is performed by two primary mechanisms: reference counting and garbage collection.
Reference Counting
Reference counting tracks how many references each object has. When an object is created, its reference count starts at 1. Each time another variable references the object, the count increases; when a reference is removed, the count decreases. When the reference count reaches zero, the object is automatically freed from memory.
Code Example
import sys

a = [1, 2, 3]             # List object is created
print(sys.getrefcount(a)) # Initial count (usually 2: getrefcount itself holds a temporary reference)
b = a                     # Another variable references the same object
print(sys.getrefcount(a)) # Count increases
del b                     # Reference removed
print(sys.getrefcount(a)) # Count decreases
Garbage Collection
Garbage collection (GC) reclaims memory that reference counting alone cannot free, in particular cyclic references. Python’s built-in garbage collector runs periodically and automatically deletes unreachable objects. It specializes in detecting and freeing cyclic references, as in the following situation:
class Node:
    def __init__(self):
        self.next = None

# Example of a cyclic reference
a = Node()
b = Node()
a.next = b
b.next = a
# The reference counts in this cycle never reach zero on their own,
# so reference counting alone cannot free this memory
If you want to control the garbage collector explicitly, use the gc module.
import gc

# Force the garbage collector to run
gc.collect()
Risks of Memory Leaks
Python’s memory management is powerful, but not perfect. In particular, memory leaks can occur in situations such as:
- Cyclic references exist but the garbage collector is disabled.
- Long-running programs keep unnecessary objects alive in memory.
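The cycle risk can be observed directly. The following sketch builds a cycle like the earlier Node example, drops all external references, and asks the collector how many unreachable objects it reclaimed:

```python
import gc

class Node:
    def __init__(self):
        self.next = None

# Build a reference cycle, then drop every external reference to it
a = Node()
b = Node()
a.next = b
b.next = a
del a, b

# Reference counting alone cannot reclaim the cycle;
# gc.collect() returns the number of unreachable objects it found
unreachable = gc.collect()
print(f"Objects collected: {unreachable}")
```

The non-zero return value confirms that the cycle was freed by the collector, not by reference counting.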
Summary of This Section
- Python’s memory management operates via reference counting and garbage collection.
- Garbage collection helps especially with resolving cyclic references, but proper design is crucial to prevent unnecessary memory consumption.
- The next section will explain how to measure memory usage concretely.
3. How to Check Memory Usage
Basic Approach
Check Object Size with sys.getsizeof()
Python’s standard library sys module provides the getsizeof() function, which returns the memory size of an object in bytes.
Example Code
import sys
# Check the memory size of each object
x = 42
y = [1, 2, 3, 4, 5]
z = {"a": 1, "b": 2}
print(f"Size of x: {sys.getsizeof(x)} bytes")
print(f"Size of y: {sys.getsizeof(y)} bytes")
print(f"Size of z: {sys.getsizeof(z)} bytes")
Notes
- sys.getsizeof() returns only the size of the object itself; it does not include the sizes of objects it references (such as the elements inside a list).
- Measuring the exact memory usage of large objects requires additional tools.
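As a sketch of what such a tool must do, here is a hypothetical deep_getsizeof() helper (our own name, not part of the standard library) that recursively adds the sizes of referenced objects while guarding against double counting:

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Recursively estimate an object's total size, including referenced
    objects, while avoiding double counting. A simplified sketch that
    only descends into common built-in containers."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

nested = [[1, 2, 3], {"a": "hello"}]
print(f"Shallow size: {sys.getsizeof(nested)} bytes")
print(f"Deep size:    {deep_getsizeof(nested)} bytes")
```

The deep size is noticeably larger because it includes the inner list and dictionary that sys.getsizeof() alone ignores.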
Using Profiling Tools
Function‑Level Memory Measurement with memory_profiler
memory_profiler is an external library that measures the memory usage of Python programs in detail, function by function. It makes it easy to pinpoint how much memory specific parts of your code consume.
Setup
First, install memory_profiler:
pip install memory-profiler
Usage
By using the @profile decorator, you can measure memory consumption at the function level.
from memory_profiler import profile

@profile
def example_function():
    a = [i for i in range(10000)]
    b = {i: i**2 for i in range(1000)}
    return a, b

if __name__ == "__main__":
    example_function()
Run the script with the following command:
python -m memory_profiler your_script.py
Sample Output
Line #    Mem usage    Increment   Line Contents
------------------------------------------------
     3     13.1 MiB     13.1 MiB   @profile
     4     16.5 MiB      3.4 MiB   a = [i for i in range(10000)]
     5     17.2 MiB      0.7 MiB   b = {i: i**2 for i in range(1000)}
Monitor Overall Process Memory Usage with psutil
psutil is a powerful library that can monitor the total memory usage of a process. It is useful when you want to understand the overall memory consumption of a script or application.
Setup
Install it with the following command:
pip install psutil
Usage
import psutil
process = psutil.Process()
print(f"Total process memory usage: {process.memory_info().rss / 1024**2:.2f} MB")
Main Features
- Can retrieve the current process’s memory usage in bytes.
- Allows you to monitor program performance while gaining insights for optimization.
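The monitoring idea can be sketched as a small polling loop (the function name log_memory is our own, not part of psutil) that samples the process's resident set size a few times:

```python
import time
import psutil

def log_memory(interval=0.1, samples=3):
    """Poll the current process's RSS a few times and return the readings."""
    process = psutil.Process()
    readings = []
    for _ in range(samples):
        rss_mb = process.memory_info().rss / 1024**2
        readings.append(rss_mb)
        print(f"RSS: {rss_mb:.2f} MB")
        time.sleep(interval)
    return readings

readings = log_memory()
```

In a real application you might run such a loop in a background thread and log the readings over time.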
Detailed Memory Tracing
Trace Memory Allocations with tracemalloc
Using the Python standard library tracemalloc, you can trace the origins of memory allocations and analyze which parts of a program consume the most memory.
Usage
import tracemalloc
# Start memory tracing
tracemalloc.start()

# Memory-consuming operation
a = [i for i in range(100000)]

# Display memory usage
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")
print("[Memory Usage]")
for stat in top_stats[:5]:
    print(stat)
Main Uses
- Identify problematic memory allocations.
- Compare multiple operations to find optimization opportunities.
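Comparing operations is done by diffing two snapshots with Snapshot.compare_to(); a minimal sketch:

```python
import tracemalloc

tracemalloc.start()
snapshot1 = tracemalloc.take_snapshot()

# The allocation we want to attribute to a source line
data = [i for i in range(100000)]

snapshot2 = tracemalloc.take_snapshot()

# Diff the snapshots; the biggest difference points at the list above
top_stats = snapshot2.compare_to(snapshot1, "lineno")
for stat in top_stats[:3]:
    print(stat)
tracemalloc.stop()
```

The top entry of the diff attributes the bulk of the new memory to the line that built the list, which is exactly the kind of insight needed to find optimization opportunities.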
Summary of This Section
- To understand Python’s memory usage, options range from basic tools like sys.getsizeof() to profiling tools such as memory_profiler and psutil.
- If memory consumption is critical for your program, choose the appropriate tool and manage memory efficiently.
- The next section will discuss concrete methods for actually optimizing memory usage.
4. How to Optimize Memory Usage
Choosing Efficient Data Structures
Replacing Lists with Generators
List comprehensions are convenient, but they consume a lot of memory when handling large amounts of data. A generator instead produces values on the fly as they are needed, significantly reducing memory usage.
Code Example
import sys

# Using a list
list_data = [i**2 for i in range(1000000)]
print(f"List memory size: {sys.getsizeof(list_data) / 1024**2:.2f} MB")

# Using a generator
gen_data = (i**2 for i in range(1000000))
print(f"Generator memory size: {sys.getsizeof(gen_data) / 1024**2:.2f} MB")
Using generators can dramatically reduce memory usage.
Using collections.defaultdict as a Dictionary Alternative
Python dictionaries are convenient, but handling missing keys with get() or setdefault() adds boilerplate. Using collections.defaultdict simplifies default-value handling; note that its main benefit is code clarity rather than reduced memory consumption.
Code Example
from collections import defaultdict
# Regular dictionary
data = {}
data["key"] = data.get("key", 0) + 1

# Using defaultdict
default_data = defaultdict(int)
default_data["key"] += 1
Managing Unnecessary Objects
Explicit Deletion with the del Statement
In Python, you can manually delete references to objects you no longer need, which reduces the burden on the garbage collector.
Code Example
# Delete an unnecessary variable
a = [1, 2, 3]
del a
After deletion, the name a no longer references the list, and the object is freed once its reference count reaches zero.
Using the Garbage Collector
You can use the gc module to run the garbage collector manually, which can resolve memory buildup caused by circular references.
Code Example
import gc
# Run the garbage collector
gc.collect()
Optimization Using External Libraries
Leveraging NumPy and Pandas
NumPy and Pandas are designed for efficient memory management. Especially when handling large amounts of numeric data, using these libraries can significantly reduce memory usage.
NumPy Example
import sys
import numpy as np

# Python list
data_list = [i for i in range(1000000)]
print(f"List memory size: {sys.getsizeof(data_list) / 1024**2:.2f} MB")

# NumPy array
data_array = np.arange(1000000)
print(f"NumPy array memory size: {data_array.nbytes / 1024**2:.2f} MB")
NumPy arrays are far more memory-efficient than Python lists for numeric data.
Preventing Memory Leaks
To prevent memory leaks, keep the following points in mind:
- Avoid circular references: design objects so they do not reference each other.
- Manage scope: be mindful of function and class scopes so that unnecessary objects are not kept alive.
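Scope management can be illustrated with a small sketch (the Buffer class and build_and_release() are hypothetical names). A weak reference lets us observe whether the object outlived the function's scope:

```python
import weakref

class Buffer:
    """Stand-in for a large, memory-hungry object (hypothetical name)."""
    pass

def build_and_release():
    buf = Buffer()
    # Return only a weak reference so we can observe the object's fate
    return weakref.ref(buf)

ref = build_and_release()
# In CPython, buf's reference count drops to zero when the function returns,
# so the weak reference is already dead
print(ref() is None)  # True
```

Keeping large objects inside the narrowest scope that needs them lets reference counting free them promptly.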
Summary of This Section
- Optimizing memory usage requires selecting efficient data structures and properly deleting unnecessary objects.
- Leveraging external libraries such as NumPy and Pandas enables even more efficient memory management.
- The next section will discuss troubleshooting techniques that help solve real-world problems.
5. Troubleshooting
How to Handle Sudden Increases in Memory Usage
Adjust the Garbage Collector
If the garbage collector is not functioning properly, unnecessary memory may not be released, causing usage to spike. To resolve this, use the gc module to inspect and adjust the garbage collector.
Code Example
import gc
# Check the garbage collector's collection thresholds
print(gc.get_threshold())

# Run the garbage collector manually
gc.collect()

# Change the garbage collector settings (e.g., adjust thresholds)
gc.set_threshold(700, 10, 10)
Reevaluate Object Lifecycles
Some objects may remain in memory even after they are no longer needed. In such cases, review the object’s lifecycle and delete references to it at an appropriate time.
Memory Leaks Caused by Circular References
Problem Overview
Circular references occur when two or more objects reference each other. Their reference counts never reach zero on their own, so reference counting alone cannot free them; the cyclic garbage collector must reclaim them.
Solutions
- Use weak references (the weakref module) to avoid creating strong reference cycles.
- Run the garbage collector manually to reclaim cyclic garbage.
Code Example
import weakref
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

a = Node("A")
b = Node("B")

# Store weak references to avoid a strong reference cycle;
# call the reference (e.g., a.next()) to get the object back, or None if it is gone
a.next = weakref.ref(b)
b.next = weakref.ref(a)
When Memory Profiling Tools Do Not Work
memory_profiler Errors
When using memory_profiler, the @profile decorator may not work as expected. This is usually caused by not running the script through the profiler.
Solution
- Run the script with the -m memory_profiler option:
python -m memory_profiler your_script.py
- Ensure that the target function is actually decorated with @profile.
psutil Errors
If psutil cannot retrieve memory information, there may be an issue with the library version or the environment.
Solution
- Upgrade psutil to the latest version:
pip install --upgrade psutil
- Verify that you are retrieving process information correctly:
import psutil

process = psutil.Process()
print(process.memory_info())
Handling Memory Exhaustion Errors
Problem Overview
When handling large datasets, programs may run out of memory and raise a MemoryError.
Solutions
- Reduce data size: delete unnecessary data and use efficient data structures.
# Use a generator instead of materializing 10**8 values
large_data = (x for x in range(10**8))
- Process in chunks: split data into smaller chunks to reduce memory consumption at any one time.
# 'data', 'chunk_size', and 'process_data' are assumed to be defined elsewhere
for start in range(0, len(data), chunk_size):
    process_data(data[start:start + chunk_size])
- Leverage external storage: store data on disk instead of in memory (e.g., SQLite, HDF5).
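The chunked-processing idea can be made concrete with a small generator (process_in_chunks and its per-chunk sum are illustrative names, not from a library):

```python
from itertools import islice

def process_in_chunks(iterable, chunk_size):
    """Yield one result per chunk so only a single chunk is in memory at a time."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        yield sum(chunk)  # per-chunk work; sum() is just a placeholder

# Stream one million numbers without materializing them all at once
totals = list(process_in_chunks(range(1_000_000), chunk_size=100_000))
print(len(totals))  # 10
```

Because the source is consumed lazily, peak memory is bounded by one chunk rather than the full dataset.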
Summary of This Section
- Use garbage collection and lifecycle management to properly control memory usage.
- If circular references or tool errors occur, they can be resolved with weak references and proper configuration.
- Memory exhaustion errors can be avoided by revisiting data structures, using chunked processing, and leveraging external storage.
6. Practical Example: Measuring Memory Usage in Python Scripts
Here we present a concrete example of measuring memory usage within a Python script using the tools and techniques discussed so far. Through this practical example, you will learn how to analyze and optimize memory usage.
Sample Scenario: Comparing the Memory Usage of Lists and Dictionaries
Code Example
The following script measures the memory usage of lists and dictionaries using sys.getsizeof() and memory_profiler.
import sys
from memory_profiler import profile
@profile
def compare_memory_usage():
    # Create a list
    list_data = [i for i in range(100000)]
    print(f"List memory usage: {sys.getsizeof(list_data) / 1024**2:.2f} MB")

    # Create a dictionary
    dict_data = {i: i for i in range(100000)}
    print(f"Dictionary memory usage: {sys.getsizeof(dict_data) / 1024**2:.2f} MB")
    return list_data, dict_data

if __name__ == "__main__":
    compare_memory_usage()
Execution Steps
- If memory_profiler is not installed, install it first:
pip install memory-profiler
- Run the script with memory_profiler:
python -m memory_profiler script_name.py
Sample Output
Line #    Mem usage    Increment   Line Contents
------------------------------------------------
     5     13.2 MiB     13.2 MiB   @profile
     6     17.6 MiB      4.4 MiB   list_data = [i for i in range(100000)]
     9     22.2 MiB      4.6 MiB   dict_data = {i: i for i in range(100000)}
List memory usage: 0.76 MB
Dictionary memory usage: 3.05 MB
From this example, you can see that dictionaries consume more memory than lists holding the same number of items. This provides guidance for selecting the appropriate data structure based on application requirements.
Sample Scenario: Monitoring Overall Process Memory Usage
Code Example
The following script uses psutil to monitor the overall process memory usage in real time.
import psutil
import time
def monitor_memory_usage():
    process = psutil.Process()
    print(f"Initial memory usage: {process.memory_info().rss / 1024**2:.2f} MB")

    # Simulate memory consumption
    data = [i for i in range(10000000)]
    print(f"Memory usage during processing: {process.memory_info().rss / 1024**2:.2f} MB")

    del data
    time.sleep(2)  # Give the allocator and OS a moment to reclaim memory
    print(f"Memory usage after data deletion: {process.memory_info().rss / 1024**2:.2f} MB")

if __name__ == "__main__":
    monitor_memory_usage()
Execution Steps
- If psutil is not installed, install it first:
pip install psutil
- Run the script:
python script_name.py
Sample Output
Initial memory usage: 12.30 MB
Memory usage during processing: 382.75 MB
Memory usage after data deletion: 13.00 MB
From these results, you can observe how large amounts of data consume memory and how memory is released once unnecessary objects are deleted.
Key Points of This Section
- To measure memory usage, it is important to combine tools such as sys.getsizeof(), memory_profiler, and psutil appropriately.
- Visualizing data-structure sizes and overall process memory usage helps identify bottlenecks and enables efficient program design.

7. Summary and Next Steps
Key Points of the Article
- Fundamentals of Python Memory Management
- Python automatically manages memory using reference counting and garbage collection.
- Proper design is required to prevent issues caused by circular references.
- How to Check Memory Usage
- Using sys.getsizeof(), you can check the memory size of individual objects.
- Tools such as memory_profiler and psutil allow detailed measurement of memory consumption for functions or entire processes.
- How to Optimize Memory Usage
- Using generators and efficient data structures (e.g., NumPy arrays) can reduce memory consumption when processing large data sets.
- Deleting unnecessary objects and leveraging the garbage collector helps prevent memory leaks.
- Applying in Practical Examples
- Through real code, we learned the steps for measuring memory and optimization techniques.
- We practiced examples comparing memory usage of lists versus dictionaries and monitoring memory for an entire process.
Next Steps
- Apply to Your Own Projects
- Incorporate the methods and tools introduced in this article into your everyday Python projects.
- For example, try memory_profiler on scripts handling large data to pinpoint memory-intensive sections.
- Learn More Advanced Memory Management
- The official Python documentation contains detailed information on memory management and the gc module; refer to it if interested.
- Utilize External Tools and Services
- In large-scale projects, the profiling features of py-spy or PyCharm enable more detailed analysis.
- When running in cloud environments, take advantage of the monitoring tools offered by AWS and Google Cloud.
- Continuous Code Review and Improvement
- If developing in a team, discuss memory usage during code reviews to increase optimization opportunities.
- Cultivating coding habits that prioritize memory efficiency yields long‑term benefits.