Python Pointers: Memory Management and References for Beginners

1. Introduction

Python is widely used as a simple yet powerful programming language. One reason it is embraced by developers ranging from beginners to professionals is its intuitive syntax and extensive libraries. However, when learning about Python’s internal workings and memory management, many people become confused by the concept of “pointers.” It is sometimes said that “Python has no pointers,” but it is important to understand pointer-like behavior. Python does not have explicit pointers like C, but variables act as “references” to objects. These “references” form the basis of memory management and object manipulation in Python. This article provides a detailed explanation of the pointer-like concepts in Python and how they are realized. It focuses particularly on the following points:
  • How Python variables reference objects
  • The difference between pass-by-reference and pass-by-value
  • Python’s unique memory management mechanism
  • Understanding pointers through comparison with C
The content is designed for a wide audience, from Python beginners to intermediate developers. Through this article, we aim to help you correctly understand Python’s memory mechanisms and pointer concepts, and apply them to real-world programming. In the next section, we will first delve into the relationship between Python variables and objects.

2. Python Variables and Objects

Understanding the relationship between variables and objects is crucial for grasping Python. In Python, all data is treated as objects, and variables function as references that point to those objects. By understanding this mechanism, you can picture behavior similar to pointers in Python.

2.1 Variables are References to Objects

In Python, a variable does not hold data directly. Instead, it holds a reference to an object. For example, consider the following code:
x = 10
y = x
In this case, the variable x holds a reference to the integer object 10. When you then execute y = x, y also references the same object 10. The key point is that y does not receive a copy of the value; x and y are two names for the same object. In Python, you can use the id() function to see which object a variable references. Here’s an example:
x = 10
y = x

print(id(x))  # ID of the object x references (its memory address in CPython)
print(id(y))  # ID of the object y references (same as x)
Running this code shows that id(x) and id(y) return the same value, confirming that both variables reference the same object.
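As a small complementary sketch, the is operator compares identity directly, while == compares values. The variable names below are illustrative and not part of the original example.

```python
# Identity (is) versus equality (==); variable names are illustrative.
first = [1, 2, 3]
alias = first          # second name for the same list object
clone = [1, 2, 3]      # a separate list that happens to hold equal values

same_object = alias is first        # True: both names reference one object
equal_values = clone == first       # True: contents compare equal
distinct_object = clone is first    # False: two different objects

print(same_object, equal_values, distinct_object)
```

Comparing with is gives the same answer as comparing the two id() values, without printing the numbers.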

2.2 Immutable and Mutable Data Types

Python’s data types fall into two broad categories: immutable (unchangeable) and mutable (changeable). This distinction is an important point for understanding variable behavior and reference mechanics.
Immutable data types:
  • Data types whose contents cannot be changed.
  • Examples: integers (int), floating-point numbers (float), strings (str), tuples (tuple).
  • An operation that appears to change the value actually yields a different object, and the variable is rebound to it.
Here’s an example of immutable data types:
x = 10
print(id(x))  # Get the object's ID
x += 1        # Assign a new value
print(id(x))  # Confirm that it references a different object
In this code, x += 1 does not alter the object 10; it produces a different object, and the variable x then references that object.
Mutable data types:
  • Data types whose contents can be changed.
  • Examples: lists (list), dictionaries (dict), sets (set).
  • Modifications update the same object in place.
Here’s an example of mutable data types:
my_list = [1, 2, 3]
print(id(my_list))  # Get the object's ID
my_list.append(4)   # Append a value to the list
print(id(my_list))  # Confirm that it still references the same object
In this code, adding a value to the list my_list does not create a new object; the same object is updated.

2.3 Visual Supplement to the Concept

Visualizing the difference between immutable and mutable helps clarify the concept:
  • Immutable: Variable → Object A → New Object B (after change)
  • Mutable: Variable → Object → Change within the same object
By grasping this difference, you can gain a deeper understanding of how Python variables manage memory.
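The two cases above can be observed with the += operator, which rebinds the name for an immutable tuple but updates a mutable list in place. This is a minimal sketch with illustrative variable names.

```python
# += on an immutable tuple versus a mutable list (illustrative names).
point = (1, 2)
point_id_before = id(point)
point += (3,)                      # builds a new tuple and rebinds point
tuple_rebound = id(point) != point_id_before

items = [1, 2]
items_id_before = id(items)
items += [3]                       # extends the existing list in place
list_same_object = id(items) == items_id_before

print(tuple_rebound, list_same_object)
```

Both flags come out True: the tuple name now refers to a different object, while the list name still refers to the original one.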

3. How Arguments Are Passed in Functions

In Python, every argument is passed as a reference to an object: the parameter inside the function becomes another name for the same object the caller passed (this is often called “pass by object reference” or “call by sharing”). What the function can do with that object, however, depends on whether it is immutable or mutable. This section explains Python’s argument passing mechanism with concrete examples.

3.1 Difference Between Pass-by-Reference and Pass-by-Value

Generally, argument passing in programming languages is classified into the following two types:
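  • Pass-by-Value: The function receives a copy of the argument’s value, so changes inside the function do not affect the caller’s variable (e.g., in C).
  • Pass-by-Reference: The function receives a reference to the caller’s variable itself, so even assigning to the parameter changes the caller’s variable (e.g., reference parameters in C++).
In Python, all arguments are passed as “references” to objects, and the reference itself is copied. Assigning to a parameter only rebinds the local name, so with immutable objects the behavior looks like pass-by-value, while in-place changes to a mutable object are visible to the caller.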

3.2 Behavior of Immutable Objects

When an immutable object is passed to a function, it cannot be modified inside the function. Assigning a new value creates a new object within the function, leaving the original object unaffected. Here is a concrete example:
def modify_number(num):
    num += 10  # A new object is created
    print(f"Inside function value: {num}")

x = 5
modify_number(x)
print(f"Outside function value: {x}")
Output:
Inside function value: 15
Outside function value: 5
In this example, although num is modified inside the function, the variable x outside the function is not affected. This is because integers (int type) are immutable, causing num to refer to a new object.

3.3 Behavior of Mutable Objects

Conversely, when a mutable object is passed to a function, its contents can be directly modified inside the function. In this case, the variable outside the function is also affected. Here is an example:
def modify_list(lst):
    lst.append(4)  # Directly modify the original list

my_list = [1, 2, 3]
modify_list(my_list)
print(f"Outside function list: {my_list}")
Output:
Outside function list: [1, 2, 3, 4]
In this example, adding a value to the list my_list inside the function results in the change being reflected in the list outside the function. This is because lists are mutable and are operated on directly via references.
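One case the examples above do not cover is assigning a new object to a parameter that refers to a mutable object. The sketch below (the function names rebind and mutate are hypothetical) suggests that rebinding the parameter leaves the caller’s list alone, while an in-place change does not.

```python
# Rebinding a parameter versus mutating the object it references.
def rebind(lst):
    lst = [0]          # only the local name lst now points elsewhere
    return lst

def mutate(lst):
    lst.append(4)      # changes the object the caller also references

my_list = [1, 2, 3]
rebind(my_list)
after_rebind = list(my_list)   # snapshot: [1, 2, 3]

mutate(my_list)
after_mutate = list(my_list)   # snapshot: [1, 2, 3, 4]

print(after_rebind, after_mutate)
```

So mutability alone does not decide the outcome; what matters is whether the function changes the object or merely points its local name at another one.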

3.4 Practical Example: Shallow Copy vs Deep Copy

Understanding reference passing also requires knowing the difference between shallow and deep copies. Especially when dealing with nested objects, their behavior differs, so it’s important to understand each. Below is an example of shallow vs deep copy using a list:
import copy

original = [1, [2, 3]]
shallow_copy = copy.copy(original)  # Shallow copy
deep_copy = copy.deepcopy(original)  # Deep copy

# Behavior of shallow copy
shallow_copy[1].append(4)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Shallow copy: {shallow_copy}")  # [1, [2, 3, 4]]

# Behavior of deep copy
deep_copy[1].append(5)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Deep copy: {deep_copy}")  # [1, [2, 3, 4, 5]]
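Why the shallow copy behaves this way can be checked with identity comparisons: the outer list is new, but the nested list is shared. A minimal sketch, reusing the same data under fresh variable names:

```python
import copy

original = [1, [2, 3]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

outer_is_new = shallow is not original          # True: a new outer list
inner_is_shared = shallow[1] is original[1]     # True: same nested list
deep_inner_is_new = deep[1] is not original[1]  # True: nested list was copied too

print(outer_is_new, inner_is_shared, deep_inner_is_new)
```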

3.5 Cautions and Best Practices

  • Manipulating Mutable Objects Inside Functions:
      • Avoid modifying objects more than necessary.
      • If you modify an object, explicitly note it in a comment.
  • When Passing Immutable Objects to Functions:
      • Consider using immutable data types to avoid unnecessary changes.

4. Pointer-like Operations in Python

Python does not have explicit pointers like C, but by understanding that variables reference objects, you can perform operations similar to pointers. This section explains the mechanism in depth through concrete examples that demonstrate pointer-like behavior in Python.

4.1 Checking Memory Addresses: id() Function

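Python’s id() function returns an integer that uniquely identifies an object for as long as that object exists; in CPython, this integer is the object’s memory address. This allows you to check whether variables reference the same object. Here is a concrete example: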
a = 42
b = a

print(id(a))  # display the ID of the object a references
print(id(b))  # ID of the object b references (same as a)
Example Output (the number differs from run to run):
139933764908112
139933764908112
From this result, you can confirm that a and b reference the same object.

4.2 Behavior When Referencing the Same Object

In Python, if two variables reference the same object, a change made through one affects the other. Let’s observe this behavior with the following example:
x = [1, 2, 3]
y = x

y.append(4)

print(f"x: {x}")  # [1, 2, 3, 4]
print(f"y: {y}")  # [1, 2, 3, 4]
In this code, y.append(4) modifies the list, and the change is reflected in x, which references the same list.

4.3 Preserving Object Independence: Using Copies

In some cases, you need to create an independent copy instead of referencing the same object. In Python, this can be achieved using shallow copies (copy.copy()) or deep copies (copy.deepcopy()). Here is an example that preserves object independence using copies:
import copy

original = [1, [2, 3]]
shallow_copy = copy.copy(original)  # shallow copy
deep_copy = copy.deepcopy(original)  # deep copy

# Example where changes to the shallow copy affect the original object
shallow_copy[1].append(4)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Shallow copy: {shallow_copy}")  # [1, [2, 3, 4]]

# Example where changes to the deep copy do not affect the original object
deep_copy[1].append(5)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Deep copy: {deep_copy}")  # [1, [2, 3, 4, 5]]

4.4 Function Pointer-like Mechanism: Function References

In Python, functions themselves are objects. Therefore, you can assign a function to a variable or pass it as an argument to another function, enabling function-pointer-like operations. Here is an example of function references:
def greet(name):
    return f"Hello, {name}!"

# Assign the function to another variable
say_hello = greet
print(say_hello("Python"))  # Hello, Python!

# Pass the function as an argument
def execute_function(func, argument):
    return func(argument)

print(execute_function(greet, "World"))  # Hello, World!
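A common use of function references is a lookup table that maps names to functions, much like an array of function pointers in C. The sketch below is illustrative; the operations table and the apply helper are hypothetical names.

```python
# A dispatch table: dictionary values are references to functions.
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

operations = {"add": add, "mul": multiply}   # hypothetical table

def apply(name, a, b):
    func = operations[name]    # look up the function object by name
    return func(a, b)          # call it through the reference

print(apply("add", 2, 3))   # 5
print(apply("mul", 2, 3))   # 6
```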

4.5 Reference Counting and Garbage Collection

Python’s memory management is automated via reference counting and garbage collection. When an object’s reference count drops to zero, the object is freed from memory. Here is an example of reference counting:
import sys

x = [1, 2, 3]
print(sys.getrefcount(x))  # reference count of x (one higher than expected, because the call itself temporarily holds a reference)

y = x
print(sys.getrefcount(x))  # reference count increases

del y
print(sys.getrefcount(x))  # reference count decreases
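To see an object actually being freed when its last reference disappears, a weak reference can act as an observer, since it does not count as a reference itself. This sketch assumes CPython, where the object is freed immediately; the Data class is hypothetical.

```python
import weakref

class Data:
    """Hypothetical placeholder class; plain lists cannot be weakly referenced."""

obj = Data()
observer = weakref.ref(obj)        # does not increase the reference count

alias = obj                        # second strong reference
del obj
alive_with_alias = observer() is not None   # still reachable through alias

del alias                          # last strong reference removed
freed = observer() is None         # CPython frees the object right away

print(alive_with_alias, freed)
```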

4.6 Cautions and Best Practices

  • Caution When Handling Mutable Objects:
      • Handle mutable objects carefully to avoid unintended side effects.
  • Choosing Between Shallow and Deep Copies:
      • Select the appropriate copy method based on the complexity of your data structures.
  • Understanding Garbage Collection:
      • Understand Python’s automatic memory management to prevent memory leaks.

5. Comparison with C Pointers

Python and C are languages that differ greatly in programming philosophy and memory management approaches. In particular, regarding “pointers,” C makes extensive use of explicit pointers, whereas Python abstracts pointers so developers don’t need to handle them directly. This section compares the differences between C and Python pointers to understand their respective characteristics and advantages.

5.1 Basics of Pointers in C

Pointers in C are essential for directly manipulating memory addresses. Pointers are used for the following purposes:
  • Storing and manipulating memory addresses
  • Passing function arguments by reference
  • Dynamic memory allocation
Below is an example of basic pointer operations in C:
#include <stdio.h>

int main(void) {
    int x = 10;
    int *p = &x;  // Store the address of x in pointer p

    printf("Variable x value: %d\n", x);
    printf("Variable x address: %p\n", (void *)p);  // %p expects a void pointer
    printf("Value pointed to by p: %d\n", *p);  // Access the value via the pointer

    return 0;
}
Example Output:
Variable x value: 10
Variable x address: 0x7ffee2dcb894
Value pointed to by p: 10
In C, the & (address operator) and * (indirection operator) are used to obtain memory addresses and manipulate values via pointers.

5.2 Pointer-like Features in Python

Python does not have explicit pointers like C, but variables hold references to objects, behaving in a pointer-like manner. The following code is an example of translating C pointer operations into Python:
x = 10
p = id(x)  # Identity of the object x references (its memory address in CPython)

print(f"Variable x value: {x}")
print(f"Variable x address: {p}")
In CPython, the id() function lets you view an object’s memory address, but the result is only an integer: you cannot dereference it or perform pointer arithmetic with it. This is because Python automates memory management.
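For readers who want something closer to a real C pointer, the standard ctypes module exposes C-compatible values and pointers to them. The following is only a sketch of that idea; it operates on a ctypes.c_int rather than on an ordinary Python int, which cannot be changed this way.

```python
import ctypes

x = ctypes.c_int(10)        # a C int stored in memory owned by ctypes
p = ctypes.pointer(x)       # roughly: int *p = &x;

value_before = p.contents.value   # roughly: *p, gives 10
p.contents.value = 20             # roughly: *p = 20;
value_after = x.value             # x itself now holds 20

print(value_before, value_after)
```

This is mainly useful when calling C libraries; everyday Python code rarely needs it.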

5.3 Differences in Memory Management

One major difference between C and Python lies in their memory management mechanisms.
C Memory Management:
  • Developers must explicitly allocate memory (e.g., malloc()) and free it (e.g., free()).
  • While memory management can be efficient, issues such as memory leaks and dangling pointers are more likely to occur.
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *ptr = malloc(sizeof(int));  // Allocate memory
    if (ptr == NULL) {
        return 1;  // Allocation failed
    }
    *ptr = 42;
    printf("Dynamically allocated value: %d\n", *ptr);
    free(ptr);  // Free memory
    return 0;
}
Python Memory Management:
  • Python automatically frees objects that are no longer referenced, using reference counting plus a garbage collector for reference cycles.
  • Since developers don’t need to manage memory explicitly, the code becomes simpler.

5.4 Comparison of Safety and Flexibility

  • Advantages of C Pointers:
      • High degree of freedom in memory manipulation, enabling efficient program construction.
      • Suitable for hardware and system-level development.
  • Disadvantages of C Pointers:
      • Incorrect operations can easily cause memory leaks and security vulnerabilities.
  • Advantages of Python’s Pointer-like Operations:
      • High safety, easy for beginners to use.
      • Garbage collection reduces the burden of memory management.
  • Disadvantages of Python’s Pointer-like Operations:
      • Lower freedom in memory manipulation makes low-level optimization difficult.

5.5 Summary of C and Python Pointers

Item               | C                      | Python
Pointer handling   | Explicit               | Abstracted
Memory management  | Manual (malloc / free) | Automatic (garbage collection)
Degree of freedom  | High                   | Limited
Safety             | Low (prone to bugs)    | High
Python emphasizes safety and ease of use, eliminating the need for low-level pointer manipulation. Conversely, C offers high flexibility, making it suitable for system and hardware-level development. Understanding these characteristics allows you to choose the appropriate language for your goals.

6. Caveats and Best Practices

Now that we have covered pointer-like operations and object reference mechanisms in Python, this section introduces common pitfalls in memory management and reference handling, along with best practices for avoiding them. By applying this knowledge, you can write efficient and safe code.

6.1 Beware of Side Effects with Mutable Objects

In Python, a mutable object such as a list or dictionary that is passed to a function can be modified unintentionally, because the function works on the same object as the caller. To prevent this side effect, keep the following points in mind.
Problem Example:
def add_item(lst, item):
    lst.append(item)

my_list = [1, 2, 3]
add_item(my_list, 4)
print(my_list)  # [1, 2, 3, 4] (unexpectedly modified)
In this case, modifying the list inside the function also affects the list outside the function.
Solution: If you don’t want to modify the original, copy the list inside the function, modify the copy, and return it.
def add_item(lst, item):
    new_lst = lst.copy()
    new_lst.append(item)
    return new_lst

my_list = [1, 2, 3]
new_list = add_item(my_list, 4)
print(my_list)  # [1, 2, 3] (original list remains unchanged)
print(new_list)  # [1, 2, 3, 4]

6.2 Choosing Shallow vs Deep Copies

When creating copies of objects, it is important to understand the difference between shallow and deep copies. Especially when dealing with nested data structures, their behavior differs, so you need to understand each.
Risks of Shallow Copies:
import copy

original = [1, [2, 3]]
shallow_copy = copy.copy(original)
shallow_copy[1].append(4)

print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Shallow copy: {shallow_copy}")  # [1, [2, 3, 4]]
Solution Using Deep Copy:
deep_copy = copy.deepcopy(original)
deep_copy[1].append(5)

print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Deep copy: {deep_copy}")  # [1, [2, 3, 4, 5]]
Choose whether to use a shallow or deep copy based on the complexity of the data structure and the processing requirements.

6.3 Understanding Garbage Collection and Caveats

In Python, garbage collection (GC) automatically frees unused objects, but not always immediately. Objects with circular references require special attention, because their reference counts never drop to zero on their own.
Example of Circular Reference:
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Create circular reference
a = Node(1)
b = Node(2)
a.next = b
b.next = a
In such cases, the garbage collector can detect the circular reference, but the objects are freed only when a collection pass runs, so memory usage may grow in the meantime.
Solution:
  • Design to avoid circular references.
  • Use weak references (the weakref module) for back-references so that they do not keep objects alive.
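The sketch below, which assumes CPython, shows that a cycle is not freed by del alone and is reclaimed only once the collector runs. The Node class mirrors the example above; the weak reference serves only as an observer, and automatic collection is paused to keep the demonstration predictable.

```python
import gc
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

gc.disable()                 # pause automatic collection for a predictable demo
a = Node(1)
b = Node(2)
a.next = b
b.next = a                   # circular reference

observer = weakref.ref(a)    # observes a without keeping it alive
del a, b
alive_after_del = observer() is not None   # the cycle keeps both nodes alive

gc.collect()                 # the cycle collector finds and frees the pair
freed_after_collect = observer() is None
gc.enable()

print(alive_after_del, freed_after_collect)
```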

6.4 Leveraging Immutable Objects

Actively using immutable data types helps prevent unwanted changes and side effects.
Benefits:
  • Preventing data changes improves code reliability.
  • Makes it safer to share data between threads, since the data cannot change underneath them.
For example, using a tuple instead of a list can prevent unintended modifications.
immutable_data = (1, 2, 3)
# immutable_data.append(4)  # AttributeError: 'tuple' object has no attribute 'append'
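One caveat worth sketching: a tuple only prevents changes to the tuple itself. If it contains a mutable object, that object can still be modified. The variable names below are illustrative.

```python
# Tuple immutability is shallow (illustrative names).
record = (1, [2, 3])

try:
    record[0] = 99            # assigning to a slot of the tuple is rejected
    slot_assignment_failed = False
except TypeError:
    slot_assignment_failed = True

record[1].append(4)           # the nested list is still mutable

print(slot_assignment_failed, record)
```

For data that must not change at any depth, the nested values need to be immutable as well, for example a tuple of tuples.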

6.5 Summary of Best Practices

  1. Use of Copies:
      • When passing mutable objects to functions, use shallow or deep copies as needed.
  2. Avoid Circular References:
      • Do not create overly complex object references.
      • Use weak references appropriately.
  3. Use Immutable Objects:
      • For data that does not need to be changed, choose an immutable data type.
  4. Monitor Memory Usage:
      • When handling large amounts of data, use sys.getsizeof() to check memory usage and aim for efficient processing.

7. FAQ

Here we provide concise and easy-to-understand answers to frequently asked questions about pointers and memory management in Python. Resolving these doubts can deepen your understanding of Python’s behavior and mechanisms.

Q1: Does Python have pointers?

A: Python does not have explicit pointers like in C. However, variables hold references to objects, and this reference mechanism works similarly to pointers. Using the id() function, you can check the identity of the object a variable refers to (in CPython, its memory address).

Q2: Why does passing a list to a function modify the original list?

A: Lists are mutable objects, and when passed to a function, a reference is passed. Thus, changes made to the list inside the function affect the list outside the function as well. If you want to avoid this, create a copy of the list and pass that instead. Example:
def modify_list(lst):
    lst.append(4)

my_list = [1, 2, 3]
modify_list(my_list)
print(my_list)  # [1, 2, 3, 4] (unexpectedly modified)
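The fix mentioned in the answer, passing a copy, might look like the following sketch. It reuses modify_list from the example and assumes a flat list, for which a shallow copy from list.copy() is enough.

```python
def modify_list(lst):
    lst.append(4)

my_list = [1, 2, 3]
working_copy = my_list.copy()   # shallow copy; fine for a flat list
modify_list(working_copy)

print(my_list)        # [1, 2, 3]
print(working_copy)   # [1, 2, 3, 4]
```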

Q3: What happens when you pass an immutable object to a function?

A: When you pass an immutable object (e.g., integers, strings, tuples) to a function, assigning a new value inside the function does not affect the original object. This is because immutable objects cannot be changed. Example:
def modify_number(num):
    num += 10  # a new object is created

x = 5
modify_number(x)
print(x)  # 5 (the original value remains unchanged)

Q4: How do you check an object’s memory usage in Python?

A: In Python, you can use the sys module’s getsizeof() function to check the memory size of an object in bytes. Note that it reports only the object itself, not the objects it refers to, so for a list the result excludes the size of the elements. Example:
import sys

x = [1, 2, 3]
print(sys.getsizeof(x))  # size of the list object itself, in bytes
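Because sys.getsizeof() counts only the container itself, estimating the footprint of a nested structure requires walking its contents. The helper below, total_size, is a hypothetical sketch that handles only lists, tuples, sets, and dicts, and it gives an estimate rather than an exact figure.

```python
import sys

def total_size(obj, seen=None):
    """Rough recursive size in bytes (hypothetical helper, not a stdlib function)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:            # count each object only once
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    return size

data = [[1, 2, 3], [4, 5, 6]]
shallow_bytes = sys.getsizeof(data)   # outer list only
deep_bytes = total_size(data)         # outer list plus everything inside

print(shallow_bytes, deep_bytes)
```

The exact numbers depend on the Python version and platform, so no expected output is shown.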

Q5: What is the difference between shallow copy and deep copy?

A: A shallow copy copies only the top-level object, keeping references to nested objects. In contrast, a deep copy creates a completely new copy, including all nested objects. Example:
import copy

original = [1, [2, 3]]
shallow_copy = copy.copy(original)
shallow_copy[1].append(4)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Shallow copy: {shallow_copy}")  # [1, [2, 3, 4]]

deep_copy = copy.deepcopy(original)
deep_copy[1].append(5)
print(f"Original: {original}")  # [1, [2, 3, 4]]
print(f"Deep copy: {deep_copy}")  # [1, [2, 3, 4, 5]]

Q6: How does Python’s garbage collection work?

A: In CPython, memory is reclaimed in two ways. Reference counting frees an object as soon as its reference count drops to zero. In addition, a cyclic garbage collector (exposed through the gc module) periodically finds and frees groups of objects that only reference each other. Example:
import gc

x = [1, 2, 3]
y = x
del x
del y

gc.collect()  # manually trigger garbage collection

Q7: How to avoid circular references in Python?

A: To avoid circular references, you can use the weakref module. A weak reference does not increase an object’s reference count, so it does not keep the object alive and no strong reference cycle is formed. Example:
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

a = Node(1)
b = Node(2)
a.next = weakref.ref(b)  # weak reference; call a.next() to get b (or None once b is gone)
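A weak reference is called like a function to obtain the object it points to, and it returns None once that object has been freed. The sketch below continues the idea with the same Node class and assumes CPython’s immediate reference-count cleanup.

```python
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

a = Node(1)
b = Node(2)
a.next = weakref.ref(b)        # a does not keep b alive

target = a.next()              # dereference: returns b while it exists
value_while_alive = target.value
del target                     # drop the temporary strong reference

del b                          # last strong reference to the node is gone
dangling = a.next() is None    # the weak reference now returns None

print(value_while_alive, dangling)
```

Code that follows a weak reference should therefore check for None before using the result.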

8. Summary

In this article, based on the theme “Pointers in Python,” we have provided a detailed explanation of Python variables, references, and memory management. Although explicit pointers like those in C do not exist, Python offers pointer-like behavior in an abstracted form: variables function as references to objects.

Key Points of This Article

  1. Python Variables and Objects
      • In Python, variables do not hold objects directly; they hold references to them.
      • The distinction between immutable (unchangeable) and mutable (changeable) data types is an important point that affects how references and argument passing behave.
  2. How Arguments Are Passed in Functions
      • In Python, arguments are always passed as references to objects. Because assigning to a parameter only rebinds a local name, immutable objects can make it appear as if arguments are passed by value.
  3. Pointer-like Operations in Python
      • Using the id() function lets you view an object’s identity (its memory address in CPython) and understand how references work.
      • When dealing with mutable objects, it’s important to understand the difference between shallow and deep copies and use them appropriately.
  4. Comparison with C Language Pointers
      • C pointers are powerful tools that can manipulate memory directly, whereas Python abstracts this to provide safe and intuitive memory management.
  5. Cautions and Best Practices
      • Create copies as needed to avoid side effects with mutable objects.
      • Leverage immutable data types proactively to prevent unexpected changes.
      • Use weak references (weakref) appropriately to avoid circular references.
  6. FAQ Section
      • By addressing common questions about Python pointers and memory management, we provided readers with practical knowledge they can apply.

The Importance of a Pointer-like Mindset in Python

Python’s abstracted memory management and reference mechanisms are safe and efficient for developers. However, understanding the underlying mechanisms of this abstraction equips you with skills to improve performance and prevent bugs. In particular, correctly grasping how to handle mutable objects and shallow versus deep copies is essential when working with complex data structures.

Next Steps

If you want to deepen your understanding of Python’s pointer-like concepts and memory management, we recommend studying the following topics:
  • Details of Python’s Garbage Collection: Learn about reference counting and how to resolve circular references.
  • Memory Optimization Techniques: Explore how to use sys.getsizeof() and the gc module to efficiently process large amounts of data.
  • Combining with C or Low-Level Languages: Create hybrid programs that blend Python’s flexibility with C’s efficiency.
We hope that through this article you have built a foundation for understanding Python’s pointer-like concepts and memory management, enabling you to write more efficient and safe code.