Retrieving filenames from a folder in Python is a very useful skill for beginner to intermediate programmers. Being able to get filenames can streamline bulk data processing and file operations, helping with automation and data organization. In this article, we explain step-by-step how to retrieve filenames from a folder using Python. We include code examples and clear explanations so even first-time learners can follow along.
2. Basic filename retrieval using the os module
If you use Python’s standard library os module, you can easily retrieve file names in a folder. This approach is suitable even for those new to Python, featuring a simple and intuitive code structure.
Basic usage of the os.listdir() function
os.listdir() returns a list of the names of all items (files and folders) in the specified directory.
import os
# Get files and directories in the folder
folder_path = "sample_folder"
items = os.listdir(folder_path)
print("Items in the folder:")
print(items)
When you run the above code, the file and directory names in the specified folder are displayed as a list.
How to extract only filenames
The retrieved list may include directory names. By using os.path.isfile(), you can extract only the files.
import os
# Get only file names in the folder
folder_path = "sample_folder"
files = [f for f in os.listdir(folder_path) if os.path.isfile(os.path.join(folder_path, f))]
print("Files in the folder:")
print(files)
In this code, each item is checked to see if it is a file, and only the file names are stored in the list.
Example
For example, if you want to create a list of image files saved in a folder, you can use it as follows.
import os
# Get image files
folder_path = "images"
image_files = [f for f in os.listdir(folder_path) if f.endswith((".png", ".jpg", ".jpeg"))]
print("Image file list:")
print(image_files)
Advantages of this method
Simple and easy to understand.
Can be implemented using only Python’s standard library.
Ideal for file operations in small folders.
Notes
Hidden files or files with unusual names may be included, so filter as needed.
Using it on large directories may affect performance.
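Both notes above can be addressed with os.scandir(), which yields entries one at a time and lets you check the entry type without building a separate path. The following is a minimal sketch, not a definitive implementation: the helper name list_visible_files is hypothetical, "hidden" is assumed to mean the Unix convention of a leading dot (the Windows hidden attribute is not checked), and a temporary folder stands in for sample_folder so the code runs anywhere.

```python
import os
import tempfile


def list_visible_files(folder_path):
    """Return sorted names of regular files whose names do not start with a dot."""
    with os.scandir(folder_path) as entries:
        return sorted(
            entry.name
            for entry in entries
            if entry.is_file() and not entry.name.startswith(".")
        )


# Demo on a throwaway folder (an assumption made so the sketch is self-contained)
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.txt", "b.csv", ".hidden"):
        open(os.path.join(demo_dir, name), "w").close()
    os.mkdir(os.path.join(demo_dir, "subfolder"))
    visible = list_visible_files(demo_dir)

print(visible)  # ['a.txt', 'b.csv']
```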
3. Using the glob module for wildcard searches
If you use Python’s glob module, you can efficiently retrieve filenames that match a specific pattern. Unlike the os module, it is characterized by the ability to flexibly set filename search criteria using wildcards.
Basic usage of the glob module
By using glob.glob(), you can retrieve files and directories that match the specified pattern.
import glob
# Get all files in the folder
folder_path = "sample_folder"
files = glob.glob(f"{folder_path}/*")
print("Files in the folder:")
print(files)
This code retrieves the paths of all items (files and directories) in the specified folder.
Retrieve files matching a specific pattern
For example, to retrieve only files with a specific extension, write the following.
import glob
# Get files with the specified extension
folder_path = "sample_folder"
text_files = glob.glob(f"{folder_path}/*.txt")
print("Text files:")
print(text_files)
In this example, only files with the .txt extension are retrieved. Similarly, to specify multiple extensions, write as follows.
import glob
# Get files by specifying multiple extensions
folder_path = "sample_folder"
files = glob.glob(f"{folder_path}/*.txt") + glob.glob(f"{folder_path}/*.csv")
print("Text and CSV files:")
print(files)
Retrieving files recursively
By using the recursive=True option of the glob module, you can retrieve files including those in subdirectories.
import glob
# Recursively get all items (files and directories), including those in subdirectories
folder_path = "sample_folder"
files = glob.glob(f"{folder_path}/**/*", recursive=True)
print("All items, including subdirectories:")
print(files)
Examples of wildcard usage
*: Matches any string of zero or more characters.
?: Matches any single character.
[abc]: Matches any single character listed within the brackets.
import glob
# Get any files whose filename starts with "data"
folder_path = "sample_folder"
files = glob.glob(f"{folder_path}/data*")
print("Files starting with 'data':")
print(files)
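The ? and [abc] wildcards are easier to understand with a concrete case. The sketch below uses made-up file names in a temporary folder (both are assumptions for illustration) and calls glob.escape() on the folder part so that only the file-name part is treated as a pattern.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("data1.csv", "data2.csv", "data10.csv",
                 "log_a.txt", "log_b.txt", "log_z.txt"):
        open(os.path.join(demo_dir, name), "w").close()

    base = glob.escape(demo_dir)  # treat the folder path literally

    # "?" matches exactly one character, so data10.csv is excluded
    single = sorted(os.path.basename(p)
                    for p in glob.glob(os.path.join(base, "data?.csv")))

    # "[ab]" matches one character that is either "a" or "b"
    bracket = sorted(os.path.basename(p)
                     for p in glob.glob(os.path.join(base, "log_[ab].txt")))

print(single)   # ['data1.csv', 'data2.csv']
print(bracket)  # ['log_a.txt', 'log_b.txt']
```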
Advantages and use cases
Allows flexible pattern specification and is tailored for filename searches.
Compared to the os module, code tends to be more concise.
Recursive searches are easy to perform.
Notes
An empty list is returned when nothing matches the pattern, so check the results.
Recursive searches can return a very large number of files, so apply appropriate filtering.
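One way to act on both notes is glob.iglob(), which returns an iterator instead of a full list, combined with an explicit check for an empty result. This is a sketch under stated assumptions: the helper first_matches and its limit parameter are hypothetical names, and the demo files are created in a temporary folder.

```python
import glob
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern` without building the full list."""
    matches = []
    for path in glob.iglob(pattern, recursive=True):
        matches.append(path)
        if len(matches) >= limit:
            break
    return matches


with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(5):
        open(os.path.join(demo_dir, f"file{i}.txt"), "w").close()
    base = glob.escape(demo_dir)
    some_txt = first_matches(os.path.join(base, "**", "*.txt"), limit=3)
    no_match = first_matches(os.path.join(base, "**", "*.xyz"), limit=3)

if not no_match:
    print("No files matched the pattern.")
print(len(some_txt))  # 3
```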
4. The pathlib module: object-oriented path manipulation
Python’s pathlib module provides a modern, intuitive way to work with file paths in an object-oriented manner. This improves code readability and makes operations simpler.
Basic usage of the pathlib module
Using the Path object from the pathlib module, you can easily obtain the entries in a directory.
from pathlib import Path
# Get items in the folder
folder_path = Path("sample_folder")
items = list(folder_path.iterdir())
print("Items in the folder:")
print(items)
In this code, all items (files and folders) in the folder are returned as a list of Path objects.
Extracting only filenames
To get only files from the directory entries, use the is_file() method.
from pathlib import Path
# Get files in the folder
folder_path = Path("sample_folder")
files = [f for f in folder_path.iterdir() if f.is_file()]
print("Files in the folder:")
print(files)
In this code, directories are excluded and only files are stored in the list.
Filtering by extension
To retrieve only files with a specific extension, specify the extension as a condition.
from pathlib import Path
# Get files with a specific extension in the folder
folder_path = Path("sample_folder")
text_files = [f for f in folder_path.iterdir() if f.is_file() and f.suffix == ".txt"]
print("Text files:")
print(text_files)
The suffix attribute makes it easy to get a file’s extension.
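Alongside suffix, Path objects expose name, stem, and suffixes. The short sketch below uses PurePosixPath, which only manipulates the path string and never touches the disk, so the example paths are purely illustrative.

```python
from pathlib import PurePosixPath

path = PurePosixPath("sample_folder/archive.tar.gz")
print(path.name)      # archive.tar.gz
print(path.stem)      # archive.tar
print(path.suffix)    # .gz
print(path.suffixes)  # ['.tar', '.gz']

# Lower-casing the suffix makes the extension check case-insensitive
photo = PurePosixPath("sample_folder/PHOTO.JPG")
is_jpeg = photo.suffix.lower() in {".jpg", ".jpeg"}
print(is_jpeg)  # True
```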
Retrieving files recursively
To include files in subdirectories, use the rglob() method.
from pathlib import Path
# Get all files, including those in subdirectories
folder_path = Path("sample_folder")
files = [p for p in folder_path.rglob("*") if p.is_file()]  # rglob("*") also yields directories
print("All files in subdirectories:")
print(files)
Additionally, you can recursively search for files that match a specific pattern.
from pathlib import Path
# Get files with a specific extension in subdirectories
folder_path = Path("sample_folder")
text_files = list(folder_path.rglob("*.txt"))
print("Text files retrieved recursively:")
print(text_files)
Benefits and use cases
Write intuitive, object-oriented code.
Recursive operations and filtering are easy to perform.
Available since Python 3.4 and recommended in recent Python versions.
Notes
Not available on unsupported Python versions (earlier than 3.4).
Be mindful of performance when operating on very large directories.
5. How to Recursively Retrieve Files in Subdirectories
When retrieving files from a folder, you may want to include files in subdirectories and retrieve them recursively. This section explains techniques for recursively obtaining files using the os module and other methods.
Using os.walk()
os.walk() recursively traverses the specified directory and its subdirectories, returning the folder name, subfolder names, and file names.
import os
# Retrieve all files in subdirectories
folder_path = "sample_folder"
for root, dirs, files in os.walk(folder_path):
    for file in files:
        print(f"File: {os.path.join(root, file)}")
Code explanation:
os.walk() returns a tuple of (folder path, subfolder list, file list).
Uses os.path.join() to construct the full file path.
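Because os.walk() hands back the subfolder list on each step, you can also edit that list in place to stop the traversal from entering certain folders. The sketch below assumes a hypothetical helper named walk_skipping and a made-up folder layout; it relies on the default top-down traversal order.

```python
import os
import tempfile


def walk_skipping(folder_path, skip_dirs):
    """Yield file paths under folder_path without descending into skip_dirs."""
    for root, dirs, files in os.walk(folder_path):
        # Editing dirs in place tells os.walk() which subfolders to visit next
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for file in files:
            yield os.path.join(root, file)


with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "docs"))
    os.makedirs(os.path.join(demo_dir, ".git"))
    for rel in ("readme.txt",
                os.path.join("docs", "guide.txt"),
                os.path.join(".git", "config")):
        open(os.path.join(demo_dir, rel), "w").close()
    found = sorted(os.path.relpath(p, demo_dir)
                   for p in walk_skipping(demo_dir, {".git"}))

print(found)  # the .git/config entry is not listed
```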
Collect all files into a list
If you want to store all retrieved files in a list, you can write it like this:
import os
# Store all files in subdirectories in a list
folder_path = "sample_folder"
all_files = []
for root, dirs, files in os.walk(folder_path):
    for file in files:
        all_files.append(os.path.join(root, file))
print("All files in subdirectories:")
print(all_files)
Retrieve only files with a specific extension
For example, if you only want to retrieve .txt files, add the following condition:
import os
# Retrieve files with a specific extension in subdirectories
folder_path = "sample_folder"
txt_files = []
for root, dirs, files in os.walk(folder_path):
    for file in files:
        if file.endswith(".txt"):
            txt_files.append(os.path.join(root, file))
print("Text files:")
print(txt_files)
Recursive retrieval using the glob module
You can easily perform recursive searches using ** from the glob module.
It’s also simple to retrieve only files with a specific extension.
import glob
# Recursively retrieve files with a specific extension
folder_path = "sample_folder"
txt_files = glob.glob(f"{folder_path}/**/*.txt", recursive=True)
print("Text files retrieved recursively:")
print(txt_files)
Recursive retrieval using the pathlib module
The rglob() method of the pathlib module provides an intuitive way to retrieve files recursively.
from pathlib import Path
# Recursively retrieve all files
folder_path = Path("sample_folder")
all_files = [p for p in folder_path.rglob("*") if p.is_file()]  # rglob("*") also yields directories
print("Files retrieved recursively:")
print(all_files)
It’s also easy to specify a particular extension.
from pathlib import Path
# Recursively retrieve files with a specific extension
folder_path = Path("sample_folder")
txt_files = list(folder_path.rglob("*.txt"))
print("Text files retrieved recursively:")
print(txt_files)
Comparison of the methods
| Method | Characteristics | Advantages | Considerations |
|---|---|---|---|
| os.walk() | Manually implement recursive file traversal | Highly customizable | Code can become lengthy |
| glob | Allows concise wildcard-based expressions | Enables recursive searches with short code | Requires crafting appropriate wildcard patterns |
| pathlib.rglob() | Allows an object-oriented, Pythonic style | Highly readable and fits modern Python code | Only available on Python 3.4 and later |
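To see the three approaches side by side, the sketch below runs each of them on the same small tree and checks that they return the same set of .txt paths. The helper names and the temporary folder are assumptions for illustration. One difference worth knowing: by default glob.glob() does not match names that begin with a dot unless the pattern also begins with a dot, while os.walk() and rglob() include them, so the results can differ when hidden files are present.

```python
import glob
import os
import tempfile
from pathlib import Path


def txt_with_walk(folder):
    return {
        os.path.join(root, name)
        for root, _, names in os.walk(folder)
        for name in names
        if name.endswith(".txt")
    }


def txt_with_glob(folder):
    pattern = os.path.join(glob.escape(folder), "**", "*.txt")
    return set(glob.glob(pattern, recursive=True))


def txt_with_rglob(folder):
    return {str(p) for p in Path(folder).rglob("*.txt")}


with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "sub"))
    for rel in ("a.txt", "b.csv", os.path.join("sub", "c.txt")):
        open(os.path.join(demo_dir, rel), "w").close()
    by_walk = txt_with_walk(demo_dir)
    by_glob = txt_with_glob(demo_dir)
    by_rglob = txt_with_rglob(demo_dir)

print(by_walk == by_glob == by_rglob)  # True
print(len(by_walk))                    # 2
```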
6. Filtering File Names by Specific Conditions and Use Cases
Filtering files by specific conditions helps organize the list of retrieved file names and is useful when performing data operations for a particular purpose. This section explains how to narrow down files in Python using file extensions, file names, and regular expressions.
Filtering by File Extension
To get only files with a specified extension, you can use list comprehensions or conditional checks.
Example: Using the os module
import os
# Get files with the specified extension
folder_path = "sample_folder"
txt_files = [f for f in os.listdir(folder_path) if f.endswith(".txt")]
print("Text files:")
print(txt_files)
Example: Using the pathlib module
Using the pathlib module, you can write extension filtering more concisely.
from pathlib import Path
# Get files with the .txt extension
folder_path = Path("sample_folder")
txt_files = [f for f in folder_path.iterdir() if f.suffix == ".txt"]
print("Text files:")
print(txt_files)
When File Names Contain a Specific String
To determine whether a file name contains a specific string, use the in operator.
import os
# Get files whose names contain 'report'
folder_path = "sample_folder"
report_files = [f for f in os.listdir(folder_path) if "report" in f]
print("Files containing 'report':")
print(report_files)
Advanced Filtering Using Regular Expressions
Using regular expressions allows you to filter files with more flexible conditions. For example, it’s useful when searching for specific patterns in file names.
import os
import re
# Get files whose names consist only of digits
folder_path = "sample_folder"
pattern = re.compile(r"^\d+$")  # \d+ matches one or more digits
files = [f for f in os.listdir(folder_path) if pattern.match(f)]
print("File names consisting only of digits:")
print(files)
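A digits-only pattern applied to the whole name will not match a file such as 001.txt, because the extension is part of the name. If that is what you want to catch, one option is to test only the stem. The sketch below works on a made-up list of names and assumes that "numeric" means the part before the last extension consists solely of digits.

```python
import re
from pathlib import PurePosixPath

names = ["001.txt", "2024.csv", "report.txt", "12a.txt", "42"]
digits_only = re.compile(r"\d+")

# fullmatch() requires the entire stem to match, so no ^ or $ is needed
numeric = [n for n in names if digits_only.fullmatch(PurePosixPath(n).stem)]
print(numeric)  # ['001.txt', '2024.csv', '42']
```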
Filtering by File Size
If you want to filter by file size, use os.path.getsize().
import os
# Get files that are 1 MB or larger
folder_path = "sample_folder"
large_files = [f for f in os.listdir(folder_path) if os.path.getsize(os.path.join(folder_path, f)) >= 1 * 1024 * 1024]
print("Files 1 MB or larger:")
print(large_files)
Practical Examples
1. Group by extension
To group files by extension, you can write it like this.
import os
# Categorize files by extension
folder_path = "sample_folder"
files_by_extension = {}
for f in os.listdir(folder_path):
    ext = os.path.splitext(f)[1]  # Get the extension
    if ext not in files_by_extension:
        files_by_extension[ext] = []
    files_by_extension[ext].append(f)
print("Grouped by extension:")
print(files_by_extension)
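The same grouping can be written with collections.defaultdict, which removes the need to check whether a key already exists. In this sketch the file names are made up, extensions are lower-cased so that .TXT and .txt land in the same group, and the "(no extension)" label is an arbitrary choice.

```python
from collections import defaultdict
from pathlib import PurePosixPath

names = ["a.txt", "B.TXT", "c.csv", "README"]
groups = defaultdict(list)
for name in names:
    ext = PurePosixPath(name).suffix.lower() or "(no extension)"
    groups[ext].append(name)

print(dict(groups))
# {'.txt': ['a.txt', 'B.TXT'], '.csv': ['c.csv'], '(no extension)': ['README']}
```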
2. Filter by date
To filter by a file’s modification timestamp, use os.path.getmtime().
import os
import time
# Get files updated within the past week
folder_path = "sample_folder"
one_week_ago = time.time() - 7 * 24 * 60 * 60
recent_files = [f for f in os.listdir(folder_path) if os.path.getmtime(os.path.join(folder_path, f)) > one_week_ago]
print("Recently updated files:")
print(recent_files)
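The same date filter can be expressed with pathlib, where stat().st_mtime holds the modification time. The sketch below is self-contained: it creates two files in a temporary folder and backdates one with os.utime() so the filter has something to exclude. The helper name modified_within is hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path


def modified_within(folder, days):
    """Return sorted names of files in `folder` modified in the last `days` days."""
    cutoff = time.time() - days * 24 * 60 * 60
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.stat().st_mtime > cutoff
    )


with tempfile.TemporaryDirectory() as demo_dir:
    new_file = Path(demo_dir, "new.txt")
    old_file = Path(demo_dir, "old.txt")
    new_file.touch()
    old_file.touch()
    # Backdate old.txt by 30 days (access time, modification time)
    thirty_days_ago = time.time() - 30 * 24 * 60 * 60
    os.utime(old_file, (thirty_days_ago, thirty_days_ago))
    recent = modified_within(demo_dir, days=7)

print(recent)  # ['new.txt']
```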
Use Cases and Convenience of Each Method
| Filter Condition | Usage | Purpose |
|---|---|---|
| File extension | f.endswith(".txt") | Classify by file type |
| Specific substring | "keyword" in f | Find files related to a specific purpose |
| Regular expressions | re.match(pattern, f) | Search for files with complex name patterns |
| File size | os.path.getsize() | Detect large files |
| Modification date/time | os.path.getmtime() | Find recently used files |
7. Practical Examples: How to Use the Retrieved File Name List
How you use the retrieved file name list varies widely depending on your needs and goals. In this section, we’ll introduce common use cases and several concrete code examples.
Batch Processing of File Contents
This example uses the retrieved file list to read and process the contents of each file.
Example: Read the contents of text files in bulk
import os
# Get text files in the folder
folder_path = "sample_folder"
text_files = [f for f in os.listdir(folder_path) if f.endswith(".txt")]
# Read all file contents in bulk
all_content = ""
for file in text_files:
    with open(os.path.join(folder_path, file), "r", encoding="utf-8") as f:
        all_content += f.read() + "\n"
print("Contents of all text files:")
print(all_content)
Renaming Files
You can use the retrieved file name list to rename files in bulk.
Example: Add a prefix to file names
import os
# Get files in the folder
folder_path = "sample_folder"
files = os.listdir(folder_path)
# Add a prefix to file names
for file in files:
    old_path = os.path.join(folder_path, file)
    new_path = os.path.join(folder_path, f"new_{file}")
    os.rename(old_path, new_path)
print("Renamed the files.")
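A bulk rename like the one above also renames subfolders and adds the prefix again each time it is run. The following sketch shows one way to guard against that by building a rename plan first. The helper name plan_prefix_renames is hypothetical, and the rules it applies (skip directories, skip names that already carry the prefix, never overwrite) are assumptions about what a careful script would want.

```python
import os
import tempfile


def plan_prefix_renames(folder_path, prefix):
    """Return (old_path, new_path) pairs for files that still need the prefix."""
    plan = []
    for name in sorted(os.listdir(folder_path)):
        old_path = os.path.join(folder_path, name)
        new_path = os.path.join(folder_path, prefix + name)
        if not os.path.isfile(old_path) or name.startswith(prefix):
            continue  # skip directories and files that already have the prefix
        if os.path.exists(new_path):
            continue  # never overwrite an existing file
        plan.append((old_path, new_path))
    return plan


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.txt", "new_b.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    os.mkdir(os.path.join(demo_dir, "sub"))
    plan = plan_prefix_renames(demo_dir, "new_")
    for old_path, new_path in plan:
        os.rename(old_path, new_path)
    renamed = sorted(os.listdir(demo_dir))

print(renamed)  # ['new_a.txt', 'new_b.txt', 'sub']
```

Printing the plan before calling os.rename() gives you a chance to review the changes first.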
Saving the File List
It’s also useful to save the retrieved file name list to an external file (a text file or a CSV file) so you can review it later.
Example: Save the file name list to a text file
import os
# Get files in the folder
folder_path = "sample_folder"
files = os.listdir(folder_path)
# Save file names to a text file
with open("file_list.txt", "w", encoding="utf-8") as f:
    for file in files:
        f.write(file + "\n")
print("Saved the file name list.")
Example: Save the file name list as CSV
import os
import csv
# Get files in the folder
folder_path = "sample_folder"
files = os.listdir(folder_path)
# Save file names to a CSV
with open("file_list.csv", "w", encoding="utf-8", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["File Name"])  # Add header
    for file in files:
        writer.writerow([file])
print("Saved the file name list to CSV.")
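If the list is meant for later review, it can help to store the size and modification time next to each name. The sketch below is one possible layout, not a required one: the helper write_file_report and the column names are assumptions, and the demo writes the report outside the folder being listed so that the report does not list itself.

```python
import csv
import os
import tempfile
from datetime import datetime


def write_file_report(folder_path, csv_path):
    """Write name, size in bytes, and modification time of each file to a CSV."""
    with open(csv_path, "w", encoding="utf-8", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["File Name", "Size (bytes)", "Modified"])
        with os.scandir(folder_path) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_file():
                    info = entry.stat()
                    modified = datetime.fromtimestamp(info.st_mtime).isoformat(timespec="seconds")
                    writer.writerow([entry.name, info.st_size, modified])


with tempfile.TemporaryDirectory() as demo_dir:
    data_dir = os.path.join(demo_dir, "data")
    os.mkdir(data_dir)
    with open(os.path.join(data_dir, "a.txt"), "w", encoding="utf-8") as f:
        f.write("hello")
    report_path = os.path.join(demo_dir, "file_list.csv")
    write_file_report(data_dir, report_path)
    with open(report_path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f))

print(rows[0])      # ['File Name', 'Size (bytes)', 'Modified']
print(rows[1][:2])  # ['a.txt', '5']
```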
Creating File Backups
This is an example of using the file name list to create backups of files in a specified folder.
import os
import shutil
# Get files in the folder
source_folder = "sample_folder"
backup_folder = "backup_folder"
os.makedirs(backup_folder, exist_ok=True)
files = os.listdir(source_folder)
# Copy files to the backup folder
for file in files:
    source_path = os.path.join(source_folder, file)
    if os.path.isfile(source_path):  # shutil.copy() cannot copy directories
        shutil.copy(source_path, os.path.join(backup_folder, file))
print("Backup created.")
Restricting Processing to Specific Files
Use the retrieved file list to execute processing only on files that meet specific conditions.
Example: Delete files with a specific extension
import os
# Delete files in the folder with a specific extension
folder_path = "sample_folder"
files = [f for f in os.listdir(folder_path) if f.endswith(".tmp")]
for file in files:
    os.remove(os.path.join(folder_path, file))
print("Deleted unnecessary files.")
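Since deletion cannot be undone, it is often worth previewing what would be removed first. The sketch below adds a dry-run switch; the helper name delete_by_extension and the dry_run parameter are hypothetical, and the demo runs in a temporary folder.

```python
import os
import tempfile


def delete_by_extension(folder_path, extension, dry_run=True):
    """Delete files ending with `extension`; with dry_run=True only report them."""
    targets = [
        os.path.join(folder_path, name)
        for name in sorted(os.listdir(folder_path))
        if name.endswith(extension)
        and os.path.isfile(os.path.join(folder_path, name))
    ]
    for path in targets:
        if dry_run:
            print(f"Would delete: {path}")
        else:
            os.remove(path)
    return targets


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("cache.tmp", "keep.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    delete_by_extension(demo_dir, ".tmp")                 # preview only
    after_preview = sorted(os.listdir(demo_dir))
    delete_by_extension(demo_dir, ".tmp", dry_run=False)  # actually delete
    after_delete = sorted(os.listdir(demo_dir))

print(after_preview)  # ['cache.tmp', 'keep.txt']
print(after_delete)   # ['keep.txt']
```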
Advanced Scenarios
Data collection: Gather many CSV files and consolidate their data.
Log management: Identify and delete old log files.
Image processing: Resize or convert images in a specific folder.
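As an illustration of the data collection scenario, the sketch below concatenates every CSV file in a folder into one file using only the standard library. It assumes that all input files share the same header row and that the output file lives outside the input folder. The helper name merge_csv_files and the sample data are made up.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_files(folder, output_path):
    """Concatenate rows from every CSV in `folder`, writing the header once."""
    header_written = False
    row_count = 0
    with open(output_path, "w", encoding="utf-8", newline="") as out:
        writer = csv.writer(out)
        for csv_path in sorted(Path(folder).glob("*.csv")):
            with open(csv_path, encoding="utf-8", newline="") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    row_count += 1
    return row_count


with tempfile.TemporaryDirectory() as demo_dir:
    data_dir = Path(demo_dir, "data")
    data_dir.mkdir()
    (data_dir / "jan.csv").write_text("name,amount\nA,1\n", encoding="utf-8")
    (data_dir / "feb.csv").write_text("name,amount\nB,2\n", encoding="utf-8")
    merged_path = Path(demo_dir, "merged.csv")
    row_count = merge_csv_files(data_dir, merged_path)
    with open(merged_path, encoding="utf-8", newline="") as f:
        merged_rows = list(csv.reader(f))

print(row_count)    # 2
print(merged_rows)  # [['name', 'amount'], ['B', '2'], ['A', '1']]
```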
8. Troubleshooting: Tips for Resolving Errors
When performing file operations or retrieving filenames within folders, various errors can occur. This section explains common errors and how to resolve them.
Error 1: Folder Not Found
When it occurs
If the specified folder does not exist, FileNotFoundError is raised.
Causes
The folder path is incorrect.
The folder has been deleted.
How to fix
Check whether the folder exists using os.path.exists() or pathlib.Path.exists().
Example: Code to check if a folder exists
import os
folder_path = "sample_folder"
if not os.path.exists(folder_path):
    print(f"Error: Folder not found ({folder_path})")
else:
    print("The folder exists.")
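An existence check can still be followed by a failure if the folder is removed between the check and the listing, or if the path points to a file rather than a folder. An alternative is to attempt the operation and catch the exception. In the sketch below, safe_listdir is a hypothetical helper, and returning an empty list on failure is just one possible policy.

```python
import os
import tempfile


def safe_listdir(folder_path):
    """Return the entries in folder_path, or an empty list if it cannot be listed."""
    try:
        return os.listdir(folder_path)
    except FileNotFoundError:
        print(f"Error: Folder not found ({folder_path})")
    except NotADirectoryError:
        print(f"Error: Not a folder ({folder_path})")
    return []


with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, "a.txt")
    open(file_path, "w").close()
    existing = safe_listdir(demo_dir)
    missing = safe_listdir(os.path.join(demo_dir, "missing"))
    not_a_folder = safe_listdir(file_path)

print(existing)      # ['a.txt']
print(missing)       # []
print(not_a_folder)  # []
```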
Error 2: Permission Error During File Operations
When it occurs
When attempting to read or write a file, PermissionError is raised.
Causes
No access permissions for the file or folder.
The file is locked by another process.
How to fix
Check and correct access permissions.
Ensure the file is not in use.
Example: Handling access permission errors
import os
file_path = "sample_folder/sample_file.txt"
try:
    with open(file_path, "r", encoding="utf-8") as f:
        content = f.read()
    print(content)
except PermissionError:
    print(f"Error: You don't have permission to access the file ({file_path})")
Error 3: File Path Too Long
When it occurs
On Windows, errors can occur when a path exceeds 260 characters.
How to fix
Enable the Windows setting that supports long paths.
Shorten file or folder names.
Example: Extracting just the file name from a long path
import os
# Extract the file name; this shortens what is displayed, not the path the OS must resolve
long_path = "a/very/long/path/to/a/folder/with/a/long/file_name.txt"
short_path = os.path.basename(long_path)
print(f"File name only: {short_path}")
Error 4: Handling Filenames with Special Characters
When it occurs
Errors can occur if a filename contains special characters (for example, spaces, special symbols, or non-ASCII characters).
How to fix
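For filenames with special characters, the following sketch illustrates two precautions; they are suggestions, not steps taken from the original text. First, building paths with pathlib (or os.path.join) handles spaces and non-ASCII names without manual quoting. Second, characters such as [ and ] have a special meaning in glob patterns, so a literal name should be passed through glob.escape(). The file names are made up for the demonstration.

```python
import glob
import os
import tempfile
from pathlib import Path

tricky_name = "[draft] notes.txt"

with tempfile.TemporaryDirectory() as demo_dir:
    folder = Path(demo_dir)
    for name in ("my report (final).txt", "日本語.txt", tricky_name):
        (folder / name).write_text("ok", encoding="utf-8")

    # Listing and reading work as usual for names with spaces or non-ASCII text
    all_names = [p.name for p in folder.iterdir()]

    base = glob.escape(demo_dir)
    # Unescaped, "[draft]" is read as a character class, so nothing matches
    unescaped = glob.glob(os.path.join(base, tricky_name))
    # Escaped, the brackets are matched literally
    escaped = glob.glob(os.path.join(base, glob.escape(tricky_name)))

print(len(all_names))  # 3
print(unescaped)       # []
print(len(escaped))    # 1
```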
Error 5: Out of Memory on Large Directories
When it occurs
When processing very large directories, you may run out of memory and encounter errors.
How to fix
Process files in smaller batches instead of all at once.
Use generators when generating file lists.
Example: Processing using a generator
import os
def get_files(folder_path):
    for root, _, files in os.walk(folder_path):
        for file in files:
            yield os.path.join(root, file)

folder_path = "sample_folder"
for file in get_files(folder_path):
    print(f"Processing: {file}")
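A generator also combines well with batching: itertools.islice() can pull a fixed number of items at a time, so only one batch is held in memory. The sketch below uses a hypothetical helper named in_batches and a made-up stream of file names in place of a real directory walk.

```python
from itertools import islice


def in_batches(iterable, batch_size):
    """Yield lists of up to batch_size items taken from iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


file_stream = (f"file{i}.txt" for i in range(5))  # stands in for a file generator
batches = list(in_batches(file_stream, 2))
print(batches)
# [['file0.txt', 'file1.txt'], ['file2.txt', 'file3.txt'], ['file4.txt']]
```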
Error 6: File Is Locked
When it occurs
A specific application may have the file open, preventing deletion or editing.
How to fix
Identify and terminate the process using the file.
Wait until the file is released.
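Waiting for a file to be released can be automated with a small retry loop. This is a sketch under the assumption that the lock surfaces as PermissionError, which is typical on Windows; other platforms may behave differently. The helper name remove_with_retry and its default values are arbitrary choices.

```python
import os
import tempfile
import time


def remove_with_retry(path, attempts=3, wait_seconds=1.0):
    """Try to delete path, waiting between attempts if it is temporarily locked."""
    for attempt in range(1, attempts + 1):
        try:
            os.remove(path)
            return True
        except PermissionError:
            if attempt == attempts:
                return False
            time.sleep(wait_seconds)
    return False


with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "old.log")
    open(target, "w").close()
    removed = remove_with_retry(target, attempts=3, wait_seconds=0.1)
    still_there = os.path.exists(target)

print(removed)      # True
print(still_there)  # False
```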
Error 7: UnicodeDecodeError
When it occurs
Raised when a file is read with an encoding that does not match the file’s actual encoding.
How to fix
Open the file with an explicit encoding.
Detect the encoding using the chardet library.
Example: Opening a file with a specified encoding
import os
file_path = "sample_folder/sample_file.txt"
try:
    with open(file_path, "r", encoding="utf-8") as f:
        content = f.read()
    print(content)
except UnicodeDecodeError:
    print(f"Error: Unknown file encoding ({file_path})")
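When you have a shortlist of likely encodings, you can try them in order instead of stopping at the first failure. The sketch below uses only the standard library; the candidate list (utf-8, cp932, latin-1) is an assumption that you should adapt to your data. Because latin-1 can decode any byte sequence, it acts as a last resort and may produce wrong characters rather than an error.

```python
import os
import tempfile


def read_text_with_fallback(file_path, encodings=("utf-8", "cp932", "latin-1")):
    """Return (text, encoding) using the first encoding that decodes the file."""
    for encoding in encodings:
        try:
            with open(file_path, "r", encoding=encoding) as f:
                return f.read(), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"None of the encodings worked for {file_path}")


with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, "legacy.txt")
    with open(file_path, "wb") as f:
        f.write("日本語".encode("cp932"))  # bytes that are not valid UTF-8
    text, used_encoding = read_text_with_fallback(file_path)

print(text)           # 日本語
print(used_encoding)  # cp932
```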
Summary
By using these troubleshooting steps, you can efficiently resolve errors that occur during file operations and improve the reliability of your scripts.
9. Summary
This article explained how to get filenames inside a folder using Python, from basics to practical applications. We reviewed the features and appropriate use cases of each method and summarized which method to choose.
Features of the methods and when to use them
| Method | Features | Use cases |
|---|---|---|
| os module | Part of the standard library and easy to use. | Ideal for processing small folders or basic file retrieval. |
| os module | For recursive searches, use os.walk(). | When you need to operate on files including subdirectories. |
| glob module | Allows flexible pattern searches with wildcards. | When you want to efficiently search by file extension or name patterns. |
| glob module | Recursive searches are possible with recursive=True. | Searching for files in subdirectories that meet specific criteria. |
| pathlib module | Enables modern, object-oriented code. | When you’re using Python 3.4+ and readability is important. |
| pathlib module | You can search files recursively and intuitively with rglob(). | When you want to write concise folder operations including subdirectories. |
Recap of usage examples
Creating a file list: Use the os module and the glob module to list all files in a folder.
Filtering by specific conditions: Use extensions, names, or regular expressions to select only the files you need.
Batch processing: Use the obtained file list to streamline reading and writing of file contents.
Advanced operations: Renaming, creating backups, deleting unnecessary files, etc.
Importance of troubleshooting
We also touched on common errors that can occur when performing file operations. Being mindful of the following can improve script reliability:
Check for the existence of files and folders in advance.
Consider access permissions and handling of special characters.
Be aware of memory shortages and performance issues when handling large numbers of files.
Benefits of mastering filename retrieval in Python
Retrieving filenames in folders with Python is useful in many scenarios. In particular, it offers the following advantages.
Improved efficiency: Automating routine tasks and data organization.
Flexibility: Ability to operate on files under various conditions.
Scalability: Can handle datasets from small to large scale.
Finally
Python offers a variety of modules and features. By appropriately choosing among the modules introduced here — os, glob, and pathlib — you can maximize efficiency and accuracy in file operations. I hope this article helps beginners and intermediate users improve their Python file-handling skills. Try applying these techniques in real projects or at work to experience how useful Python can be!