Python Scripting for File Management
Automation is central to Cloud, DevOps, and Security Engineering roles. A major part of it is building scripts that make repetitive tasks more efficient.
After being given the task to build a script that allows the user to extract information about files in a directory, I figured, why not share? Whether you’re just starting out in Python or you’ve been coding for a while, this is for you.
🪜I’m going to show you two versions of the script. First, a basic script that lists the names and sizes of the files in the current directory. Then, I’ll tweak it into a more advanced version that accepts a directory path and also lists the files in its subdirectories.
1️⃣Basic Script — Listing Files in Your Current Directory
2️⃣Advanced Script — Custom Paths and Recursive File Listing
🏴Prerequisites:
🔸Basic knowledge of Python: You don’t need to be a pro, but knowing the basics of Python will help a lot.
🔸Text editor or IDE along with Python installed on your computer: This is where and how we’ll write and test our script.
🔸Basic understanding of file systems: Have a general idea of how files and directories are organized on your computer.
These scripts are useful in many situations, including:
✅Resource Allocation: Understanding file sizes and distributions can help in allocating resources better and managing costs, especially in cloud environments where storage can incur costs.
✅Data Management and Optimization: In cloud and IT environments, managing data efficiently is key to optimizing performance and reducing costs. Being able to extract file information such as the sizes and types of files allows organizations to better identify redundant or obsolete data to help optimize storage utilization.
✅Inventory and Asset Management: Using scripting to automate the process of cataloging files across multiple systems aids in effective asset management.
The following scripts are designed for simplicity, making them accessible to everyone.
I’ll walk you through how these scripts can make your daily work as a Cloud Engineer more efficient!
We’ll need to start by importing the os Module into our scripts.
Note: The os Module provides a portable way of using operating system dependent functionality. Details about the os Module are found here: https://docs.python.org/3/library/os.html.
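To make that a little more concrete, here is a minimal sketch of the handful of os calls the scripts in this post rely on. The describe_entry helper is a hypothetical name I made up for illustration; the os functions it calls are part of the standard library.

```python
import os

def describe_entry(path):
    """Return a short description of a single path (hypothetical helper)."""
    if os.path.isfile(path):
        # getsize() reports the size in bytes
        return f"{path}: file, {os.path.getsize(path)} bytes"
    if os.path.isdir(path):
        return f"{path}: directory"
    return f"{path}: not found or not a regular file"

if __name__ == "__main__":
    print("Working directory:", os.getcwd())
    # listdir() returns names only, so join each one with its directory
    for name in os.listdir('.'):
        print(describe_entry(os.path.join('.', name)))
```

Both scripts below are built from these same pieces: list the entries, check which ones are files, and ask for their sizes.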
Part 1:
1️⃣Basic Script
Let’s start with the basics. This script lists the files in the current directory and displays their names and sizes.
import os

# Define a new function that initializes a list to hold file information
def get_files_info():
    files_info = []
    # Loop through each entry in the current directory
    for filename in os.listdir('.'):
        # Check if the entry is a file
        if os.path.isfile(filename):
            # Create a dictionary with file name and size (in bytes)
            file_info = {'name': filename, 'size': os.path.getsize(filename)}
            # Append the file info to the list
            files_info.append(file_info)
    return files_info

# Execute the function and print the results
file_list = get_files_info()
print(file_list)
🔸Import os Module: Allows us to interact with the operating system.
🔸Define get_files_info() Function: Retrieves file information.
🔸Loop Through Files: Identifies each file in the current directory.
🔸Create Dictionary: Stores file name and size.
🔸Print Results: Displays the list of file information.
Once the script and target files are placed within the current working directory, you can run the script with the following command:
python script_name.py
Running the script produces output similar to the screenshot below, with results based on the files in the directory. There were two files in my directory.
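The raw list of dictionaries that the basic script prints can be hard to scan once a directory holds more than a few files. Here is a minimal sketch of one way to print it as a report, largest file first. The format_size and print_report helpers are hypothetical names, not part of the original script, and the sample data is made up.

```python
def format_size(num_bytes):
    """Convert a byte count to a readable string (hypothetical helper).

    Uses steps of 1024 bytes per unit.
    """
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"

def print_report(files_info):
    """Print one line per file, largest first."""
    for info in sorted(files_info, key=lambda f: f["size"], reverse=True):
        print(f"{info['name']:<40} {format_size(info['size'])}")

if __name__ == "__main__":
    # Sample data shaped like the output of get_files_info(); values are made up
    sample = [
        {"name": "notes.txt", "size": 512},
        {"name": "backup.zip", "size": 3 * 1024 * 1024},
    ]
    print_report(sample)
```

To use it with the basic script, you could pass the list returned by get_files_info() to print_report() instead of printing the list directly.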
Part 2:
2️⃣Advanced Script
Next, I enhanced the script to accept a directory path and recursively list files in all subdirectories. Once executed, this script will prompt the user for a directory path.
import os

def get_files_info_recursive(path='.'):
    # Initialize a list to store file information
    files_info = []
    # Check if the given path is a valid directory
    if not os.path.isdir(path):
        print(f"The path {path} is not a valid directory.")
        return files_info
    # Recursively walk through the directory
    for root, dirs, files in os.walk(path):
        # Iterate over each file in the directories
        for name in files:
            # Join the directory path and file name
            filepath = os.path.join(root, name)
            # Create a dictionary with file path and size (in bytes)
            file_info = {'name': filepath, 'size': os.path.getsize(filepath)}
            # Append the file info to the list
            files_info.append(file_info)
    return files_info

# Input the path or use the current directory as default
input_path = input("Enter the directory path (leave blank for current directory): ").strip()
input_path = input_path or '.'

# Execute the function and print the results
file_list = get_files_info_recursive(input_path)
for file in file_list:
    print(file)
🔸get_files_info_recursive() Function: Accepts a specified directory path.
🔸Use os.walk(): Traverses directories recursively.
🔸File Path Creation: Combines directory path and file name.
🔸Dictionary and List Creation: Similar to the basic script, but includes file paths.
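As a side note on os.walk(), it can help to see exactly what it hands back on each step of the loop. Below is a small sketch that builds a throwaway directory tree with tempfile and records every (root, dirs, files) tuple. The list_walk_steps helper and the file names are hypothetical, chosen only for this demonstration.

```python
import os
import tempfile

def list_walk_steps(path):
    """Collect the (root, dirs, files) tuples that os.walk() yields."""
    steps = []
    for root, dirs, files in os.walk(path):
        # Sorting dirs in place also fixes the order in which os.walk() descends
        dirs.sort()
        # Store root relative to the starting path to keep the output short
        steps.append((os.path.relpath(root, path), list(dirs), sorted(files)))
    return steps

if __name__ == "__main__":
    # Build a temporary tree so nothing else on disk is touched
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "logs", "archive"))
        for rel in ("readme.txt", os.path.join("logs", "app.log")):
            with open(os.path.join(tmp, rel), "w") as fh:
                fh.write("demo")
        for step in list_walk_steps(tmp):
            print(step)
```

Each tuple covers one directory: its path, the subdirectories inside it, and the files inside it. That is why the advanced script joins root and name to get the full path of every file.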
The screenshot below shows how the script is executed and the response returned:
Note: If the script is not functioning as expected, there might be issues related to permissions or the specific environment you’re running the script in. Double-check that the path you are using is valid and ensure that your Python environment has the necessary permissions to access and read the directory and its contents.
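If permissions turn out to be the problem, one option is to skip unreadable entries instead of letting the script stop. The sketch below is a variation on the advanced script, not the version pushed to GitHub. The get_files_info_safe name and the skipped list are my own assumptions for illustration. It relies on the onerror parameter of os.walk() and on OSError, the parent class of PermissionError.

```python
import os

def get_files_info_safe(path='.'):
    """Like the recursive script, but skips entries that cannot be read."""
    files_info = []
    skipped = []

    def remember_error(err):
        # os.walk() calls this with an OSError when a directory cannot be listed
        skipped.append(err.filename)

    for root, dirs, files in os.walk(path, onerror=remember_error):
        for name in files:
            filepath = os.path.join(root, name)
            try:
                size = os.path.getsize(filepath)
            except OSError:
                # Covers PermissionError, broken symlinks, and files removed mid-scan
                skipped.append(filepath)
                continue
            files_info.append({'name': filepath, 'size': size})
    return files_info, skipped

if __name__ == "__main__":
    found, skipped = get_files_info_safe('.')
    print(f"Read {len(found)} files, skipped {len(skipped)} entries")
```

Returning the skipped paths alongside the results keeps the failures visible, so you can decide whether to fix permissions or ignore those entries.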
And there you have it — simple yet effective Python scripting to extract file information.
You can experiment and modify the scripts for your individual needs. The scripts have been pushed to GitHub and can be accessed here: