DEV Community

Cover image for Python Automation 02 - Extract Discography to "Find Most Recent Versions"
Elliot Mangini
Elliot Mangini

Posted on

Python Automation 02 - Extract Discography to "Find Most Recent Versions"

How to Find and Copy Newest Version of Files to a New Folder

Have you ever had a cluttered music production folder with dozens of song folders, each containing multiple versions and bounces of the same song? It can be difficult and time-consuming to sift through all those files and determine which is the most recent version. That's where this script comes in handy!

This script will scan through your current directory and its subdirectories, find the newest audio bounce of each song, and copy it to a new folder called "Current Discography" within the current directory. You can use this for files of any type (maybe with a little adjusting), so it's something like a productivity version management tool.

Here is the script on my GitHub as well as the full sript here which I'm calling "Extract Discography":

import shutil

import os
current_directory = os.getcwd()  # => this is a string for our current directory

import pathlib
cwd_as_obj = pathlib.Path(current_directory)  # => turn that string into a path obj
items = list(cwd_as_obj.iterdir())  # => list the paths inside our cwd

folder_count = 0
bounce_count = 0
found_count = 0

new_folder_path = os.path.join(current_directory, "Current Discography")
os.makedirs(new_folder_path, exist_ok=True)

for item in items:  # loop through the scripts siblings
    if item.is_dir():  # if this is a folder
        folder_count = folder_count + 1  # count it

        subitems = list(item.iterdir())

        most_recent_bounce = None
        latest_time = 0

        for subitem in subitems:  # looping through each item in each song folder
            # print(subitem)
            if subitem.is_file() and str(subitem).endswith((".mp3", ".wav", ".flac", ".ogg")):
                bounce_count = bounce_count + 1
                bounce_time = float(os.stat(subitem).st_birthtime)
                # print(f"current item {subitem} created {bounce_time}")
                if bounce_time > latest_time:
                    # print(f"replacing file with time {latest_time} with {bounce_time}")
                    latest_time = bounce_time
                    most_recent_bounce = subitem

        if most_recent_bounce is not None:
            print(f"\u2705 Found: {most_recent_bounce.name}")
            shutil.copy2(most_recent_bounce, new_folder_path)
            found_count = found_count + 1
        else:
            print(f"\U0001F6A8\U0001F6A8\U0001F6A8 No bounces found for: {subitem.name} \U0001F6A8\U0001F6A8\U0001F6A8.")

print(f"{folder_count} Song Folders found in this directory.")
print(f"{bounce_count} Audio Bounces found within subfolders.")
print(f"{found_count} Newest Bounces copied to {new_folder_path}.")
Enter fullscreen mode Exit fullscreen mode

Let's take a look at how it works.

Importing Libraries

import shutil
import os
import pathlib
Enter fullscreen mode Exit fullscreen mode

We start by importing the necessary libraries: shutil, os, and pathlib.

Getting the Current Directory and its Contents

current_directory = os.getcwd()
cwd_as_obj = pathlib.Path(current_directory)
items = list(cwd_as_obj.iterdir())
Enter fullscreen mode Exit fullscreen mode

Next, we use the os library to get the current working directory as a string. We then convert this string into a pathlib.Path object so we can easily manipulate it. Finally, we list all the contents of the current directory and its subdirectories using cwd_as_obj.iterdir().

Initializing Counters and Creating the New Folder

folder_count = 0
bounce_count = 0
found_count = 0

new_folder_path = os.path.join(current_directory, "Current Discography")
os.makedirs(new_folder_path, exist_ok=True)
Enter fullscreen mode Exit fullscreen mode

We initialize three counters: folder_count to count the number of song folders found, bounce_count to count the number of audio bounces found, and found_count to count the number of newest bounces copied to the new folder.

We then create a new folder called "Current Discography" within the current directory using os.makedirs(new_folder_path, exist_ok=True).

Looping through the Song Folders

Here is where the magic happens. We loop through each item in the items list, which contains all the contents of the current directory and its subdirectories. If an item is a directory, we increment the folder_count counter and create a subitems list containing all the contents of that directory.

for item in items:
    if item.is_dir():
        folder_count += 1
        subitems = list(item.iterdir())
Enter fullscreen mode Exit fullscreen mode

Next, we loop through each item in the subitems list. If the item is a file and its extension is one of .mp3, .wav, .flac, .ogg, we increment the bounce_count counter and get the birth time of the file using os.stat(subitem).st_birthtime. We then compare the birth time of the current file with the latest birth time we have seen so far, and if it is greater, we update most_recent_bounce with the current file and latest_time with the current birth time.

for subitem in subitems:
    if subitem.is_file() and str(subitem).endswith((".mp3", ".wav", ".flac", ".ogg")):
        bounce_count += 1
        bounce_time = float(os.stat(subitem).st_birthtime)
        if bounce_time > latest_time:
            latest_time = bounce_time
            most_recent_bounce = subitem
Enter fullscreen mode Exit fullscreen mode

If we found a file to copy, we print a success message, copy the file to a new directory named Current Discography located in the current directory, and increment found_count.

if most_recent_bounce is not None:
    print(f"\u2705 Found: {most_recent_bounce.name}")
    shutil.copy2(most_recent_bounce, new_folder_path)
    found_count += 1
Enter fullscreen mode Exit fullscreen mode

If we did not find a file to copy, we print a warning message.

else:
    print(f"\U0001F6A8\U0001F6A8\U0001F6A8 No bounces found for: {subitem.name} \U0001F6A8\U0001F6A8\U0001F6A8.")
Enter fullscreen mode Exit fullscreen mode

Finally, we print a nice little report with the total number of song folders found, the total number of audio bounces found, and the total number of newest bounces copied to the Current Discography directory.

print(f"{folder_count} Song Folders found in this directory.")
print(f"{bounce_count} Audio Bounces found within subfolders.")
print(f"{found_count} Newest Bounces copied to {new_folder_path}.")
Enter fullscreen mode Exit fullscreen mode

Using the script I was able to go from this:

Screenshot of raw folder structure.

Where I have thousands of audio files to look through...

to this:

Screenshot of Current Discography Folder

With our nice little report here:

Screenshot of Terminal Output

Challenges

One of the challenges in implementing this script was finding a reliable way to get the creation time of a file. There are several ways to do this using Python, such as using the os.stat() function or the pathlib.Path.stat() method. However, the accuracy of these methods can vary depending on the platform and file system. Therefore, testing was necessary to ensure that we were indeed getting the most recent file when running the script. By using the st_birthtime attribute of os.stat(). In my case this was how I was able to accurately get the creation time of a file and find the most recently created audio file in each subdirectory.

Let me know if you try this out or have any thoughts or ideas,
Cheers,
-Elliot

Top comments (0)