bkane1

Posted on Dec 16, 2020

Building an endless Spotify playlist of the greatest albums of all time

#showdev #tutorial #python

Want a playlist of great music that never ends?

Like most people, I love listening to music. I'm not a music buff by any means, but since work became remote for many of us, I spend a large portion of my days listening to music. I have Spotify, which I love, but sometimes the sheer volume of music on the platform (50 million songs total) can be overwhelming. There are too many songs I want to listen to, too many recommendations to keep up with, too many "great" albums that I still haven't sat down and listened to all the way through.

This became clearer to me when I saw Rolling Stone magazine's latest 500 Greatest Albums of All Time list and realized I'd listened to only a tiny percentage of the supposedly "great" albums all the way through. Rolling Stone determined "greatness" by aggregating the survey results of some 300 musicians, journalists, producers and music executives (including Beyonce, Taylor Swift, and Billie Eilish), who each submitted their top 50 albums. If you're worried the list has a bias for recent music given the contemporary musicians polled, don't be. The 2010s were the second-least represented decade, behind only the 1950s.

The now infamous list is endlessly polarizing and, of course, imperfect (how can you really compare the Beatles and Kendrick Lamar in the same list?). But if enough music industry titans voted for an album for it to make the list, it's probably worth listening to, right? So why not listen to all 500 albums all the way through, start to finish?

Problem

But it's not that simple to listen to 500 albums and the 7,000 songs comprising them. It'd be too tedious to add all 500 albums one-by-one to your Spotify library (or other streaming service library). There are public playlists on Spotify that have all the albums from the list, but keeping track of where you are in a playlist that big seems impossible. I don't want to be tempted to skip around and end up overlooking any. In fact, I actually want to be forced to listen to every album all the way through—even the ones I'm not in love with right away. I just want a queue of the greatest albums that I can only progress through by listening to each song on each album. In short, I want an endless playlist.

Solution (using Python and the Spotify Web API)

So with Python and the Spotify Web API at my disposal, I decided to make my own Spotify playlist off the Rolling Stone's Greatest Albums list that automatically updates when I listen to a song from it, deleting the ones I listened to and adding new ones automatically. You can think of it as a giant conveyor belt of songs that you can only see a small part of, and that only turns when you listen to it. Every hour, it takes the songs I listened to (if any), deletes them, and adds the next songs up from the next album. I also randomized the albums (but kept the songs within each album in order) so I never know what's coming next and have no expectations. Best of all, if I listen to other music for an hour, I can come back to my endless playlist and know exactly where I left off: the first song. Here's how I did it, and how you can too (disclaimer: you must have Spotify Premium to do this).

Connecting to the Spotify API

First you have to register an app with Spotify in order to use the Spotify Web API. I prefer Python, so I use the Spotipy Python library to access the API. Once you've registered your app and added a redirect URI, you can start playing with the API! Here's some sample code to authorize your credentials. Just replace the client credentials below with the ones generated by your own app. A new window will open asking if you want to give the app access to your Spotify account. After you click "Accept", copy and paste the resulting URL where it prompts for it. Try it out in a Jupyter Notebook or your preferred IDE.

import spotipy
import pandas as pd
import numpy as np
import requests
from spotipy import oauth2
import re

SPOTIPY_CLIENT_ID = 'YOUR CLIENT ID'
SPOTIPY_CLIENT_SECRET = 'YOUR CLIENT SECRET'
SCOPE = ('user-read-recently-played,user-library-read,user-read-currently-playing,playlist-read-private,playlist-modify-private,playlist-modify-public,user-read-email,user-modify-playback-state,user-read-private,user-read-playback-state')
SPOTIPY_REDIRECT_URI = 'YOUR REDIRECT URI'
sp_oauth = oauth2.SpotifyOAuth( SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET,SPOTIPY_REDIRECT_URI,scope=SCOPE )

#click "Accept" in your browser when the auth window pops up
code = sp_oauth.get_auth_response(open_browser=True)
token = sp_oauth.get_access_token(code)
refresh_token = token['refresh_token']
sp = spotipy.Spotify(auth=token['access_token'])
username = sp.current_user()['id']
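
If you'd rather not paste your client secret into a notebook, one option is to read the credentials from environment variables. Here's a minimal sketch of that idea. The helper name load_spotify_credentials is my own (hypothetical), and it assumes you've exported SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET and SPOTIPY_REDIRECT_URI in your shell before starting Python.

```python
import os

REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")


def load_spotify_credentials(env=None):
    """Return the three app credentials from the environment, or raise if any is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both are unusable
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

You would then pass creds['SPOTIPY_CLIENT_ID'] and the other two values to oauth2.SpotifyOAuth() in place of the hard-coded strings.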

Making your endless playlist

Once you're authorized and the API is working, you can create your playlist using Spotipy's user_playlist_create() method. I called mine "myEndlessPlaylist1" but feel free to be more creative. You'll also want to save the playlist ID of your newly created playlist for future use (as I did with "pl_id").

#creating your playlist
pl_name = 'myEndlessPlaylist1'
result = sp.user_playlist_create(username,
 name=pl_name)
pl_id = result['id']

Now we need to get a list of the top 500 albums, randomize the order, and initialize the playlist. I found a playlist on Spotify that seemed to have all the albums from the list. The code below collects all songs in that playlist (each of which is a dictionary object with data like name, artist, track ID, and other fields) into a list, and then loops through that list to make a DataFrame with fields like track, artist, album, track_id and album_id. We'll use that resulting DataFrame ("album_df") to populate the endless playlist.

#ID for the public spotify playlist we're getting all the albums from 
top_album_pl = '70n5zfYco8wG777Ua2LlNv'

#top_albums is list we'll use to make the top albums playlist
top_albums = []
offset = 0
while True:
    response = sp.playlist_items(top_album_pl,
                                 offset=offset)

    if len(response['items']) == 0:
        break
    top_albums +=response['items']
    offset = offset + len(response['items'])

#Here we make a DataFrame of all the top albums by looping through that list of response dictionaries
album_df = []
for album in top_albums:

    track = album['track']['name']
    artist = album['track']['artists'][0]['name']
    album_name = album['track']['album']['name']
    track_id = album['track']['id']
    album_id = album['track']['album']['id']
    album_df.append([track,artist,album_name, track_id,album_id])
album_df = pd.DataFrame(album_df, columns =['track','artist','album','track_id','album_id'])
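
The offset loop above is a general pattern for any paginated endpoint, so here's a small sketch of it pulled out into reusable functions. The names fetch_all_items and track_rows are hypothetical, and the fetch_page callback is an assumption: it stands in for a call such as sp.playlist_items(). The sketch also skips items whose "track" value is empty, which I believe can happen for tracks that are no longer available.

```python
def fetch_all_items(fetch_page, page_size=100):
    """Collect every item from a paginated endpoint.

    fetch_page(offset, limit) must return a dict with an 'items' list,
    which is the shape sp.playlist_items() returns.
    """
    items = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)["items"]
        if not page:  # an empty page means we've read everything
            break
        items.extend(page)
        offset += len(page)
    return items


def track_rows(items):
    """Flatten playlist items into row tuples, skipping entries with no track data."""
    rows = []
    for item in items:
        track = item.get("track")
        if not track:
            continue
        rows.append((track["name"], track["artists"][0]["name"],
                     track["album"]["name"], track["id"], track["album"]["id"]))
    return rows
```

With Spotipy, the callback could be lambda offset, limit: sp.playlist_items(top_album_pl, offset=offset, limit=limit), and the rows can go straight into pd.DataFrame() with the same column names as above.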

Randomizing and cleaning your album list

Once you have your DataFrame of all the albums, it's time to randomize the order. I just made an "all albums" DataFrame, gave a random number from 0-1 to each album and then merged it back with the full top albums DataFrame. Then I sorted by that random number ("rand_key" in the code below) and added an index column "idx" which will make it simpler to add tracks from this DataFrame to my endless playlist. There are a few compilations and incomplete albums (due to unavailability of some songs on Spotify), which leads to some one-track albums. You can delete those if you want or comment out that code to keep them. I also cleaned out characters like backslashes and quotes from the data so it will play nice.

#random seed so others can get the same album order as me
np.random.seed(10)

#make a DataFrame of all albums and add a random number between 0-1 "rand_key" for each album
all_albums = album_df.drop_duplicates('album_id').album_id
all_albums = pd.DataFrame(all_albums).reset_index(drop=True)
all_albums['rand_key'] = np.random.rand(len(all_albums))

#merge the albums DataFrame back with the full top albums DataFrame
album_df = pd.merge(album_df, all_albums, how='inner', on='album_id')

#sort by the rand_key (and by index so the songs within albums stay in order)
album_df['idx'] = album_df.index
album_df = album_df.sort_values(['rand_key','idx']).reset_index(drop=True)

#run the code below to exclude one-track albums
album_df_piv = pd.pivot_table(album_df, index='album_id',values='track_id',aggfunc='count')
album_df_piv.columns = ['num_tracks']
album_df = pd.merge(album_df, album_df_piv, how='left',left_on='album_id',right_index=True)
album_df = album_df[album_df.num_tracks>1].reset_index(drop=True)

#do some data cleaning (getting rid of quotes, backslashes) so it will play nice
#regex=False treats the pattern as a literal string, which a lone backslash needs
for col in ['track', 'artist', 'album']:
    album_df[col] = album_df[col].str.replace('"', '', regex=False)
album_df['track'] = album_df['track'].str.replace('\\', '', regex=False)

album_df['idx'] = album_df.index

#copy this album df to your clipboard to paste into google sheets or elsewhere.
album_df.to_clipboard(index=False)
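
If the merge-and-sort approach feels indirect, the same idea (shuffle the albums, keep each album's tracks in order) can be sketched with plain Python. This is only an illustration, not the code I ran: shuffle_albums is a hypothetical helper, it takes (track_id, album_id) pairs instead of a DataFrame, and its seed will not reproduce the album order that np.random.seed(10) gives above.

```python
import random


def shuffle_albums(tracks, seed=10):
    """Shuffle album order while keeping each album's tracks in their original order.

    tracks is a list of (track_id, album_id) pairs in playlist order.
    """
    albums = {}  # album_id -> list of track_ids; dicts preserve insertion order
    for track_id, album_id in tracks:
        albums.setdefault(album_id, []).append(track_id)

    album_order = list(albums)
    random.Random(seed).shuffle(album_order)  # a seeded generator keeps the order repeatable

    return [(track_id, album_id)
            for album_id in album_order
            for track_id in albums[album_id]]
```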

Now with this randomized "album_df" DataFrame, you can finally initialize your playlist! Choose whatever size playlist you want, but the smaller it is the simpler it is to manage—plus it creates more suspense if you can only see the next 1 or 2 albums in your queue. Since I'll be updating mine every hour, I chose 30 songs, and simply added the first 30 songs from the "album_df" DataFrame to my playlist with the playlist_add_items() method shown below.

#initialize playlist of length 30
pl_length = 30
last_tracks_added = album_df.loc[0:pl_length-1]
tracks_to_add = last_tracks_added.track_id.tolist()
sp.playlist_add_items(pl_id,tracks_to_add )
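
Thirty songs fit in a single request, but if you choose a much bigger playlist, keep in mind that Spotify accepts at most 100 items per add request. A minimal sketch of batching, where chunked is a hypothetical helper name:

```python
def chunked(items, size=100):
    """Yield consecutive slices of items, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Possible usage with the variables defined above:
# for batch in chunked(tracks_to_add):
#     sp.playlist_add_items(pl_id, batch)
```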

You can copy the randomized DataFrame of all the albums to your clipboard and paste it into a Google Sheet to make it easier to work with. My first album up is "Crazysexycool" by TLC.

Automating your playlist updates

Now it's time to automate updating your playlist! To do this I'm going to grab my recently played songs with the current_user_recently_played() method, and delete any songs I recently played that are also currently in my endless playlist. I'll also check the "context" of my recently played songs to make sure they were played from my endless playlist (if I play a song in my endless playlist but from a different place, I don't necessarily want to delete it). I'll use the "refresh token" from the access credentials dictionary above so I don't have to re-authorize every time I want to update the playlist (which I'll be doing every hour). For this code to work, you'll also need your DataFrame of all albums ("album_df") and a DataFrame of the last tracks you added ("last_tracks_added"). You'll add songs from "album_df" starting with the index after the max index from the "last_tracks_added" DataFrame. Finally, to make it truly "endless", I'll start adding songs from the beginning again once I reach the end.

import spotipy
import pandas as pd
import numpy as np
from spotipy import oauth2
import re

SPOTIPY_CLIENT_ID = 'YOUR CLIENT ID'
SPOTIPY_CLIENT_SECRET = 'YOUR CLIENT SECRET'
SCOPE = ('user-read-recently-played,user-library-read,user-read-currently-playing,playlist-read-private,playlist-modify-private,playlist-modify-public,user-read-email,user-modify-playback-state,streaming,app-remote-control,user-read-private,user-read-playback-state')
SPOTIPY_REDIRECT_URI = 'YOUR REDIRECT URI'
SPOTIFY_USER_ID = 'YOUR SPOTIFY USER ID'
sp_oauth = oauth2.SpotifyOAuth( SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET,SPOTIPY_REDIRECT_URI,scope=SCOPE )

refresh_token = 'YOUR REFRESH TOKEN'
token_info = sp_oauth.refresh_access_token(refresh_token)
sp = spotipy.Spotify(auth=token_info['access_token'])
username = sp.current_user()['id']
pl_id = 'YOUR ENDLESS PLAYLIST ID'


#add these lines if using SeekWell to automate your code (and map those Parameters to correct Google Sheet)
# album_df = {{allAlbums}}
# last_tracks_added = {{lastTracksAdded}}

#make lists of playlist track id's and names to check against your recently played tracks
this_pl = sp.playlist_items(pl_id)['items']
this_pl_ids = [track['track']['id'] for track in this_pl]

#we'll use this list of track+artist names to find tracks that got added to our playlist with a different track_id (which sometimes happens)
this_pl_names = [re.sub('[^0-9a-zA-Z]+', '',track['track']['name']+track['track']['artists'][0]['name']) for track in this_pl]

#make a "to_delete" list of the index and URI of songs in your playlist that you just listened to
to_delete = []
recents = sp.current_user_recently_played(50)['items']
for track in recents:
    context = track['context']
    if context and 'playlist' in context['uri']:
        this_pl_id = context['uri'].split('playlist:')[1]
        name_artist = re.sub('[^0-9a-zA-Z]+', '',track['track']['name']+track['track']['artists'][0]['name'])
        if track['track']['id'] in this_pl_ids and this_pl_id == pl_id :
            idx = this_pl_ids.index(track['track']['id'])
            uri = track['track']['uri']
            to_delete.append([idx,uri])
        #including a second if statement in case that track id isn't in the playlist but that track+artist is.
        elif this_pl_id == pl_id  and name_artist in this_pl_names:
            idx = this_pl_names.index(name_artist)
            uri = 'spotify:track:' + this_pl_ids[idx]
            to_delete.append([idx,uri]) 

#if there's no songs to delete then there's no updates to make
if len(to_delete) > 0:

    #use that list to create a list of track dictionaries (which have track uri and position in the playlist)
    tracks = [{'uri': track[1], 'positions': [track[0]]} for track in to_delete]

    #use that list of track dictionaries and delete them from your playlist
    sp.user_playlist_remove_specific_occurrences_of_tracks(username,pl_id,tracks)


    #make a 'tracks_to_add' DataFrame that is the same length as the "to_delete" list
    to_add = len(to_delete)
    last_index = last_tracks_added['idx'].astype(int).max()
    tracks_to_add = album_df.loc[last_index+1:last_index+to_add]

    #to make it truly "endless" start adding the first tracks again once you reach the end
    if (last_index+to_add) >= len(album_df):
        tracks_to_add = pd.concat([tracks_to_add, album_df.loc[0:(last_index+to_add)%len(album_df)]],axis=0)

    #add those songs to your playlist 
    sp.playlist_add_items(pl_id,tracks_to_add.track_id.tolist())

    #add this code if using SeekWell to automate your Python so you can send the "tracks_to_add" DataFrame somewhere
    #tracks_to_add
    #seekwell = {'df': tracks_to_add}
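
The wrap-around step is easier to reason about as modular arithmetic. Here's a small sketch (next_indices is a hypothetical helper, not part of the script above) that returns the row positions to add next. One thing to watch, as far as I can tell: right after a wrap, the batch you just added contains both the last rows and the first rows of "album_df", so taking .max() of its "idx" column points at the end of the list rather than at the track you added last. Storing the final position of each batch instead sidesteps that.

```python
def next_indices(last_index, count, total):
    """Return the `count` row positions that follow last_index, wrapping past the end."""
    if total <= 0:
        raise ValueError("total must be positive")
    return [(last_index + step) % total for step in range(1, count + 1)]


# Possible usage with the DataFrames above:
# positions = next_indices(last_index, to_add, len(album_df))
# tracks_to_add = album_df.iloc[positions]
# new_last_index = positions[-1]  # store this rather than the max
```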

Great! Now we have code that goes through your recently played songs, deletes any songs that are from your endless playlist (and that were actually played from it), and then adds the same number of new songs to your endless playlist. This method forces you to listen to every song all the way through (which I actually prefer) so it shows up in your recently played. If you want to be able to skip songs and still have them deleted, you can change the code to take the highest index from your full "album_df" DataFrame that's also in your recently played and delete songs from your playlist with indexes lower than that.
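
Here's a rough sketch of that skip-friendly variant. It's a simplification of what I described: instead of looking up indexes in "album_df", it works on positions within the playlist itself, which avoids having to think about wrap-around. The function name positions_to_delete is hypothetical, and it assumes recently_played_ids has already been filtered down to plays that came from the endless playlist.

```python
def positions_to_delete(playlist_ids, recently_played_ids):
    """Return playlist positions 0..k, where k is the furthest track that was played.

    Tracks before k are treated as skipped and removed along with the played ones.
    """
    played = set(recently_played_ids)
    furthest = -1
    for position, track_id in enumerate(playlist_ids):
        if track_id in played:
            furthest = position
    return list(range(furthest + 1))
```

Each returned position can be paired with its track URI to build the same list of track dictionaries used in the removal call above.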

Scheduling your code with SeekWell

Now we just need to run this code every hour, and store the "last_tracks_added" DataFrame somewhere. I'm a little biased since I work there, but my preferred tool for automating Python code is SeekWell. It lets you run any Python code and automate it on a schedule (e.g. daily, hourly, every five minutes). It also gives you native access to DataFrames from Google Sheets or your database, which in this case is helpful so I can import my "album_df" and "last_tracks_added" DataFrames to know which tracks to add next. You can sign up for your two-week free trial here and check out the docs for using Python here. You can also use a different tool for scheduling your Python code or just run the update code manually when you get the chance. One caveat is that Spotipy's current_user_recently_played() method only returns up to 50 songs, so if you listen to more than 50 songs in between updating your playlist, some won't actually get deleted.
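
If you don't want to use a scheduling service at all, a plain loop on a machine that stays on can do the job (cron or Windows Task Scheduler would work too). This is a minimal sketch under that assumption. The names run_every and update_playlist are hypothetical: update_playlist stands for the update script above wrapped in a function.

```python
import time


def run_every(interval_seconds, job, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing interval_seconds between runs.

    max_runs=None loops forever; a failed run is reported and the loop continues.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:  # keep the schedule alive if one update fails
            print(f"update failed: {exc}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Possible usage: run the update once an hour
# run_every(3600, update_playlist)
```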

If you do use SeekWell, add this code at the end of your Python script, and you can send that DataFrame wherever you want. In this case I'm sending it to Google Sheets so I can read it in again the next hour to know which tracks I just added.

seekwell = {'df': tracks_to_add}
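
If you're running the script yourself instead, the same state can live in a small local file. Here's a sketch using JSON. The function names and the file layout are my own assumptions for illustration, not anything SeekWell or Spotipy provides.

```python
import json
from pathlib import Path


def save_last_added(path, track_ids, last_index):
    """Write the most recently added track IDs and their final row position to a JSON file."""
    state = {"track_ids": list(track_ids), "last_index": int(last_index)}
    Path(path).write_text(json.dumps(state))


def load_last_added(path, default_index=-1):
    """Read the saved state, or return an empty state if the file does not exist yet."""
    file = Path(path)
    if not file.exists():
        return {"track_ids": [], "last_index": default_index}
    return json.loads(file.read_text())
```

Call load_last_added() at the start of each run to find out where to resume, and save_last_added() right after adding new tracks.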

I'm also reading in my "album_df" and "last_tracks_added" DataFrames from Google Sheets via SeekWell's Parameters, which is like a no-code import method. To do this yourself, create an alias name for the DataFrames you want to import in double curly brackets, and where it says "Parameters" on the right, change the Type to "Sheets" and enter the name of your Google Sheet.

Now just choose your schedule. I listen to my endless playlist often while I'm working, so hourly updates make the most sense for me. But if you're only going to listen occasionally or your playlist is bigger, you can also choose daily or even weekly updates.

Enjoy! Remember to save songs you like to your library right away before they get deleted. Use the same random seed as me (10) and let's listen together. Feel free to reach out if you have any issues and follow my listening journey here.

👨🏻‍💻
All code used for this project can be found on my Github here.

By Brian Kane @SeekWell.

Originally published here.

DEV Community
