I'm interested in getting a user's past listening history from Spotify and using it to suggest songs from the Charts that the user may be interested in listening to.
Genre is an obvious signal for this kind of algorithm: the more often a user listens to a genre, the more likely songs of that genre should be suggested from the Charts.
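To make the genre idea concrete, here is a minimal sketch, assuming the history and the chart are already available as lists of dicts with hypothetical `title` and `genre` keys (this is not the Spotify API's response shape). It weights each genre by its share of the user's plays and orders unheard chart songs by that weight.

```python
from collections import Counter


def genre_scores(history):
    """Turn a listening history into per-genre weights that sum to 1."""
    counts = Counter(track["genre"] for track in history)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}


def rank_chart(history, chart):
    """Order unheard chart tracks by how often the user played their genre."""
    weights = genre_scores(history)
    heard = {track["title"] for track in history}
    candidates = [t for t in chart if t["title"] not in heard]
    # Genres the user never played get a weight of 0.0.
    return sorted(candidates, key=lambda t: weights.get(t["genre"], 0.0), reverse=True)


# Made-up sample data, for illustration only.
history = [
    {"title": "A", "genre": "rock"},
    {"title": "B", "genre": "rock"},
    {"title": "C", "genre": "jazz"},
]
chart = [
    {"title": "X", "genre": "pop"},
    {"title": "Y", "genre": "jazz"},
    {"title": "Z", "genre": "rock"},
]

ranked = rank_chart(history, chart)
print([t["title"] for t in ranked])  # ['Z', 'Y', 'X']
```

This only needs one user's own history, so it stays on the content-based side.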
I'd like to focus on the activity of a single user, not on other users of the system (content-based filtering).
How can this be done in practice? Anybody have any examples of algorithms or tutorials anywhere?
Top comments (7)
I would try using a graph database for this.
The query would look like this:
Get all of Joe's liked songs, get all users who like the same songs (at least N songs in common), then get the top N songs that those users like and Joe hasn't marked as liked.
The main trick is that this request would kill any relational DB (join on join on join...), but a graph DB can handle it just fine.
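The same traversal can be sketched in plain Python over an in-memory dict instead of a real graph database. The `likes` mapping, the `recommend` function, and its parameters are hypothetical names for illustration. Note that this is collaborative filtering, since it relies on other users' likes.

```python
from collections import Counter


def recommend(likes, user, min_shared=2, top_n=3):
    """Suggest songs liked by users who share at least `min_shared` likes with `user`."""
    mine = likes[user]
    # Users whose liked songs overlap enough with this user's.
    neighbours = [
        other for other, songs in likes.items()
        if other != user and len(songs & mine) >= min_shared
    ]
    # Count songs the neighbours like that this user has not liked yet.
    counts = Counter(song for other in neighbours for song in likes[other] - mine)
    return [song for song, _ in counts.most_common(top_n)]


# Made-up sample data, for illustration only.
likes = {
    "Joe": {"s1", "s2", "s3"},
    "Ann": {"s1", "s2", "s4"},
    "Bob": {"s2", "s3", "s4", "s5"},
    "Cat": {"s1", "s6"},
}

print(recommend(likes, "Joe"))  # ['s4', 's5']
```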
Thanks, but this is collaborative filtering, and I'm looking for a purely content-based filtering approach based on a single user's past activity.
Try the Pandora.fm approach. They categorise each song by attributes: instruments used, specific vocals, specific guitar riffs, etc. That gives you a similarity between songs based on N dimensions. As far as I know, the categorisation is done by humans, not by machines.
UPD: a quick internet search turned up "Automatic Musical Instrument Recognition and Related Topics".
With N dimensions, you can find similar objects with an ML algorithm, for example Decision Trees.
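Decision Trees are one option. A simpler way to compare songs in N dimensions is cosine similarity against a profile averaged from the user's own history, which is what this rough sketch does. The three feature dimensions and all values are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def user_profile(vectors):
    """Average the feature vectors of the songs a user has listened to."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical dimensions: [distorted guitar, piano, female vocals]
listened = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
chart = {
    "Song X": [0.85, 0.05, 0.1],
    "Song Y": [0.0, 0.9, 0.8],
}

profile = user_profile(listened)
# Chart songs ordered from most to least similar to the user's profile.
ranked = sorted(chart, key=lambda name: cosine(profile, chart[name]), reverse=True)
print(ranked)  # ['Song X', 'Song Y']
```

Because the profile is built from one user's listening history only, this is content-based filtering.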
I suggest using the KMeans algorithm to predict the likelihood that a user will listen to a song. For example, you determine:
Thanks for this detailed reply! I guess this works well for collaborative filtering, but I'm super interested in content-based filtering that works off that user's activity only.