
Downloading your YouTube subscriptions in CSV format (because Google Takeout takes too long)

I wanted to import my YouTube subscriptions into NewPipe, the open-source Android YouTube client. The normal way to do that is to export your subscriptions from Google Takeout, Google's service for retrieving the data it holds about your account. NewPipe kindly explains the process:
Screenshot of the NewPipe app's explanation on how to use Google Takeout.

Excuse the Comic Sans; it's my favorite font.

NewPipe's instructions are as follows:

Import YouTube subscriptions from Google takeout:

  1. Go to this URL: https://takeout.google.com/takeout/custom/youtube
  2. Log in when asked
  3. Click on "All data included", then on "Deselect all", then select only "subscriptions" and click "OK"
  4. Click on "Next step" and then on "Create export"
  5. Click on the "Download" button after it appears
  6. Click on IMPORT FILE below and select the downloaded .zip file
  7. [If the .zip import fails] Extract the .csv file (usually under "YouTube and YouTube Music/subscriptions/subscriptions.csv"), click on IMPORT FILE below and select the extracted csv file

What they neglect to mention is that Google Takeout can take many hours to complete.

I tried using Google Takeout, but after an hour of waiting I decided to try something else: I'd scrape the list of channels I'm subscribed to and save it as a CSV file that NewPipe can import.

Finding out how the Google Takeout CSV is formatted

In order to make my own file that NewPipe would accept as though it were a Google Takeout CSV, I had to find out the format Google Takeout uses.

I found the commit in NewPipeExtractor that added the Google Takeout import support, and inside it was a description of the file format:

https://github.com/TeamNewPipe/NewPipeExtractor/pull/709/commits/94a29fd63ff6bb0c1805c44ef5ebf4d915427454

// Expected format of CSV file:
//      Channel Id,Channel Url,Channel Title
//      UC1JTQBa5QxZCpXrFSkMxmPw,http://www.youtube.com/channel/UC1JTQBa5QxZCpXrFSkMxmPw,Raycevick
//      UCFl7yKfcRcFmIUbKeCA-SJQ,http://www.youtube.com/channel/UCFl7yKfcRcFmIUbKeCA-SJQ,Joji
//
// Notes:
//      It's always 3 columns
//      The first line is always a header
//      Header names are different based on the locale
//      Fortunately the data is always the same order no matter what locale

This was simple enough. Now I needed to find out how to get a list of the Channel ID and Channel Title of each of the channels I'm subscribed to.

YouTube scraping

I found a page on YouTube that lists all the channels you're subscribed to. Looking at how the page worked, I realized that it stores data inside a variable called ytInitialData. This variable holds the list of channels you're subscribed to, along with some other data. YouTube paginates the list, though, so the variable won't always have everything right off the bat. More of the list loads whenever you scroll to the bottom of the page, which means I can just automate the scrolling.
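For reference, this is the path into ytInitialData that my script reads. It's YouTube's internal page data, so the exact structure isn't guaranteed to stay the same:

// Path to the subscription list inside ytInitialData (internal structure, may change).
// Each entry is either an itemSectionRenderer holding channel entries, or a
// continuationItemRenderer indicating that more of the list still has to load.
let sections = ytInitialData.contents.twoColumnBrowseResultsRenderer
  .tabs[0].tabRenderer.content.sectionListRenderer.contents;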

Another bit of data included in ytInitialData is the API continuation token required to load the rest of the list. Once the list is fully loaded, the token is removed from ytInitialData, which means I can check whether that token still exists to know whether to keep scrolling down.

I wrote a script to scroll to the bottom of the page by checking the height of the container <div> and then scrolling with the function window.scrollTo.
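Stripped down, that loop looks roughly like this. It's a simplified version of what the full script below does; the full version also waits for the continuation token to change before scrolling again, while this sketch just scrolls and sleeps:

// Simplified scroll loop: run it in the DevTools console (top-level await works there).
// canContinue() checks whether the last list entry is still a continuation placeholder.
function canContinue() {
  let contents = ytInitialData.contents.twoColumnBrowseResultsRenderer
    .tabs[0].tabRenderer.content.sectionListRenderer.contents;
  return contents[contents.length - 1].continuationItemRenderer != null;
}
while (canContinue()) {
  // Scroll to the bottom of the subscription feed to trigger the next page load.
  window.scrollTo(0, document.getElementById('primary').scrollHeight);
  // Give YouTube a moment to append the next page before checking again.
  await new Promise(r => setTimeout(r, 500));
}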

I then wrote a script to pull the Channel IDs and Channel Titles out of this list. I also needed the Channel URL, but that was as easy as appending the channel ID to the string "http://www.youtube.com/channel/". The script joined the data for all the channels, one per line, and finally logged the result to the console.
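A minimal sketch of that extraction, assuming the list has already been fully loaded; the full script below does the same thing with slightly different guarding:

// Turn each loaded channel entry into a "Channel Id,Channel Url,Channel Title" row.
let rows = ytInitialData.contents.twoColumnBrowseResultsRenderer
  .tabs[0].tabRenderer.content.sectionListRenderer.contents
  .filter(e => e.itemSectionRenderer)
  .flatMap(e => e.itemSectionRenderer.contents[0].shelfRenderer
    .content.expandedShelfContentsRenderer.items)
  .filter(e => e && e.channelRenderer)
  .map(e => `${e.channelRenderer.channelId},http://www.youtube.com/channel/${e.channelRenderer.channelId},${e.channelRenderer.title.simpleText}`);
let csvText = "Channel Id,Channel Url,Channel Title\n" + rows.join('\n');
console.log(csvText);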

I combined these scripts and ran them together. It worked.

To make it a bit more convenient, I wrote another script that creates a <div> to hold the CSV data. This <div> uses position: fixed to cover the screen.

To make it even easier to use, I added a download button that saves the text as a file, so you don't need to copy-paste it yourself.
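A minimal sketch of that last part, assuming csvText already holds the CSV string from the previous step. The download attribute here is just one way to suggest a filename; the full script below instead keeps the link inside the overlay and opens the object URL in a new tab:

// Wrap the CSV text in a Blob and expose it through a temporary object URL.
let blob = new Blob([csvText], { type: 'text/csv' });
let downloadLink = document.createElement('a');
downloadLink.href = URL.createObjectURL(blob);
downloadLink.download = 'subscriptions.csv'; // suggested filename for the saved file
downloadLink.innerText = 'Download CSV';
document.body.appendChild(downloadLink);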

The script

Here's my code:

// getLast() returns the last entry of the subscription list inside ytInitialData.
// While more channels remain to load, that entry is a continuationItemRenderer.
function getLast() {
  return ytInitialData.contents.twoColumnBrowseResultsRenderer.tabs[0].tabRenderer.content.sectionListRenderer.contents.slice(-1)[0]
}
function canContinue() { return getLast().continuationItemRenderer != null }
(async () => {
  // Repeatedly scroll to the bottom, waiting each time until the continuation
  // token changes, i.e. until YouTube has appended the next page of channels.
  while (canContinue()) {
    let current = getLast().continuationItemRenderer.continuationEndpoint.continuationCommand.token;
    scrollTo(0, document.getElementById('primary').scrollHeight);
    while (canContinue() && current == getLast().continuationItemRenderer.continuationEndpoint.continuationCommand.token) {
      await new Promise(r => setTimeout(r, 100));
    }
  }
  // Scroll back to the top, then build a fixed overlay to display the CSV text.
  scrollTo(0, 0);
  let floatDiv = document.createElement('div');
  let preText = document.createElement('pre');
  floatDiv.setAttribute('style', `position: fixed;
  background: #0f0f0f;
  z-index: 100000;
  inset: 2rem;
  overflow: auto;
  font-size: 2rem;
  white-space: pre;
  color: white;
  padding: 1rem;`);
  // Build the CSV: a header row plus one "Channel Id,Channel Url,Channel Title" row per channel.
  let csvText = "Channel Id,Channel Url,Channel Title\n" + ytInitialData.contents.twoColumnBrowseResultsRenderer.tabs[0].tabRenderer.content.sectionListRenderer.contents.map(e => {
    if (!e.itemSectionRenderer) return;
    return e.itemSectionRenderer.contents[0].shelfRenderer.content.expandedShelfContentsRenderer.items
  }).flat().map(e => {
    if (e && e.channelRenderer) return `${e.channelRenderer.channelId},http://www.youtube.com/channel/${e.channelRenderer.channelId},${e.channelRenderer.title.simpleText}`;
    return '';
  }).join('\n');
  preText.innerText = csvText;
  let downloadLink = document.createElement('a');
  downloadLink.innerText = 'Download CSV';
  downloadLink.setAttribute('target', '_blank');
  downloadLink.setAttribute('style', `color: #bf3838;
  font-weight: bold;
  margin-bottom: 1rem;
  display: block;
  padding: 1rem;
  border-radius: 0.5rem;
  border: 2px solid #bf3838;
  width: fit-content;
  text-decoration: none;`);
  // Put the CSV text in a Blob and point the download link at a temporary object URL.
  var t = new Blob([csvText], { type: "text/plain" });
  downloadLink.href = window.URL.createObjectURL(t);
  floatDiv.appendChild(downloadLink);
  floatDiv.appendChild(preText);
  document.body.appendChild(floatDiv);
})()

You can run this in the DevTools console at youtube.com/feed/channels, then save the file by clicking "Download CSV".

A red button reading "Download CSV", above a list of YouTube Channel IDs and Channel URLs.

Conclusion

By the time I had finished writing the script, Google Takeout still hadn't sent me a download of my subscriptions list. I eventually received this download hours later, but by that point I had already imported my own list into NewPipe and no longer needed theirs.

My guess is that Google Takeout runs on a queue, sending one person their data and then the next, rather than working on each request immediately. This probably saves them some cost, but it also wastes a lot of time.

I hope you find my script useful. If you make any improvements, please tell me about them! Whether it be in the discussion under this post, the comments on my GitHub gist about this script, or even on my Twitter, I'd love to hear what you've done with it.
