The first project I worked on after leaving my full-time job was to create YouTube videos programmatically, at scale. This post explains what I did, and what happened next.
watch a video?
If you're better with visuals / audio, watch this video instead (it's better at 1.5x).
Ok, let's go -
After leaving my full-time job at Google and its steady pay, I asked myself: what projects could I do that could potentially bring in some passive income?
The project should be exciting enough for me to take a stab at, should help me learn something absolutely new, and should make a great story regardless of its outcome.
I had been thinking about YouTube as a platform for quite some time, for two reasons. First, it probably ranks number 1 in my list of favorite products.
Second, it hosts videos for FREE, for everyone.
The second one is huge if you think about it, especially as an engineer.
The biggest cost of serving videos online is hosting them, and YouTube takes care of that for everyone, for free.
Not just that, if your videos are public and get views, it even pays you for them!
When TensorFlow was announced, there was also an announcement of a project that could create music with AI (Project Magenta).
This idea that AI could create music resonated with me a lot. The problem with this project was that while AI can create a lot of music, most of it is average at best.
Even the most popular artists usually have only a few super viral songs, and with AI, the hit rate is far lower.
Let's just say that it would take AI 100,000 songs to come up with one great one.
Can we crowdsource the listening to the world, and let people decide which song is the best?
I could create some decent slideshow videos (programmatically, of course) for each of the generated music files, and upload them to YouTube for the world to figure it out.
If there's some traction, it would motivate me to spend more time tweaking the AI training model, as well as earn $$ :).
Creating music with AI felt like a big task, so I thought, let's just do something basic.
Let's create videos of just some text converted to speech with a slideshow, and upload them to YouTube programmatically. Yep, that's a good start. Based on how that goes, we can work on creating music with AI, instead of that text -> speech thingy.
To make an MVP of this process, I decided to use data from Wikipedia first. I could even incorporate live news into this concept!
YouTube allows uploading 50 videos daily. This means that if I had 10 channels, I could upload 500 videos a day. Take a minute to fathom that.
500 x 365 = 182,500 videos a year.
If each video gets 10 views, that alone is over 1.8 million views a year. Very fascinating. WDYT?
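The back-of-the-envelope math above, written out as a tiny script. All the inputs are the numbers quoted in this post; the variable names are just for illustration.

```python
# Inputs quoted in the post (the 10 views per video is a deliberately low guess).
UPLOADS_PER_CHANNEL_PER_DAY = 50
CHANNEL_COUNT = 10
DAYS_PER_YEAR = 365
VIEWS_PER_VIDEO = 10

videos_per_day = UPLOADS_PER_CHANNEL_PER_DAY * CHANNEL_COUNT
videos_per_year = videos_per_day * DAYS_PER_YEAR
views_per_year = videos_per_year * VIEWS_PER_VIDEO

print(f"{videos_per_day:,} videos/day")
print(f"{videos_per_year:,} videos/year")
print(f"{views_per_year:,} views/year")
```

Even with a pessimistic 10 views per video, the yearly total lands around 1.8 million views, which is what made the idea so tempting.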
Here's how it would work:
- Get Wikipedia trending queries (thanks to Wiki APIs!)
- Get the First Paragraph from each Query
- Check for Adult Content (important!)
- Convert text to Speech (I used Google's text-to-speech API)
- Find images available for commercial use on this topic
- Stitch the images to make a slide show along with the Audio
- Upload to Youtube using Youtube's API
- Drink lemonade and enjoy.
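To make the first two steps concrete, here is a minimal sketch of how trending titles and an intro paragraph could be fetched. It is not the original code from this project. It assumes the public Wikimedia REST endpoints for "top pageviews" and "page summary", and the helper names, the `USER_AGENT` string, and the filtering of `Main_Page` / `Special:` entries are all hypothetical choices made for this example.

```python
import json
from datetime import date
from urllib.parse import quote
from urllib.request import Request, urlopen

# Hypothetical identifier; Wikimedia asks API clients to send a descriptive User-Agent.
USER_AGENT = "soknow-sketch/0.1 (contact: you@example.com)"


def top_pages_url(project: str, day: date) -> str:
    """URL of the most-viewed pages for one day, e.g. project='en.wikipedia'."""
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
        f"{project}/all-access/{day:%Y/%m/%d}"
    )


def summary_url(lang: str, title: str) -> str:
    """URL of the page summary, whose 'extract' field holds the intro text."""
    slug = quote(title.replace(" ", "_"), safe="")
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{slug}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Small GET helper; network errors are left for the caller to handle."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def trending_titles(payload: dict, limit: int = 50) -> list[str]:
    """Pull article titles out of a 'top pageviews' response, skipping non-articles."""
    articles = payload["items"][0]["articles"]
    titles = [entry["article"] for entry in articles]
    return [t for t in titles if not t.startswith(("Main_Page", "Special:"))][:limit]


def first_paragraph(summary_payload: dict) -> str:
    """Intro text from a page-summary response (empty string if missing)."""
    return summary_payload.get("extract", "").strip()
```

A run for one language would call `fetch_json(top_pages_url("en.wikipedia", some_date))`, pass the result to `trending_titles`, and then fetch `summary_url("en", title)` for each title. The adult-content check, text-to-speech, and image search steps would slot in after `first_paragraph`.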
I chose Python to write this code in. Why? Because most libraries that I was going to use had best support in Python. Also, I like Python.
Step by step, I wrote the code for all of the steps. Yes, the code was scrappy, but it worked.
I used simple API calls to get all the data, and ffmpeg to create the slideshow videos. While there was no parallel processing (no threads), the process was fast enough to create 50 videos in 10-15 minutes.
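For the ffmpeg step, a slideshow with narration can be built from a numbered image sequence plus one audio file. The sketch below only shows one common way to do it and is not the exact command used here. The file names, the 5 seconds per image, and the function names are assumptions for illustration.

```python
import subprocess


def build_slideshow_cmd(
    image_pattern: str,
    audio_path: str,
    output_path: str,
    seconds_per_image: int = 5,
) -> list[str]:
    """Build an ffmpeg command that shows each image for a fixed time over the audio."""
    return [
        "ffmpeg",
        "-y",                                    # overwrite the output if it exists
        "-framerate", f"1/{seconds_per_image}",  # one input image every N seconds
        "-i", image_pattern,                     # e.g. frames/img%03d.jpg
        "-i", audio_path,                        # the text-to-speech narration
        "-c:v", "libx264",
        "-r", "30",                              # normal output frame rate for players
        "-pix_fmt", "yuv420p",                   # widely compatible pixel format
        "-c:a", "aac",
        "-shortest",                             # stop when the shorter stream ends
        output_path,
    ]


def render_slideshow(image_pattern: str, audio_path: str, output_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    cmd = build_slideshow_cmd(image_pattern, audio_path, output_path)
    subprocess.run(cmd, check=True)
```

Keeping the command as a list (rather than one shell string) avoids quoting problems when a topic title with special characters ends up in a file name.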
There were many bugs of course: for example, the APIs would time out, or some special characters would break the sequence, but one by one I fixed them all. If no images were found for a particular topic, I would make it a black screen with the title's text on top of it.
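API timeouts like the ones mentioned above are usually handled with a small retry wrapper. This is a generic sketch, not the fix that was actually used in this project. The function name, the three attempts, and the doubling delay are all assumptions.

```python
import time


def with_retries(
    fn,
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: tuple = (TimeoutError, ConnectionError),
    sleep=time.sleep,
):
    """Call fn(); on a retryable error wait base_delay, 2x, 4x, ... and try again.

    The last failure is re-raised so the caller can skip that topic and move on.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage would look like `with_retries(lambda: fetch(url))`, where `fetch` stands for whatever function makes the API call. The `sleep` parameter exists only so the wait can be swapped out in tests.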
I also embedded the channel watermark in one corner.
To upload the videos, I used YouTube's API from GCP. What was amazing was that I could also set the Description, Title and Keywords of each video through the API. Mind blown.
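Here is a rough idea of what the metadata part looks like. The YouTube Data API takes the title, description and keywords (called `tags`) inside a `snippet` object, and the visibility inside a `status` object. The helper below is a hypothetical sketch: the function name, the 100-character title cap, and the stripping of angle brackets are assumptions based on my understanding of the API's limits, so check the current docs before relying on them.

```python
MAX_TITLE_CHARS = 100  # assumed title limit; verify against the API reference


def build_video_body(
    title: str,
    description: str,
    tags: list[str],
    category_id: str,
    privacy_status: str = "private",
) -> dict:
    """Build the request body for an upload or a metadata update."""
    # Angle brackets are assumed to be rejected in titles, so drop them up front.
    clean_title = title.replace("<", "").replace(">", "").strip()[:MAX_TITLE_CHARS]
    return {
        "snippet": {
            "title": clean_title,
            "description": description,
            "tags": list(tags),          # the "Keywords" mentioned above
            "categoryId": category_id,   # valid ids come from videoCategories.list
        },
        "status": {"privacyStatus": privacy_status},
    }


# With google-api-python-client, the body would be passed roughly like this:
#   youtube.videos().insert(part="snippet,status", body=body,
#                           media_body=MediaFileUpload(path, resumable=True))
```

Defaulting to `private` is a design choice for the sketch: it lets a batch be reviewed before anything goes public.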
All was well in test runs, and then I ran in production.
After running the code, I started seeing bugs that I couldn't really understand. After some debugging, I realized what the issue was.
It turns out that the YouTube API has quota limits that are different from the web UI limits.
Most prominent for me was that the YouTube API only allowed ~3-4 video uploads a day, and not more than that.
I read their Quota Costs for API requests much later than I should have.
Everything has low quota usage except the "Video -> Insert" resource.
YouTube does allow 50 uploads a day through its user interface, but not through the API.
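A quick budget check like the one below would have caught this before any code was written. The 10,000 and 15,000 daily quota figures come from this post. The 1,600 units per upload is an assumption, based on what I recall the quota cost table listing for the video insert call, so treat it as a placeholder and look up the current value.

```python
def max_uploads_per_day(
    daily_quota: int,
    insert_cost: int,
    other_cost_per_video: int = 0,
) -> int:
    """How many full upload cycles fit into one day's quota."""
    cost_per_video = insert_cost + other_cost_per_video
    if cost_per_video <= 0:
        raise ValueError("cost per video must be positive")
    return daily_quota // cost_per_video


ASSUMED_INSERT_COST = 1600  # assumption, not confirmed by this post

for quota in (10_000, 15_000):
    print(quota, "units/day ->", max_uploads_per_day(quota, ASSUMED_INSERT_COST), "uploads")

# Quota that 50 uploads a day would need under the same assumption.
print("needed for 50/day:", 50 * ASSUMED_INSERT_COST)
```

Under that assumption the ceiling is single digits per day. The ~3-4 uploads I actually saw is lower still, presumably because every other API call and every failed attempt also draws from the same daily budget.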
Why didn't I think of reading this first? I was super mad and sad.
Of course, I also reached out to the YouTube team by filing a case with them. I wrote a big doc with as much convincing information as I could add. It would take them weeks to respond, which makes sense given their size.
Not that I was motivated to, but after several weeks of no response, I did reach out to a friend who worked at YouTube at the time. As I had expected, it didn't really change anything. For the most part, Google is a meritocracy-based company, and unless I was one of the early partners, big enough (think SocialBlade), or had good relations with someone higher up in management (VP level?), it was not going to happen.
I did try to play the "Anti competitive" song in my subsequent appeals, which I thought was clever, but it didn't work. Yes, yes, I know they are smart.
After several back and forths, they did increase my quota from 10,000 to 15,000 units per day. To some degree that was fair, because it's not like my content was going to grow YouTube's user base, or improve the user experience somehow.
For a few days, I uploaded videos manually to YouTube, and then set their Title / Description and Keywords programmatically. But as you can guess, it wasn't with the same excitement as before.
You can find all Channels with this youtube search query.
Here's a sample video.
Two years after I left the project, I came back to check on the videos. The metrics are fascinating to look at, so I'm sharing them below.
Note: I did nothing to grow these channels. All videos in these channels were the output of a simple Python script, except that I had to upload most of the videos manually and then update the metadata through code.
| Channel | Views | Watch time | Subscribers |
|---|---|---|---|
| SoKnow French | 5.4K | 90 hours | 53 |
| SoKnow Hindi | 21K | 345 hours | 173 |
| SoKnow English | 26K | 112 hours | 37 |
| SoKnow Korean | 4.2K | 23 hours | 7 |
| SoKnow Japanese | 35.3K | 155 hours | 24 |
| SoKnow Russian | 6.6K | 82 hours | 14 |
| SoKnow German | 3.5K | 18.3 hours | 5 |
| SoKnow Finnish | 7.4K | 7.6 hours | 8 |
| SoKnow Arabic | 683 | 2.8 hours | 1 |
After two years, even all channels combined don't reach the "minimum" criteria for monetization (4,000 watch hours and 1,000 subscribers). But it's very likely that they would have, if content had been uploaded for 365 days (instead of 2), along with consistent improvements.
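To see how far off the combined channels are, the table can be summed up in a few lines. The rows are copied from the table above. Reading the "Hours" column as watch time and the last column as subscribers is an assumption, as are the variable names.

```python
# (channel, watch hours, subscribers) copied from the table above.
CHANNEL_STATS = [
    ("SoKnow French", 90.0, 53),
    ("SoKnow Hindi", 345.0, 173),
    ("SoKnow English", 112.0, 37),
    ("SoKnow Korean", 23.0, 7),
    ("SoKnow Japanese", 155.0, 24),
    ("SoKnow Russian", 82.0, 14),
    ("SoKnow German", 18.3, 5),
    ("SoKnow Finnish", 7.6, 8),
    ("SoKnow Arabic", 2.8, 1),
]

# Monetization thresholds quoted in the post.
WATCH_HOURS_REQUIRED = 4000
SUBSCRIBERS_REQUIRED = 1000

total_hours = sum(hours for _name, hours, _subs in CHANNEL_STATS)
total_subs = sum(subs for _name, _hours, subs in CHANNEL_STATS)

print(f"watch hours: {total_hours:.1f} of {WATCH_HOURS_REQUIRED}")
print(f"subscribers: {total_subs} of {SUBSCRIBERS_REQUIRED}")
print("eligible:", total_hours >= WATCH_HOURS_REQUIRED and total_subs >= SUBSCRIBERS_REQUIRED)
```

Combined, that is roughly 836 watch hours and 322 subscribers, so about a fifth of the watch time and a third of the subscribers needed, from only two days of uploads.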
While I agree that the uploaded videos might not have added much value to the world, I'm pretty sure that some users did find them useful. We can confirm this with the number of Likes on the videos, and the watch time.
It's all about perspective.
Most people learn things through videos, and a lot of information that's in English is not readily available in other languages, for example look at the SoKnow Japanese Channel. Not only would the search engines fail at bringing information to those users through search -- discovering and translating the content is another major hurdle for users in different languages.
"Quality" content is subjective, and I can easily argue that the majority of content online today is not only useless, but harmful.
There were a lot of fundamental learnings from this little experiment. Some of them sound like "of course", but when you think about them deeply, probably not.
I'm happy I didn't go down the path of first learning how to build music with AI. It would've taken me a fairly long time.
It's very likely that the YouTube quota, and how limiting it is for uploading videos, would have escaped me for months, and I would have regretted spending that time making sub-par music files with AI.
Thanks to using Wikipedia text, I was able to identify this issue relatively early on.
If you've ever played Slither.io, or at least heard of it, note that the creator built it mostly alone, and declined to use Cloud services to host the game. He ran all of it from his own house on bare metal.
Most great engineers don't like dependencies, and there's a valid reason for that. The more dependencies you have, the more room there is for surprises.
Yes, it's worth it.
Before starting a project, always note the limitations of the services you're planning on using.
Note: The smarter thing would be to first write a design doc about what the project is, what the dependencies are, etc., but its efficacy for personal projects is questionable.
What I definitely recommend is creating a sort of checklist or breakdown for yourself, to make sure all grounds are covered. In my case, "quota limitation" was left out, thanks to my mind subconsciously convincing me about "50 UI uploads".
Even if YouTube had set the API quota to 50 uploads a day, they could still change it at any time.
YouTube is an independent product, continuously being improved, experimented with, and built upon.
If you're building a product that's a derivative of another product, it will always carry the big risk of failing at any time if the product you're relying on falters.
I hope you found this experiment interesting and it brings out some thoughts and ideas in your mind.
I'd love to hear your thoughts. Leave a comment below!
I originally published this blog at: https://sudcha.com/i-made-youtube-videos-using-python/ but sharing here for the love of the community!