Sudeep Chauhan

I Made Youtube Videos using Python

The first project I worked on after leaving my full time job was to create Youtube videos programmatically, at scale. This post explains what I did, and what happened next.

Would you rather ...

watch a video?

If you're better with visuals / audio, watch this video instead (it's better at 1.5x).

No?

Ok, let's go -

After leaving my full time job at Google and steady pay, I asked myself: what projects could I do that could potentially bring some passive income?

The project should be exciting enough for me to take a stab at, should help me learn something absolutely new, and make a great story regardless of its outcome.

For the love of Youtube

I had been thinking about Youtube as a platform for quite some time. First, it probably ranks number 1 in my list of favorite products.

Second, it provides hosting videos for FREE, to everyone.

The second point is huge, if you think about it, especially as an engineer.

The biggest cost of serving videos online is hosting them, and Youtube takes care of that for everyone, for free.

Not just that, if your videos are public and get views, it even pays you for them!

Music with AI?

When Tensorflow was announced, there was also an announcement of a project that could create Music with AI (Project Magenta).

This idea that AI could create music resonated with me a lot. The problem with this project was that while AI can create a lot of Music, most of it is average at best.

Even the most popular artists usually have only a few super viral songs, and with AI, that hit rate is a vanishingly small fraction.

Crowdsource Discovery of Good Music?

Let's just say that it would take AI 100,000 songs to come up with one great one.

Could we crowdsource the listening, putting all of this music in front of the world and letting listeners decide which song is the best?

I could create some decent videos of slideshows -- programmatically of course, for each of the music files generated, and upload them to Youtube for the world to figure it out.

If there's some traction, it would motivate me to spend more time tweaking the AI training model, as well as earn $$ :).

Baby Steps

Creating music with AI felt like a big task, so I thought, let's just do something basic.

Let's create videos of just some text converted to speech with slideshow, and upload it to Youtube programmatically. Yep, that's a good start. Based on how that goes, we can work on creating music with AI, instead of that text -> speech thingy.

To build an MVP of this process, I decided to use data from Wikipedia first. I could even incorporate live news into this concept!

Youtube allows uploading 50 videos daily. This means that if I had 10 channels, I could upload 500 videos a day. Take a minute to fathom that.

500 x 365 = 182,500 videos a year.

If each video gets just 10 views, that alone is over 1.8 million views a year. Very fascinating. WDYT?
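The back-of-the-envelope math above is easy to play with in a few lines of Python. This is only a sketch of the arithmetic in the post; the function name and the default of 10 views per video are placeholders, not measured values.

```python
# Rough projection using the numbers quoted in the post.
UPLOADS_PER_CHANNEL_PER_DAY = 50  # web UI limit mentioned above
CHANNELS = 10
DAYS_PER_YEAR = 365


def yearly_projection(channels=CHANNELS,
                      uploads_per_day=UPLOADS_PER_CHANNEL_PER_DAY,
                      days=DAYS_PER_YEAR,
                      views_per_video=10):
    """Return (videos per year, views per year) for the given assumptions."""
    videos = channels * uploads_per_day * days
    return videos, videos * views_per_video


videos, views = yearly_projection()
print(f"{videos:,} videos, {views:,} views")  # 182,500 videos, 1,825,000 views
```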

Project Sound of Knowledge

I called the project SoKnow (Sound of Knowledge). The other name I had considered was Sound of Gold, though I forget why, but SoKnow sounded cool.

Here's how it would work:

  1. Get Wikipedia trending queries (thanks to Wiki APIs!)
  2. Get the First Paragraph from each Query
  3. Check for Adult Content (important!)
  4. Convert text to Speech (I used Google's text-to-speech API)
  5. Find images available for commercial use on this topic
  6. Stitch the images to make a slide show along with the Audio
  7. Upload to Youtube using Youtube's API
  8. Drink lemonade and enjoy.
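Steps 1 and 2 of the list above could look roughly like this. The post only says "Wiki APIs", so the two endpoints used here (the Wikimedia pageviews "top" endpoint and the Wikipedia REST page summary endpoint) are my assumption about what would fit, and every function name is hypothetical. The sketch only builds URLs and parses already-fetched JSON, so it runs without network access.

```python
from urllib.parse import quote

# Assumed endpoints; the original post does not name the exact APIs it used.
TOP_URL = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
           "{project}/all-access/{year:04d}/{month:02d}/{day:02d}")
SUMMARY_URL = "https://{lang}.wikipedia.org/api/rest_v1/page/summary/{title}"

# Pages that show up in "most viewed" lists but are not real topics.
SKIPPED_PREFIXES = ("Special:", "Wikipedia:", "Portal:", "File:")


def top_articles_url(lang, year, month, day):
    """URL listing the most viewed articles for one day on one wiki."""
    return TOP_URL.format(project=f"{lang}.wikipedia",
                          year=year, month=month, day=day)


def summary_url(lang, title):
    """URL of the summary for one article (title is percent-encoded)."""
    return SUMMARY_URL.format(lang=lang, title=quote(title, safe=""))


def pick_topics(top_response, limit=50):
    """Pull article titles out of a parsed 'top' response, skipping non-articles."""
    topics = []
    for entry in top_response["items"][0]["articles"]:
        title = entry["article"]
        if title == "Main_Page" or title.startswith(SKIPPED_PREFIXES):
            continue
        topics.append(title)
        if len(topics) == limit:
            break
    return topics


def first_paragraph(summary_response):
    """Return the first paragraph of the plain-text extract, if any."""
    return summary_response.get("extract", "").split("\n", 1)[0].strip()
```

Fetching the URLs (with `urllib.request` or `requests`) is left out on purpose; the parsing functions take the decoded JSON as a plain dict.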

I chose Python to write this code in. Why? Because most libraries that I was going to use had the best support in Python. Also, I like Python.

Write the CODE

One step at a time, I wrote the code for the whole pipeline. Yes, the code was scrappy, but it worked.

I used simple API calls to get all the data, and ffmpeg to create the slideshow videos. While there was no parallel processing (no threads), the process was fast enough to create 50 videos in 10-15 minutes.
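For the slideshow step, one plausible way to drive ffmpeg from Python is to build the argument list and hand it to `subprocess.run`. The post does not show its actual ffmpeg invocation, so the flags below are a common recipe rather than the author's command, and the function name and the numbered-image naming scheme are assumptions.

```python
def slideshow_command(image_pattern, audio_path, output_path,
                      image_count, audio_seconds):
    """Build an ffmpeg command that spreads numbered images over the audio.

    image_pattern is an ffmpeg sequence pattern such as "slides/img%03d.jpg".
    """
    if image_count < 1 or audio_seconds <= 0:
        raise ValueError("need at least one image and a positive audio length")
    # Input frame rate chosen so that all images together last as long as the audio.
    input_fps = image_count / audio_seconds
    return [
        "ffmpeg", "-y",
        "-framerate", f"{input_fps:.4f}",
        "-i", image_pattern,
        "-i", audio_path,
        "-c:v", "libx264", "-r", "30", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",  # stop when the shorter of the two streams ends
        output_path,
    ]


cmd = slideshow_command("slides/img%03d.jpg", "speech.mp3", "out.mp4",
                        image_count=6, audio_seconds=30)
print(" ".join(cmd))
```

To actually render the video, pass the list to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.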

There were many bugs, of course. For example, the APIs would time out, or some special characters would break the sequence, but one by one I fixed them all. If no images were found for a particular topic, I would make it a black screen with the title's text on top of it.
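The two bug classes mentioned here (timeouts and special characters) are usually handled with a small retry wrapper and a title sanitizer. The following is a generic sketch of that idea, not the code from the project; both helper names are made up.

```python
import re
import time


def with_retries(func, attempts=3, base_delay=1.0,
                 retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call func(), retrying with exponential backoff on transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))


def safe_filename(title, max_length=80):
    """Turn an article title into a file name without shell-hostile characters.

    Letters and digits from any script are kept, so non-English titles survive.
    """
    cleaned = re.sub(r"[^\w.-]+", "_", title).strip("._")
    return cleaned[:max_length] or "untitled"


print(safe_filename("AC/DC: Back in Black?"))  # AC_DC_Back_in_Black
```

A call site would wrap each network request, for example `with_retries(lambda: fetch(url))`, where `fetch` is whatever HTTP helper the script uses.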

I also embedded the channel watermark in one corner.
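A watermark like this is typically added with ffmpeg's `overlay` filter. The sketch below builds such a command; the 10-pixel margin, the corner names, and the function name are my own choices, and the post does not say which filter the author used.

```python
def watermark_command(video_path, logo_path, output_path,
                      corner="bottom-right", margin=10):
    """Build an ffmpeg command that overlays a logo in one corner of a video."""
    # main_w/main_h are the video size, overlay_w/overlay_h the logo size.
    positions = {
        "top-left": f"{margin}:{margin}",
        "top-right": f"main_w-overlay_w-{margin}:{margin}",
        "bottom-left": f"{margin}:main_h-overlay_h-{margin}",
        "bottom-right": f"main_w-overlay_w-{margin}:main_h-overlay_h-{margin}",
    }
    if corner not in positions:
        raise ValueError(f"unknown corner: {corner}")
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", logo_path,
        "-filter_complex", f"overlay={positions[corner]}",
        "-c:a", "copy",  # leave the audio track untouched
        output_path,
    ]


print(" ".join(watermark_command("out.mp4", "logo.png", "final.mp4")))
```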

To upload the videos, I used Youtube's API from GCP. What was amazing was that I could also set the Description, Title and Keywords of each video through the API. Mind blown.
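With the `google-api-python-client` library, an upload with metadata looks roughly like the sketch below. It assumes you already have an authorized `youtube` client (built with `googleapiclient.discovery.build("youtube", "v3", credentials=...)`); the helper names, the default category ID "27" (which I believe is Education), and the privacy setting are assumptions for illustration, not the project's actual values.

```python
def build_video_body(title, description, tags,
                     category_id="27", privacy="public"):
    """Build the request body for videos.insert (keywords are called tags)."""
    return {
        "snippet": {
            "title": title[:100],  # Youtube titles are capped at 100 characters
            "description": description,
            "tags": list(tags),
            "categoryId": category_id,
        },
        "status": {"privacyStatus": privacy},
    }


def upload_video(youtube, video_path, body):
    """Upload one file and return the new video ID.

    Imported lazily so the body builder works without the client library.
    """
    from googleapiclient.http import MediaFileUpload

    media = MediaFileUpload(video_path, chunksize=-1, resumable=True)
    request = youtube.videos().insert(part="snippet,status",
                                      body=body, media_body=media)
    response = request.execute()
    return response["id"]


body = build_video_body("Python (programming language)",
                        "First paragraph from Wikipedia goes here.",
                        ["python", "wikipedia", "soknow"])
print(body["snippet"]["title"])
```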

All was well in test runs, and then I ran in production.

Youtube knows about this

After running the code, I started seeing bugs that I couldn't really understand. After some debugging, I realized the issue.

It turns out that the Youtube API has quota limits that are different from the web UI limits.

Most prominent for me was that the Youtube API only allowed ~3-4 video uploads a day, and not more than that, thanks to its quota limits.

I read their Quota Costs for API requests much later than I should have.

Everything has low quota usage except the "Video -> Insert" resource.

Youtube does allow 50 uploads through their User interface, but not through APIs.
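The quota works as a daily budget of units, with each API call costing a fixed number of them. A tiny calculator makes the ceiling obvious. The 1,600-unit cost for a video insert used below is the figure I recall from the published quota table and should be treated as an assumption; with it, the ceiling comes out a bit higher than the ~3-4 uploads the post reports, so the effective per-video cost in this setup was presumably higher. Either way it is nowhere near 50.

```python
def max_uploads_per_day(daily_quota, upload_cost, extra_cost_per_video=0):
    """Upper bound on uploads per day given a unit budget and per-video cost."""
    per_video = upload_cost + extra_cost_per_video
    if per_video <= 0:
        raise ValueError("per-video cost must be positive")
    return daily_quota // per_video


ASSUMED_INSERT_COST = 1_600  # assumption: units charged per videos.insert call

print(max_uploads_per_day(10_000, ASSUMED_INSERT_COST))  # 6
print(max_uploads_per_day(15_000, ASSUMED_INSERT_COST))  # 9
```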

Let's raise a consult!

Why didn't I think of reading this first? I was super mad and sad.

Of course, I also reached out to the Youtube team by raising a case. I wrote a big doc with as much convincing information as I could add. It would take them weeks to respond, which makes sense given their size.

Not that I was keen to, but after several weeks of no response, I did reach out to a friend who worked at Youtube at the time. As I had expected, it didn't really change anything. For the most part, Google is a merit-based company, and unless I was one of the early partners, big enough (think SocialBlade), or had good relations with someone higher up in management (VP level?), it was not going to happen.

I did try to play the "anti-competitive" song in my subsequent appeals, which I thought was clever, but it didn't work. Yes, yes, I know they are smart.



I'm innocent.


After several rounds of back and forth, they did increase my quota from 10,000 units a day to 15,000 units a day. To some degree that was fair, because it's not like my content was going to grow Youtube's user base, or improve the user experience somehow.

For a few days, I uploaded videos manually to Youtube, and then set their Title / Description and Keywords programmatically. But as you can guess, it wasn't with the same excitement as before.
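Setting metadata on an already uploaded video goes through the `videos.update` call, which is much cheaper in quota terms than an upload. Here is a minimal sketch, again assuming an authorized `youtube` client from `google-api-python-client`; the helper names and the default category ID are hypothetical.

```python
def build_update_body(video_id, title, description, tags, category_id="27"):
    """Request body for videos.update with part="snippet".

    The API expects title and categoryId to be present whenever the
    snippet is updated, so both are always included.
    """
    return {
        "id": video_id,
        "snippet": {
            "title": title[:100],
            "description": description,
            "tags": list(tags),
            "categoryId": category_id,
        },
    }


def update_metadata(youtube, body):
    """Send the update and return the API response."""
    return youtube.videos().update(part="snippet", body=body).execute()


print(build_update_body("VIDEO_ID", "Some title", "Some description", ["tag"]))
```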

What did the videos look like?

You can find all Channels with this youtube search query.

Here's a sample video.

What Happened to the Content?

Two years after I left the project, I came back to check on the videos. The metrics are fascinating to look at, so I'm sharing them below.

Note: I did nothing to grow these channels. All videos in these channels were the output of a simple Python script, except that I had to upload most of the videos manually and then update the metadata through code.

Metrics from Two Years

List of Channels

Channel           Views     Watch Time   Subscribers
SoKnow French     5.4K      90 Hours     53
SoKnow Hindi      21K       345 Hours    173
SoKnow English    26K       112 Hours    37
SoKnow Korean     4.2K      23 Hours     7
SoKnow Japanese   35.3K     155 Hours    24
SoKnow Russian    6.6K      82 Hours     14
SoKnow German     3.5K      18.3 Hours   5
SoKnow Finnish    7.4K      7.6 Hours    8
SoKnow Arabic     683       2.8 Hours    1
Total             110,000   836 Hours    322

Two years, and even all channels combined don't reach the "minimum" criteria for monetization (4,000 watch hours and 1,000 subscribers). But it's very likely that they would have, if the content had been uploaded for 365 days (instead of 2) along with consistent improvements.
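To see how far the combined channels are from those thresholds, here is the same table as a small Python calculation. The view counts are the rounded figures from the table, so the totals are approximate, and the variable names are mine.

```python
WATCH_HOURS_REQUIRED = 4_000
SUBSCRIBERS_REQUIRED = 1_000

# channel: (views, watch hours, subscribers), copied from the table above
CHANNEL_STATS = {
    "French": (5_400, 90.0, 53),
    "Hindi": (21_000, 345.0, 173),
    "English": (26_000, 112.0, 37),
    "Korean": (4_200, 23.0, 7),
    "Japanese": (35_300, 155.0, 24),
    "Russian": (6_600, 82.0, 14),
    "German": (3_500, 18.3, 5),
    "Finnish": (7_400, 7.6, 8),
    "Arabic": (683, 2.8, 1),
}


def totals(stats):
    """Sum views, watch hours and subscribers across all channels."""
    views = sum(v for v, _, _ in stats.values())
    hours = sum(h for _, h, _ in stats.values())
    subs = sum(s for _, _, s in stats.values())
    return views, hours, subs


total_views, total_hours, total_subs = totals(CHANNEL_STATS)
eligible = (total_hours >= WATCH_HOURS_REQUIRED
            and total_subs >= SUBSCRIBERS_REQUIRED)

print(f"views: {total_views:,}")
print(f"watch hours: {total_hours / WATCH_HOURS_REQUIRED:.1%} of the threshold")
print(f"subscribers: {total_subs / SUBSCRIBERS_REQUIRED:.1%} of the threshold")
print(f"eligible: {eligible}")
```

Combined, that is roughly a fifth of the required watch hours and a third of the required subscribers.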

This is not adding value!

While I agree that the videos uploaded might not have added much value to the world, I'm pretty sure that some users did find them useful. We can confirm this with the number of Likes on the videos, and the watch time.



It's all about perspective.

Most people learn things through videos, and a lot of information that's available in English is not readily available in other languages; for example, look at the SoKnow Japanese channel. Not only do search engines fail to bring this information to those users, but discovering and translating the content themselves is another major hurdle for users in different languages.

"Quality" content is subjective, and I can easily argue that the majority of content online today is not only useless, but harmful.

Some Learnings from this Experiment

There were a lot of fundamental learnings from this little experiment. Some of them seem obvious ("of course") when you think about them deeply; others, probably not.

Break Down the Grand Vision into Mini Ideas

I'm happy I didn't go down the path of first learning how to build music with AI. It would've taken me a fairly long time.

It's very likely that the Youtube quota, and how limiting it is for uploading videos, would have escaped me for months, and I would have regretted spending that time making sub-par music files with AI.

Thanks to using Wikipedia text, I was able to identify this issue relatively early on.

Limitations of Dependencies

If you've ever played Slither.io, or at least heard of it, note that the creator built it mostly alone, and declined to use Cloud services to host the game. He ran all of it from his own house on bare metal.

Most great engineers don't like dependencies, and there's a valid reason for that. The more dependencies you have, the more places there are for surprises.



Yes, it's worth it.

Before starting a project, always note the limitations of the services you're planning on using.

Note: The smarter thing would be to first write a design doc covering what the project is, what its dependencies are, and so on, but the efficacy of design docs for personal projects is questionable.

What I definitely recommend is creating a checklist or breakdown for yourself, to make sure all grounds are covered. In my case, "quota limitation" was left out, thanks to my mind subconsciously fixating on the "50 UI uploads".

Beware of Derivative Products

Even if Youtube had set the API quota to 50 uploads a day, they could still change it at any time.

Youtube is an independent product, continuously being improved, experimented with and built upon.

If you're building a product that's a derivative of another product, it will always carry the big risk of failing at any time if the product you're relying on changes or falters.

Was it interesting?

I hope you found this experiment interesting and that it sparks some thoughts and ideas in your mind.

I'd love to hear your thoughts. Leave a comment below!


I originally published this post at https://sudcha.com/i-made-youtube-videos-using-python/ and am sharing it here for the love of the community!
