A few weeks back I decided I wanted a cartoon version of my profile picture.
Sadly, as I'm not much of an artist, drawing it myself was out of the question. So I set about the only other logical course of action...
I trained a neural network to do it for me!
> So, I trained a neural network to turn portrait shots of real people into Archer style cartoons.
>
> 👉 Also... I made it into a bot!
>
> Follow then tweet me with "cartoonify me" in the text, and in a few minutes you'll get a reply with a cartoonish version of your profile pic 😁
>
> — Harrison Reid (@harrison_g_reid), 09:26 AM · 14 May 2020
Let's get this out of the way - if you'd like a cartoon version of your profile picture, here's how:
Tweet at me (@harrison_g_reid) and include "cartoonify me" somewhere in the text of the tweet.
Within a few minutes, my (hopefully?) dependable twitter bot will reply with a cartoon-ified version of your profile picture.
I should warn you, the results are... mixed. But it's funnier when it does a terrible job anyway, so 🤷‍♂️. Here's a GIF demo:
Read on if you want to learn how I built it...
The first thing I did after deciding that this would be a fun project was google around to see if there were any existing libraries I could use.
As usual, the open source world did not disappoint! I soon came across a TensorFlow implementation of CartoonGAN, a generative adversarial network.
The GitHub repo has some really cool examples of converting images and GIFs into anime style cartoons - I recommend checking it out.
But this wasn't quite the style I was after. I wanted something a little more comic-book style - heavy black lines & flat colors. I wanted it to look like Archer!
Fortunately, the repo contains some pretty detailed instructions on how to train the network on your own training data.
So, I set about gathering a lot of images.
To train CartoonGAN, I would need two sets of images:
A large set of real-life images of human faces.
An equally large set of cartoon faces (from Archer).
It was relatively easy to find a good dataset of human faces: the VGGFace2 dataset, which is enormous and far exceeded my needs.
Of course, there's no dataset of faces from Archer available, so I'd need to create my own.
Since I was aiming for a dataset of about 3500 images, there was no way I could realistically do this manually.
It took a little creativity, but I managed to mostly automate this. It basically ended up as a four-stage process:
1. Using ffmpeg, extract a frame for every 4 seconds of video, for every episode of the first season of Archer. (If you're interested, the ffmpeg command to do this for a single video is: `ffmpeg -i video.mov -r 0.25 video-images/%04d.png`.)
2. Detect the location of all the faces in every frame using facedetect. Yes, this works surprisingly well on cartoon faces!
3. Crop images for each located face using Jimp.
4. Manually check the extracted images, and remove anything that had incorrectly been identified as a face.
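As a rough sketch of stages 2 and 3: facedetect prints one `x y w h` line per detected face (at least in the builds I've used), and those boxes want a bit of padding before cropping so you don't clip the chin and hair. The helper names and the padding factor below are my own, not from any library:

```javascript
// Parse facedetect stdout into an array of {x, y, w, h} boxes.
// Assumes facedetect's default output: one "x y w h" line per face.
function parseFaces(stdout) {
  return stdout
    .trim()
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => {
      const [x, y, w, h] = line.split(/\s+/).map(Number);
      return { x, y, w, h };
    });
}

// Expand a face box by `pad` (a fraction of its size) on each side,
// clamped so the crop stays inside the frame.
function cropBox({ x, y, w, h }, frameW, frameH, pad = 0.25) {
  const px = Math.round(w * pad);
  const py = Math.round(h * pad);
  const left = Math.max(0, x - px);
  const top = Math.max(0, y - py);
  return {
    left,
    top,
    width: Math.min(frameW - left, w + 2 * px),
    height: Math.min(frameH - top, h + 2 * py),
  };
}

const faces = parseFaces('120 80 64 64\n300 200 96 96\n');
console.log(faces.length); // → 2
console.log(cropBox(faces[0], 1280, 720)); // → { left: 104, top: 64, width: 96, height: 96 }
```

In the real pipeline a box like this would be handed straight to Jimp's `crop` method; keeping the geometry in pure functions just makes it easy to test.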
The end result was a set of ~3700 images: just about every face from the first season of Archer.
Now we're talking.
This was the easy part - it basically involved cloning the CartoonGAN repo mentioned above, copying the images to the correct directory and running the Python script as per the instructions in the repo.
It was a real workout for my computer though - it took several days running the training in the background to make it through 30 epochs of training.
Here's a GIF of the training progress over the first few epochs.
In any case, the Node APIs for TensorFlow.js let you directly load the model format output by the CartoonGAN training process (SavedModel format).
Voilà! A cartoon-generating neural network running on Node.
If you're interested in how I've deployed the model as a twitter bot, stay tuned! I'll provide a walkthrough in a future post.
Note: The code for this isn't yet available on my Github, but will be made available soon.