A few days ago I posted an article on “Detecting Facial Features with Python”, and I got many questions on Twitter asking how to do the same with JavaScript. Today we are going to answer that, and we will add some extras along the way, like masking your face with a Spiderman filter, or the classic dog filter. It has been super fun to work on this project, and I hope you enjoy it.
The article will cover two main topics:
- Face features recognition
- Adding filters
How do we detect facial features?
Similarly to how DLib works in Python, for JavaScript we have a library called clmtrackr which does the heavy lifting of detecting where the face is in an image, and also identifies facial features such as the nose, mouth, eyes, etc.
This library provides some generic pre-trained models which are ready to use, with the facial features numbered as follows:
When we process an image with the library, it will return an array with an entry for each of the points on that map, where each point is identified by its position on the x and y axes. This will turn out to be very important when we are building the filters. As you can probably already guess, if we want to draw something replacing the person’s nose, we can use point 62, which is the center of the nose.
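To make the indexing concrete, here is a tiny sketch of how that array is read. The coordinates are mocked here, since the real values come from the tracker:

```javascript
// The tracker returns an array of [x, y] pairs indexed by feature number.
// We mock the data here; in the app it comes from getCurrentPosition().
const positions = [];
positions[62] = [320, 240]; // mock coordinates for the center of the nose

const [noseX, noseY] = positions[62]; // the point where a nose filter would be drawn
```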
But enough theory, let’s start working on something cool!
What are we building?
In this article, we will make use of clmtrackr to identify faces on a video stream (in our case a webcam or camera) and apply custom filters that can be selected from a dropdown on the screen. Here is the demo of the app on CodePen (please make sure you allow the app to access the camera in your browser, otherwise it won’t work):
Awesome! It may not be perfect but looks amazing!
Let’s break the code down and explain what we are doing.
Basic code structure
To build the app we are using the p5.js library, a JavaScript library designed mainly for working with the canvas, which fits our use case perfectly. p5.js is not your traditional UI library; instead, it works with events that define when to build the UI and when to update it, similar to how some game engines work.
There are 3 main events from p5 which I want to cover:
- preload: executed right after the library loads and before building any UI or drawing anything on the screen, which makes it perfect for loading assets.
- setup: also executed once, right after preload, and it is where we prepare everything and build the initial UI.
- draw: a function called in a loop, executed every time the system needs to render the screen.
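As a quick sketch, the skeleton of a p5 sketch using these three events looks like this. In the browser, p5 finds these global functions by name and calls them automatically, so the names must match exactly:

```javascript
// Minimal p5.js sketch skeleton: p5 looks up these global functions by name
// and calls them at the right time. Bodies are left empty on purpose.
function preload() {
  // runs first: load images, fonts, or data here; p5 waits for it to finish
}

function setup() {
  // runs once after preload(): create the canvas and build the initial UI
}

function draw() {
  // runs in a loop (about 60 times per second by default): render each frame
}
```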
Preload
As the name suggests, we will use the preload event to load the images that we will be using later in the code, as follows:
function preload() {
// Spiderman Mask Filter asset
imgSpidermanMask = loadImage("https://i.ibb.co/9HB2sSv/spiderman-mask-1.png");
// Dog Face Filter assets
imgDogEarRight = loadImage("https://i.ibb.co/bFJf33z/dog-ear-right.png");
imgDogEarLeft = loadImage("https://i.ibb.co/dggwZ1q/dog-ear-left.png");
imgDogNose = loadImage("https://i.ibb.co/PWYGkw1/dog-nose.png");
}
Very simple. The loadImage function from p5, as you may expect, loads the image and makes it available as a p5 Image object.
Setup
Here things get a bit more interesting, as this is where we build the UI. We will break down the code executed in this event into 4 parts.
Creating the canvas
As we want our code to be responsive, our canvas will have a dynamic size, calculated from the window size using an aspect ratio of 4:3. It’s not ideal to hardcode the aspect ratio like that, but we will make some assumptions to keep the code concise for the demo. Once we know the dimensions for our canvas, we can create one with the p5 function createCanvas, as shown next.
const maxWidth = Math.min(windowWidth, windowHeight);
pixelDensity(1);
outputWidth = maxWidth;
outputHeight = maxWidth * 0.75; // 4:3
createCanvas(outputWidth, outputHeight);
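The sizing logic is plain arithmetic, so it can be checked outside of p5. Here it is extracted as a small pure function (in p5, windowWidth and windowHeight are globals; here they become parameters):

```javascript
// The 4:3 canvas sizing from setup(), extracted as a pure function.
function canvasSize(windowWidth, windowHeight) {
  const maxWidth = Math.min(windowWidth, windowHeight);
  return { width: maxWidth, height: maxWidth * 0.75 }; // height = width * 3/4
}

const size = canvasSize(1280, 720);
console.log(size); // { width: 720, height: 540 }
```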
Capturing the video stream
After we have our canvas working, we need to capture the video stream from the webcam or camera and place it into the canvas. Fortunately, p5 makes this very easy with the createCapture function.
// webcam capture
videoInput = createCapture(VIDEO);
videoInput.size(outputWidth, outputHeight);
videoInput.hide();
Building the filter selector
Our app can provide more than one filter, so we need to build a way to select which filter we want to activate. Again… we could get really fancy here, but for simplicity we will use a plain dropdown, which we can create with the p5 createSelect() function.
// select filter
const sel = createSelect();
const selectList = ['Spiderman Mask', 'Dog Filter']; // list of filters
sel.option('Select Filter', -1); // Default no filter
for (let i = 0; i < selectList.length; i++)
{
sel.option(selectList[i], i);
}
sel.changed(applyFilter);
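The applyFilter callback itself is not shown above; a minimal version (an assumption on my part, matching the string values that draw switches on later) could simply store the dropdown’s current value:

```javascript
// Hypothetical applyFilter callback: store the dropdown selection so that
// draw() can switch on it. `sel` is the p5 select element created above;
// sel.value() returns the selected option's value as a string.
let selected = '-1'; // default: no filter selected

function applyFilter() {
  selected = sel.value();
}
```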
Creating the image tracker
The image tracker is an object that attaches to a video feed and, for each frame, identifies the face and its features. The tracker needs to be set up only once for a given video source.
// tracker
faceTracker = new clm.tracker();
faceTracker.init();
faceTracker.start(videoInput.elt);
Drawing the video and filters
Now that everything is set up, we need to update our draw event from p5 to output the video source to the canvas and apply any filter which is selected. In our case the draw function will be very simple, pushing the complexity into each filter definition.
function draw() {
image(videoInput, 0, 0, outputWidth, outputHeight); // render video from webcam
// apply filter based on choice
switch(selected)
{
case '-1': break;
case '0': drawSpidermanMask(); break;
case '1': drawDogFace(); break;
}
}
Building the Spiderman mask filter
Building filters can be an easy or a very complex task, depending on what the filter is supposed to do. For the Spiderman mask, we simply need to place the Spiderman mask image at the center of the face. To do that, we first make sure our faceTracker object actually detected a face by using faceTracker.getCurrentPosition().
Once we have our face detected, we use p5 to render the image centered on face point 62, which is the center of the nose, with a width and height that represent the size of the face, as follows.
const positions = faceTracker.getCurrentPosition();
if (positions !== false)
{
push();
const wx = Math.abs(positions[13][0] - positions[1][0]) * 1.2; // width: distance between the sides of the face, with some margin
const wy = Math.abs(positions[7][1] - Math.min(positions[16][1], positions[20][1])) * 1.2; // height: distance from the eyebrows to the chin, with some margin
translate(-wx/2, -wy/2);
image(imgSpidermanMask, positions[62][0], positions[62][1], wx, wy); // Show the mask at the center of the face
pop();
}
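The sizing math above is easy to check in isolation. Here it is extracted as a pure function and fed a mock positions array (the point indices follow clmtrackr’s face model, and the coordinates are made up for illustration):

```javascript
// The mask-sizing math from drawSpidermanMask(), extracted as a pure
// function so it can be checked with mock data. `positions` is the array
// returned by faceTracker.getCurrentPosition(): positions[i] is [x, y].
function maskBounds(positions) {
  const wx = Math.abs(positions[13][0] - positions[1][0]) * 1.2;
  const wy = Math.abs(positions[7][1] - Math.min(positions[16][1], positions[20][1])) * 1.2;
  return { wx, wy };
}

// Mock face: 100px wide, 120px from eyebrows to chin
const mock = [];
mock[1] = [100, 200];   // right side of the face
mock[13] = [200, 200];  // left side of the face
mock[7] = [150, 300];   // chin
mock[16] = [140, 180];  // eyebrow point
mock[20] = [160, 190];  // eyebrow point
console.log(maskBounds(mock)); // roughly { wx: 120, wy: 144 }
```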
Pretty cool right?
The dog filter works the same way, but using 3 images instead of one: one for each ear and one for the nose. I won’t bore you with more of the same code, but if you want to check it out, review the CodePen, which contains the full code for the demo.
Conclusion
With the help of JavaScript libraries, it is very easy to identify facial features and start building your own filters. There are a few considerations, though, that we did not cover in this tutorial. For example, what happens if the face is not facing the camera straight on? How do we distort our filters so that they follow the curvature of the face? Or what if we want to add 3D objects instead of 2D filters?
I know many of you will play with it and build some cool things. I’d love to hear what you built, and if you can, please also share your examples with me. You can always reach me on Twitter.
Thanks for reading!
If you like the story, please don't forget to subscribe to our free newsletter so we can stay connected: https://livecodestream.dev/subscribe