Article originally posted on my blog.
This article is about how you can add blink detection in your project using TensorFlow.js. As an example, we will toggle the dark mode on a page. It's just a fun project to get started with Machine Learning and TensorFlow.js. Here is the result.
What should you know before starting?
Well, nothing as such. In simple terms, you can think of TensorFlow as a library that infers patterns from data and identifies those patterns when they occur again. You can either use pre-trained models or train your own models with Teachable Machine.
Let's get started and set up a TensorFlow.js project
- Load model: You need to load the model before you can use it. In this case, we are using the face landmarks detection model.
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';
import * as tf from '@tensorflow/tfjs-core';
import '@tensorflow/tfjs-backend-webgl';

let model;

const loadModel = async () => {
  // Use the WebGL backend for in-browser inference.
  await tf.setBackend('webgl');
  model = await faceLandmarksDetection.load(
    faceLandmarksDetection.SupportedPackages.mediapipeFacemesh,
    { maxFaces: 1 } // we only need to track a single face
  );
};
- Set up camera: To detect the face, the model needs a video stream. We will create a video element and pass its stream to the model to estimate the facial features.
let video;

const setUpCamera = async (videoElement) => {
  video = videoElement;

  // Prefer the built-in webcam if one can be found.
  const mediaDevices = await navigator.mediaDevices.enumerateDevices();
  const defaultWebcam = mediaDevices.find(
    (device) =>
      device.kind === 'videoinput' && device.label.includes('Built-in')
  );
  const cameraId = defaultWebcam ? defaultWebcam.deviceId : undefined;

  const stream = await navigator.mediaDevices.getUserMedia({
    audio: false,
    video: {
      facingMode: 'user',
      deviceId: cameraId,
      width: 500,
      height: 500,
    },
  });

  video.srcObject = stream;
  video.play();
  video.width = 500;
  video.height = 500;
};
- Start estimation: For estimation, we have to continuously monitor the face and keep checking whether the user is blinking, so we run the prediction inside a requestAnimationFrame loop.
const renderPrediction = async () => {
  const predictions = await model.estimateFaces({
    input: video,
    returnTensors: false,
    flipHorizontal: false,
    predictIrises: true, // also estimate iris keypoints
  });
  // ...check the predictions for blinks here...
  requestAnimationFrame(renderPrediction); // keep estimating, frame after frame
};
The returned value is an array of objects with keys like faceInViewConfidence, boundingBox, mesh, scaledMesh, and annotations. The facemesh keypoints can be obtained from the annotations key.
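For example, the eye keypoints can be read straight from the annotations. Here is a minimal sketch, assuming at least one face was detected (annotation names like rightEyeUpper0 come from the mediapipe facemesh annotation set):

if (predictions.length > 0) {
  const face = predictions[0];
  // Each annotation is an array of [x, y, z] keypoints.
  const upperEyelid = face.annotations.rightEyeUpper0;
  const lowerEyelid = face.annotations.rightEyeLower0;
  console.log(upperEyelid[0], lowerEyelid[0]);
}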
These are the common steps you will need to set up a TensorFlow.js project with the face landmarks detection model. If all goes well, the browser should ask for camera permission when you run the project, and once the model is loaded, it starts predicting!
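Putting the pieces together, the setup could look something like this (a sketch; the video element id is an assumption):

// A sketch of the overall wiring; the #video element id is an assumption.
const main = async () => {
  await loadModel();
  await setUpCamera(document.getElementById('video'));
  renderPrediction(); // start the estimation loop
};

main();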
Adding blink detection
With the keypoints, we can calculate the Eye Aspect Ratio (EAR) with the following formula.
EAR = (||p2 - p6|| + ||p3 - p5||) / (2 * ||p1 - p4||)
Here, ||d|| represents the Euclidean distance, which is essentially the length of the line segment between two points. The points p1, p2, and so on are mapped on the facemesh in the image.
So, when the person blinks, the EAR drops close to zero. To detect a blink, the EAR has to fall below a threshold that is close to zero, so I used an EAR threshold that works for varying distances between the user and the camera. Once the EAR drops below the threshold, we know the eye has blinked. We have separate keypoints for each eye, so we can detect both eyes' blinks independently.
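Here is a minimal sketch of that calculation, assuming the scaledMesh keypoints from the prediction. The eye indices and the threshold value below are illustrative assumptions, not the exact ones I shipped:

// Euclidean distance between two [x, y, z] keypoints (z ignored here).
const distance = (a, b) =>
  Math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2);

// EAR = (||p2 - p6|| + ||p3 - p5||) / (2 * ||p1 - p4||)
const getEAR = (mesh, [p1, p2, p3, p4, p5, p6]) =>
  (distance(mesh[p2], mesh[p6]) + distance(mesh[p3], mesh[p5])) /
  (2 * distance(mesh[p1], mesh[p4]));

// Commonly used facemesh indices for one eye (illustrative assumption).
const RIGHT_EYE = [33, 160, 158, 133, 153, 144];
const EAR_THRESHOLD = 0.23; // found by trial and error, as described above

const mesh = predictions[0].scaledMesh;
const isBlinking = getEAR(mesh, RIGHT_EYE) < EAR_THRESHOLD;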
To the dark side
Some time back, I read about an easy way to force dark mode using the CSS filter property. Coupling it with the color-scheme property, we can add dark mode to a majority of pages. So, I would add these CSS properties to the root HTML element.
filter: invert(1) hue-rotate(180deg);
color-scheme: dark;
Note: There are some caveats to this method, but it works well enough for us to learn the basics. And hey, you get dark mode with two lines of CSS, which I think is pretty cool.
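Toggling it from JavaScript then becomes trivial. A sketch, assuming the two properties above live in a dark-mode class:

// A sketch: toggle a class that carries the two CSS properties above.
// The dark-mode class name is an assumption.
const toggleDarkMode = () => {
  document.documentElement.classList.toggle('dark-mode');
};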
Blink Detection
To be able to reuse the logic, I created an NPM package named Blink Detection, which does the calculations and returns the result.
import blink from 'blink-detection';
const blinkPrediction = await blink.getBlinkPrediction();
Here, the blinkPrediction will be an object with the keys blink, wink, longBlink, left, right, and rate. Each key except rate will be a boolean value.
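Plugged into a loop like the earlier one, a blink can toggle the dark mode. A sketch; in practice you would debounce this, since a single blink spans several frames:

// A sketch: toggle dark mode on every detected blink.
const detect = async () => {
  const prediction = await blink.getBlinkPrediction();
  if (prediction.blink) {
    toggleDarkMode(); // from the CSS trick above
  }
  requestAnimationFrame(detect);
};

detect();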
Challenges
With the face landmarks detection model, you will need to manually plot the facemesh keypoints from the data readings to work out which point is which. There are a lot of annotations that can be used, but I could not find any source explaining them.
To decide the EAR (Eye Aspect Ratio) threshold, I had to try different values and use the one that works best across different distances between the user and the camera. So, it is not bullet-proof.
More ideas with blink detection
- Chrome extension: You can create a Chrome extension from the same demo. Another extension idea is to gauge the eye-blink rate and use it for different purposes.
- Blink rate: There have been numerous studies on how blinking patterns can measure or appraise certain aspects of an individual or a situation.
- Wink detection: Similar to blinking, wink detection can also be used to trigger actions.
Wrapping up
That's it! Overall, this project wasn't really about toggling dark mode. I wanted to play around with TensorFlow and create something while doing it! Also, I could not find any project related to blink detection in JavaScript, so I developed the package. Check out the demo and the code.
theankurkedia/blink-detection: Detect the user's blink and wink using machine learning.
References
I found the following resources helpful. If you want to experiment and create something similar, you can start here.
- Pre-trained TensorFlow.js models for different use cases.
- If you are using the face landmarks detection model, the facemesh keypoints can help you map the points needed for your use case.
- This amazing project was a great starting point. Honestly, I just cloned the repo and started experimenting.
- Teachable Machine to train your own models.
Thanks for reading. Hope you found it helpful! Happy coding!