At Gling I faced the challenge of rendering captions with a transparent background from the browser Canvas to a video file. I already had the code that paints the captions onto the Canvas frame by frame (for playback purposes), but we needed to render and encode those frames into a file the user could actually use.
Using WebCodecs (doesn't support a transparent background)
Initially, I tried using the new WebCodecs API: passing each Canvas image to a VideoFrame, encoding the frames with a VideoEncoder configured for VP9 (a codec that supports transparency), and finally muxing the video stream into a webm file using webm-muxer.
That worked great (and was relatively fast), but then I discovered that even though the VP9 codec supports an alpha channel (for the transparent background), alpha is not implemented in the WebCodecs API (yet?).
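For reference, that first attempt looked roughly like the sketch below. This is a simplified reconstruction rather than the production code; it assumes webm-muxer's documented Muxer / ArrayBufferTarget API and the paintCaptionFrame helper that appears later in this post, and it produces a webm, just without the transparency.
// A rough sketch of the WebCodecs attempt (reconstructed, not the shipped code)
import { Muxer, ArrayBufferTarget } from 'webm-muxer';

async function encodeWithWebCodecs(
  seconds: number,
  framerate: number,
  width: number,
  height: number,
) {
  const muxer = new Muxer({
    target: new ArrayBufferTarget(),
    video: { codec: 'V_VP9', width, height, frameRate: framerate },
  });

  const encoder = new VideoEncoder({
    output: (chunk, meta) => muxer.addVideoChunk(chunk, meta),
    error: (e) => console.error(e),
  });
  encoder.configure({
    codec: 'vp09.00.10.08',
    width,
    height,
    framerate,
    // The spec defines alpha: 'keep', but Chrome discards the alpha
    // channel regardless - which is exactly the problem described above.
    alpha: 'keep',
  });

  const canvas = new OffscreenCanvas(width, height);
  const ctx = canvas.getContext('2d', { alpha: true })!;

  for (let i = 1; i <= seconds * framerate; i++) {
    ctx.clearRect(0, 0, width, height);
    paintCaptionFrame(ctx, i);
    const frame = new VideoFrame(canvas, { timestamp: (i * 1e6) / framerate });
    encoder.encode(frame, { keyFrame: i % 150 === 1 });
    frame.close();
  }

  await encoder.flush();
  muxer.finalize();
  return new Blob([muxer.target.buffer], { type: 'video/webm' });
}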
So I had to use FFmpeg to process the raw frames into a codec and container format that supports an alpha channel.
But how do you transfer Canvas bitmap data to FFmpeg for encoding and muxing?
Part 1 - Creating the frame image data
Assuming I need to create a 60-second video at 30 frames per second, that means I need to create 1,800 frames. For each frame, I paint the right captions to the canvas and convert it to raw image data.
// renderer.ts
async function createFramesData(
  seconds: number,
  framerate: number,
  dimensions: { width: number, height: number },
  onFrame: (data: Uint8ClampedArray) => Promise<unknown>
) {
  const frames = seconds * framerate; // e.g. 60 * 30 = 1,800 frames
  const canvas = new OffscreenCanvas(dimensions.width, dimensions.height);
  const ctx = canvas.getContext('2d', {
    alpha: true,
    willReadFrequently: true,
  }) as OffscreenCanvasRenderingContext2D;
  for (let i = 1; i <= frames; i++) {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    // The function that paints to the canvas by the frame number
    paintCaptionFrame(ctx, i);
    // Raw RGBA bytes for the whole canvas
    const rawFrameImage = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
    await onFrame(rawFrameImage);
  }
}
Part 2 - Sending raw frames to FFmpeg
Now comes the interesting part. We need to listen to the onFrame callback that delivers the raw image data and pipe it into an FFmpeg process. FFmpeg does the magic: it ingests the raw frames, encodes them with whatever codec we choose (one that supports an alpha channel), and muxes them into a file.
// renderer.ts
const duration = 60;
const framerate = 30;
const dimensions = {
  width: 1920,
  height: 1080,
};
const processId = await startFFmpeg(framerate, dimensions);
await createFramesData( // The function from above
  duration,
  framerate,
  dimensions,
  (data) => {
    return pipeFramesToFFmpegProcess(processId, data);
  }
);
await closeFFmpegPipe(processId);
To start the FFmpeg process, we make an IPC call to the Electron Node.js backend, which spawns FFmpeg as a child process.
To understand Electron's IPC mechanism, there is an official Electron guide here: https://www.electronjs.org/docs/latest/tutorial/ipc
In short, we need handlers in the Node.js backend that listen for IPC calls and execute FFmpeg, pipe data into the process, and close the process's standard input. Since the renderer needs the results back (the pid, and a promise that resolves per write), the handlers use ipcMain.handle and the renderer calls them with ipcRenderer.invoke.
In the Node.js Electron backend (main.ts):
// main.ts
import { ipcMain } from 'electron';
import { ChildProcess, execFile } from 'child_process';

const childProcesses = new Map<number, ChildProcess>();

// ipcMain.handle (paired with ipcRenderer.invoke on the other side) lets the
// renderer await the returned value - here, the child process pid.
ipcMain.handle('exec', (_, executable: string, args: (string | number)[]) => {
  const child = execFile(
    executable,
    args.map((arg) => arg.toString()),
    { maxBuffer: 100 * 1024 * 1024 },
  );
  childProcesses.set(child.pid!, child);
  return child.pid;
});

ipcMain.handle('pipeIn', (_, pid: number, data: Uint8Array | Uint8ClampedArray) => {
  return new Promise<void>((res, rej) => {
    const child = childProcesses.get(pid);
    if (!child?.stdin) {
      rej(new Error('No child process found with pid ' + pid));
      return;
    }
    // Wrap the received bytes in a Buffer and write them to FFmpeg's stdin
    child.stdin.write(Buffer.from(data.buffer, data.byteOffset, data.byteLength), (error) => {
      if (error) {
        rej(error);
      } else {
        res();
      }
    });
  });
});

ipcMain.handle('endPipeIn', (_, pid: number) => {
  return new Promise<void>((res, rej) => {
    const child = childProcesses.get(pid);
    if (!child?.stdin) {
      rej(new Error('No child process found with pid ' + pid));
      return;
    }
    // Closing stdin tells FFmpeg there are no more frames coming
    child.stdin.end(() => {
      res();
      childProcesses.delete(pid);
    });
  });
});
Then expose the IPC calls to the renderer using a preload file:
// preload.ts
import { contextBridge, ipcRenderer } from 'electron';

contextBridge.exposeInMainWorld('api', {
  exec: (executable: string, args: (string | number)[]) =>
    ipcRenderer.invoke('exec', executable, args),
  pipeIn: (pid: number, data: Uint8ClampedArray) =>
    ipcRenderer.invoke('pipeIn', pid, data),
  endPipeIn: (pid: number) => ipcRenderer.invoke('endPipeIn', pid),
});
Part 3 - Connecting everything
Now we can execute FFmpeg, and send the raw frames to the FFmpeg process for encoding and muxing.
// renderer.ts
const duration = 60;
const framerate = 30;
const dimensions = {
  width: 1920,
  height: 1080,
};
const processId = await startFFmpeg(framerate, dimensions);
await createFramesData( // The function from above
  duration,
  framerate,
  dimensions,
  (data) => {
    return pipeFramesToFFmpegProcess(processId, data);
  }
);
await closeFFmpegPipe(processId);
async function createFramesData(
  seconds: number,
  framerate: number,
  dimensions: { width: number, height: number },
  onFrame: (data: Uint8ClampedArray) => Promise<unknown>
) {
  const frames = seconds * framerate; // e.g. 60 * 30 = 1,800 frames
  const canvas = new OffscreenCanvas(dimensions.width, dimensions.height);
  const ctx = canvas.getContext('2d', {
    alpha: true,
    willReadFrequently: true,
  }) as OffscreenCanvasRenderingContext2D;
  for (let i = 1; i <= frames; i++) {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    // The function that paints to the canvas by the frame number
    paintCaptionFrame(ctx, i);
    // Raw RGBA bytes for the whole canvas
    const rawFrameImage = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
    await onFrame(rawFrameImage);
  }
}
function startFFmpeg(
  framerate: number,
  dimensions: { width: number, height: number }
) {
  return window.api.exec('ffmpeg', [
    // Input: raw RGBA frames read from stdin ('-i -')
    '-f',
    'rawvideo',
    '-pix_fmt',
    'rgba',
    '-s',
    `${dimensions.width}x${dimensions.height}`,
    '-r',
    framerate,
    '-i',
    '-',
    // Output: VP9 with an alpha plane (the 'a' in yuva420p), muxed into a webm
    '-c:v',
    'libvpx-vp9',
    '-pix_fmt',
    'yuva420p',
    'out.webm',
  ]);
}
function pipeFramesToFFmpegProcess(pid: number, data: Uint8ClampedArray) {
  return window.api.pipeIn(pid, data);
}

function closeFFmpegPipe(pid: number) {
  return window.api.endPipeIn(pid);
}
And that's all. At the end you will have a webm file containing the frames rendered by the canvas, transparency included.
Some things are missing here (like waiting for the FFmpeg process to complete); I removed them to keep the example as simple as possible.
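If you do need to wait for FFmpeg to finish (so you only hand the file to the user once it is fully written), one option is an extra handler that resolves on the child process's 'close' event. This is a hypothetical addition on top of the handlers above, not part of the original code:
// main.ts - hypothetical 'waitForExit' handler
ipcMain.handle('waitForExit', (_, pid: number) => {
  return new Promise<number>((res, rej) => {
    const child = childProcesses.get(pid);
    if (!child) {
      rej(new Error('No child process found with pid ' + pid));
      return;
    }
    // 'close' fires after the process exits and its stdio streams are closed
    child.once('close', (code) => res(code ?? -1));
  });
});
Expose it from the preload like the other calls, and in the renderer call await window.api.waitForExit(processId) right after closeFFmpegPipe(processId).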
You don't have to use the VP9 codec; there are other codecs that support an alpha channel (but not many).
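For example, ProRes 4444 also carries an alpha channel. Swapping the output half of the startFFmpeg arguments for something like the following (a sketch I have not run in this exact setup) writes a .mov instead of a .webm:
// Output side of the FFmpeg args, for ProRes 4444 with alpha
'-c:v',
'prores_ks',
'-profile:v',
'4444',
'-pix_fmt',
'yuva444p10le',
'out.mov',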
We're hiring at Gling
I'm always looking for talented developers to join our small but excellent team. If you want to join us and build a product that users love, with a focus on video editing on the web, feel free to send me an email at yonatan@gling.ai. We are fully remote and hire globally! Visit our website: https://gling.ai