
Erik Hoffman for Eyevinn Video Dev-Team Blog


Contextual metadata just in time

One of the biggest hurdles when building enhanced experiences on top of video streaming has historically been delivering the required data at the right time for it to show up. You either rely on the end user's clock being set correctly, or you try to deliver something from your server without knowing how far behind the live edge the user is due to buffering and other circumstances.


Take the example of showing the ongoing program on a linear channel.
Historically, you would have some kind of API delivering the metadata on the side, probably in an EPG format such as XMLTV. The client consuming both the video and the EPG then has to keep the two in sync, matching the user's current position against the start and end times of the programs in the EPG.
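
To make the sidecar approach concrete, here is a minimal sketch of the matching logic each client would have to carry. The program shape (`title`, `start` and `end` as ISO 8601 strings), the sample entries, the latency value and the function name `findCurrentProgram` are all assumptions for illustration, not part of any specific EPG API.

```javascript
// Hypothetical EPG entries; a real feed (for example XMLTV) would need parsing first.
const epg = [
  { title: "Morning News", start: "2021-03-02T10:00:00Z", end: "2021-03-02T11:00:00Z" },
  { title: "Lorem Ipsum Dolor Site Amet", start: "2021-03-02T11:00:00Z", end: "2021-03-02T12:00:00Z" },
];

// Returns the program whose [start, end) interval contains the given time
// (milliseconds since the epoch), or null when nothing matches.
function findCurrentProgram(programs, positionMs) {
  for (const program of programs) {
    const startMs = Date.parse(program.start);
    const endMs = Date.parse(program.end);
    if (positionMs >= startMs && positionMs < endMs) return program;
  }
  return null;
}

// The client has to estimate the playback position itself, here from the
// device clock minus an assumed distance to the live edge.
const assumedLatencyMs = 30000; // a guess; the real value varies with buffering
const positionMs = Date.now() - assumedLatencyMs;
console.log(findCurrentProgram(epg, positionMs));
```

Both inputs to the lookup, the device clock and the latency estimate, are outside the client's control, which is where the mismatches come from.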

The Problem

This creates the need to repeat the same implementation in all your clients. Every end user's device clock has to be correct for the match against the EPG to work, and timestamps have to be handled correctly whether they are in UTC or have a time zone applied. You probably don't want to check against the EPG data at every tick either, so you end up either checking every X seconds, which causes a mismatch for some time during the stream, or scheduling a new check for when the ongoing program ends. That is yet another piece of logic to implement accurately on all clients, and yet another function that depends on the end user's device clock being correct.

The Solution

An alternative to this "sidecar" approach is to deliver your metadata inside your manifests as timed metadata, in context with your content. This is possible with both MPEG-DASH and HLS. For this example, we will go through HLS.
In HLS, you deliver the metadata in your manifest through the EXT-X-DATERANGE tag, to which you then apply a list of keys and their values.
For the problem described earlier, an example would be to deliver your program's title together with its start and end times.

#EXT-X-DATERANGE:TITLE="Lorem Ipsum Dolor Site Amet",START-DATE="2021-03-02T11:00:00Z",END-DATE="2021-03-02T12:00:00Z",DURATION=3600
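
For reference, the HLS specification (RFC 8216) requires `ID` and `START-DATE` on every `EXT-X-DATERANGE` tag and reserves the `X-` prefix for client-defined attributes, so a stricter variant of the tag above would carry the title as `X-TITLE`. The sketch below shows one way a packager could format such a tag. The function name `formatDateRange`, the input field names and the `program-1` identifier are assumptions for illustration.

```javascript
// Formats a program as an EXT-X-DATERANGE tag.
// `id`, `title`, `start` and `end` are hypothetical field names.
function formatDateRange({ id, title, start, end }) {
  const startDate = new Date(start);
  const endDate = new Date(end);
  const durationSeconds = (endDate.getTime() - startDate.getTime()) / 1000;
  // Quoted strings in a playlist cannot contain double quotes or line breaks,
  // so this sketch simply drops them.
  const quote = (value) => `"${String(value).replace(/["\r\n]/g, "")}"`;
  const attributes = [
    `ID=${quote(id)}`,
    `START-DATE=${quote(startDate.toISOString())}`,
    `END-DATE=${quote(endDate.toISOString())}`,
    `DURATION=${durationSeconds}`,
    `X-TITLE=${quote(title)}`,
  ];
  return `#EXT-X-DATERANGE:${attributes.join(",")}`;
}

console.log(
  formatDateRange({
    id: "program-1",
    title: "Lorem Ipsum Dolor Site Amet",
    start: "2021-03-02T11:00:00Z",
    end: "2021-03-02T12:00:00Z",
  })
);
```

With a tag formatted this way, clients would see the key `X-TITLE` rather than `TITLE`.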

Playing this stream in Safari's native HLS player, you continuously get all this metadata applied as a metadata track on the video element, nicely split up into keys and values that are simple to read and act on just in time.

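videoElement.textTracks.addEventListener("addtrack", (evt) => {
  if (evt.track.kind === "metadata") {
    // "hidden" keeps the track active without rendering anything
    evt.track.mode = "hidden";
    evt.track.addEventListener("cuechange", (cueEvt) => {
      // assumed: the elided expression read the track's active cues
      const cues = cueEvt.target.activeCues;
      for (let i = 0; i < cues.length; i++) {
        // skip cues without a value instead of aborting the loop
        if (!cues[i] || !cues[i].value) continue;
        const cueObject = cues[i].value;
        // act on your data
        // assumed: the attribute value is exposed as `data`
        console.log(`${cueObject.key}: ${cueObject.data}`);
      }
    });
  }
});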

which would print

TITLE: Lorem Ipsum Dolor Site Amet
START-DATE: 2021-03-02T11:00:00Z
END-DATE: 2021-03-02T12:00:00Z

Simple and nice to act further on.
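
Since each cue carries a single key and value, it can be convenient to gather them into one object before handing them to the rest of the application. This is a minimal sketch: `collectProgram` is a hypothetical helper, and the `{ key, data }` entry shape is an assumption based on the values printed above.

```javascript
// Gathers key/value entries into a single plain object.
function collectProgram(entries) {
  const program = {};
  for (const entry of entries) {
    // Skip malformed entries rather than failing on them.
    if (!entry || typeof entry.key !== "string") continue;
    program[entry.key] = entry.data;
  }
  return program;
}

// Example input, shaped like the values printed above.
const program = collectProgram([
  { key: "TITLE", data: "Lorem Ipsum Dolor Site Amet" },
  { key: "START-DATE", data: "2021-03-02T11:00:00Z" },
  { key: "END-DATE", data: "2021-03-02T12:00:00Z" },
]);
console.log(program.TITLE);
```

In a player, the entries would come from the active cues of the metadata track instead of a hard-coded array.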

For browsers that don't support HLS natively, we turn to the common MSE-based player hls.js, which also exposes this metadata through a fairly simple and accessible event, though the data is not as structured.

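hls.on(Hls.Events.FRAG_CHANGED, (evt, data) => {
  const tags = data.frag.tagList;
  tags.forEach((tag) => {
    if (
      Array.isArray(tag) &&
      tag.length > 1 &&
      tag[0] === "EXT-X-DATERANGE"
    ) {
      // tag[1] will include our entire metadata string, titles and values all together
      // note: this naive split breaks if a quoted value contains a comma
      const pairs = tag[1].split(",");
      for (let i = 0; i < pairs.length; i++) {
        // split on the first "=" only, then trim and drop surrounding quotes
        const separator = pairs[i].indexOf("=");
        if (separator === -1) continue;
        const key = pairs[i].slice(0, separator).trim();
        const value = pairs[i].slice(separator + 1).trim().replace(/^"|"$/g, "");
        // act on your data
        console.log(`${key}: ${value}`);
      }
    }
  });
});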

which ends up with the same output as the earlier example. As shown, you get fairly obvious key-value pairs to act upon, and from there you might dispatch events for the rest of your application, whether to show the metadata in the player skin or to drive some logic.
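
Splitting on commas works for the example above but breaks as soon as a quoted value contains a comma, such as a title like "News, Weather and Sports". A slightly more defensive approach is to match each `KEY=value` pair with a regular expression. This is a sketch under the assumption that the string follows the HLS attribute-list syntax; `parseAttributeList` is a hypothetical helper and not part of hls.js.

```javascript
// Parses an HLS attribute list such as
//   TITLE="News, Weather and Sports",DURATION=3600
// into a plain object. Quoted values keep their commas and lose their quotes.
function parseAttributeList(input) {
  const attributes = {};
  // A key, "=", then either a quoted string or a run of characters up to the next comma.
  const pattern = /\s*([A-Za-z0-9-]+)\s*=\s*("[^"]*"|[^,]*)/g;
  let match;
  while ((match = pattern.exec(input)) !== null) {
    const key = match[1];
    const rawValue = match[2].trim();
    const isQuoted =
      rawValue.length >= 2 && rawValue.startsWith('"') && rawValue.endsWith('"');
    attributes[key] = isQuoted ? rawValue.slice(1, -1) : rawValue;
  }
  return attributes;
}

const parsed = parseAttributeList(
  'TITLE="News, Weather and Sports",START-DATE="2021-03-02T11:00:00Z",DURATION=3600'
);
console.log(parsed.TITLE);
```

All values come back as strings, so numeric attributes such as `DURATION` still need converting before use.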


As this data appears in the stream just in time, in the context of the content it relates to, you can act at the correct time on all your clients without relying on the device clock and without implementing any date and timestamp handling to request the correct data at the correct moment. You only ever get the data you need, with no need to search through an EPG or any other list for the correct object.

Inserting metadata is supported on all the major platforms. If you are instead building your own virtual channel, which you might do through our open source channel engine library, we support adding metadata in the vodtolive library through the method addMetadata.

If you need assistance with developing and implementing this, our team of video developers is happy to help. If you have any questions or comments, just drop a line in the comments section of this post.
