
Rafael Razeira


How to consume a Kinesis Video Stream with Python and get metadata from Kinesis

Amazon Kinesis Video Streams provides simple APIs to play back, capture, process, and store media streams. Most posts, questions, and examples in Python show how to consume the stream itself from Kinesis Video Streams, which is not a big deal. However, depending on your project, you may need more information for each consumed frame, such as the producer or server timestamp of when the frame was created. Getting this information with Python is not as simple as it should be.
In this post we will show a solution for how to get this kind of information in Python.

Who am I?
Hello there, my name is Rafael M. Razeira. I'm currently a Computer Vision Software Engineer at Nuveo, and in the day-to-day of the company this need to get extra information from Kinesis arose. At first we thought, "Oh! This is kind of easy, just like consuming the stream; some big library like OpenCV will support this." Such an illusion. After a week, and after gathering a group of people at the company, we came up with this solution.

Other language solutions
According to the Amazon docs, other languages have easy access to this information: in JavaScript, hls.js provides an easy and intuitive way to get it, and in Java, Amazon provides a full library to parse and consume this metadata. In Python, however, none of the "big players" like OpenCV or other libraries support parsing this kind of metadata from HLS or DASH streaming.
Python solution using a FIFO file and boto3 GET_MEDIA
To consume the stream and get the metadata, we need to fetch the raw stream data from AWS and parse it ourselves. The way we found to do this is using GET_MEDIA from boto3; the data is sent with the following structure:
Structure of the data sent from Kinesis

Every GET_MEDIA request returns a Chunk. This Chunk has metadata containing the information we want and a Fragment, and this Fragment has N frames, depending on the source configuration. If you want to dig deeper into this structure, this article explains it didactically.
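As a reference, a minimal sketch of requesting the raw stream with GET_MEDIA is shown below. The stream name and region are placeholders for your own setup:

```python
import boto3

STREAM_NAME = "my-video-stream"  # placeholder stream name
REGION = "us-east-1"             # placeholder region

# GET_MEDIA must be called against the stream's dedicated data endpoint.
kvs = boto3.client("kinesisvideo", region_name=REGION)
endpoint = kvs.get_data_endpoint(
    StreamName=STREAM_NAME, APIName="GET_MEDIA"
)["DataEndpoint"]

media = boto3.client(
    "kinesis-video-media", endpoint_url=endpoint, region_name=REGION
)
response = media.get_media(
    StreamName=STREAM_NAME,
    StartSelector={"StartSelectorType": "NOW"},
)
# response["Payload"] is a streaming body that yields the MKV chunks.
```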
The idea of the solution is fairly simple, but to achieve good performance we need to use Python multi-threading, which makes the solution not so beginner-friendly. The pipeline is as follows:
Read the chunks from Kinesis and write them to an OS FIFO file
Read the binary data from the OS FIFO file and extract the video frames using OpenCV
Synchronize the frame output based on processing demand
Use the frames however you want

All in all, there will be 3 threads: one to read the chunks and write them to the FIFO file, another to read the FIFO file and extract the frames, and a last one to synchronize the timestamps taken from the chunks with all the frames returned.

Read the chunks from Kinesis and write them to an OS FIFO file
The function below handles requesting the chunks and writing them to the FIFO file. If you have seen other examples using GET_MEDIA from boto3, the big difference here is that each part of the chunk is written to the FIFO file and checked for the metadata we want (the Producer Timestamp); if it is there, it is put in a queue to be consumed later.

Note: the buffer_size was picked arbitrarily.
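A rough sketch of this chunk-reader thread could look like the following. The FIFO path, buffer size, and the byte-scanning approach for the AWS_KINESISVIDEO_PRODUCER_TIMESTAMP tag are assumptions for illustration, not the exact code of this solution:

```python
import os
import re
import queue

FIFO_PATH = "/tmp/kvs_fifo"   # assumption: any writable path works
BUFFER_SIZE = 1024 * 8        # picked arbitrarily, as noted above

# The producer timestamp travels as an MKV tag inside each chunk.
# Scanning the raw bytes for the tag name is a simplification; a real
# EBML/MKV parser would be more robust (the tag can also be split
# across two reads).
TAG_NAME = b"AWS_KINESISVIDEO_PRODUCER_TIMESTAMP"

def write_chunks_to_fifo(payload, timestamp_queue: queue.Queue):
    """Read the GET_MEDIA payload and write it to the FIFO,
    pushing every producer timestamp found into a queue."""
    if not os.path.exists(FIFO_PATH):
        os.mkfifo(FIFO_PATH)
    # Opening the FIFO for writing blocks until a reader opens it
    # (the OpenCV thread from the next section).
    with open(FIFO_PATH, "wb") as fifo:
        while True:
            data = payload.read(BUFFER_SIZE)
            if not data:
                break
            idx = data.find(TAG_NAME)
            if idx != -1:
                # The tag value is an epoch timestamp such as
                # "1612345678.123" a few bytes after the tag name.
                match = re.search(rb"(\d{10}\.\d+)", data[idx:idx + 80])
                if match:
                    timestamp_queue.put(float(match.group(1)))
            fifo.write(data)
```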

Read the binary data from the OS FIFO file and extract the video frames using OpenCV
Reading the FIFO file and extracting all the frames from it is quite simple: we just use cv2.VideoCapture to deal with it. The only troublesome part is that we need accurate FPS handling to consume the Kinesis stream. This is because if the consumer can't keep up with processing the frames, the video gets a delay that only increases with time, and if the consumer processes the video too fast, the frames pass by too quickly, leading to a bad experience.
So the most important function in this solution is _estimate_fps, which was by far the most troublesome one to come up with. Note the line setting self.fps: the equation was derived purely from empirical tests.
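A simplified sketch of this reader is shown below. The FPS estimate here is just an exponential moving average over the observed inter-frame interval, a stand-in for the empirically tuned equation mentioned above:

```python
import time
import cv2

class FifoFrameReader:
    """Read frames from the OS FIFO with OpenCV and keep a rough
    FPS estimate so the consumer can pace itself."""

    def __init__(self, fifo_path="/tmp/kvs_fifo"):
        # OpenCV's FFmpeg backend can read the MKV stream straight
        # from the FIFO path.
        self.capture = cv2.VideoCapture(fifo_path)
        self.fps = 25.0          # initial guess, refined while reading
        self._last_time = None

    def _estimate_fps(self):
        # Placeholder: exponential moving average of the observed
        # inter-frame interval, not the original empirical equation.
        now = time.time()
        if self._last_time is not None:
            interval = max(now - self._last_time, 1e-6)
            self.fps = 0.9 * self.fps + 0.1 * (1.0 / interval)
        self._last_time = now

    def frames(self):
        """Yield frames as they arrive from the FIFO."""
        while True:
            ok, frame = self.capture.read()
            if not ok:
                break
            self._estimate_fps()
            yield frame
```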

Synchronizes frames output based on processing demand
Finally, the last thing to do is synchronize the frames from the FIFO file source with the timestamps taken from the metadata. This part is not as difficult as the previous one, but it is not quite simple either. Here we apply the same logic of discarding frames to hit the desired FPS, but this FPS must be as close as possible to what your machine can handle; if it is not, the two problems above come up.
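A hedged sketch of this synchronization loop might look like this; the frame-dropping bookkeeping and the handle_frame callback are assumptions for illustration:

```python
import queue
import time

def synchronize(frame_reader, timestamp_queue, handle_frame):
    """Pair frames with producer timestamps and drop frames when the
    consumer falls behind, so the delay does not keep growing."""
    current_ts = None
    behind = 0.0                         # accumulated lateness, seconds
    for frame in frame_reader.frames():
        period = 1.0 / max(frame_reader.fps, 1.0)
        if behind >= period:
            behind -= period             # lagging: drop this frame
            continue
        try:
            while True:                  # keep only the newest timestamp
                current_ts = timestamp_queue.get_nowait()
        except queue.Empty:
            pass
        start = time.time()
        handle_frame(frame, current_ts)  # e.g. cv2.imshow, inference, ...
        elapsed = time.time() - start
        if elapsed > period:
            behind += elapsed - period   # fell behind on this frame
        else:
            time.sleep(period - elapsed) # don't race ahead of real time
```

Wiring the three pieces together is then a matter of starting write_chunks_to_fifo in a thread (passing response["Payload"] and a queue.Queue()), creating a FifoFrameReader, and running synchronize in the main thread.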

And that is it. It is not a "wow, such an incredible thing!" solution, but this problem was not as simple to handle in Python as it should be, so I hope this post helped you in your work, and see you next time! :D

Acknowledgements
As I said before, this solution is an aggregation of knowledge from more than one person. To be honest, the ones who collaborated the most were Diego Gomes and Guilherme Esgario.

If you liked the post and are interested in working with us, we are hiring at the moment!!! :D
