GSoC 2021 Work Product Submission, Xiph.Org Foundation

Student: Aleksandr Gushchin
GitHub Handle: @aleksandrgushchin
Project: Improve fast scene detection modes proposal
Mentor: Luca Barbato
Organisation: Xiph.Org Foundation

Goals

This summer, I contributed to the Xiph.Org Foundation. The main aim of this project was to improve the scene change detection algorithm. This algorithm determines where to split video sequences for optimal encoding efficiency. The currently implemented fast scene detection method is not optimal and sometimes gives false results, which is also detrimental to per-scene visual metric quality targeting.
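To make the idea concrete, here is a minimal sketch of threshold-based scene change detection: a dissimilarity score is computed between consecutive frames, and a cut is reported when the score exceeds a threshold. The function names, the mean-absolute-difference score, and the threshold value are assumptions chosen for illustration; this is not the code used in the project.

```python
def frame_difference(a, b):
    """Mean absolute difference between two equally sized grayscale frames.

    Frames are lists of rows, each row a list of pixel intensities (0-255).
    """
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def detect_scene_cuts(frames, threshold=30.0):
    """Return indices of frames that start a new scene.

    `threshold` is a hypothetical fixed value, not a tuned one.
    """
    cuts = []
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts


# Two tiny 2x2 "frames": a dark scene followed by a bright one.
dark = [[10, 12], [11, 9]]
bright = [[200, 210], [205, 198]]
frames = [dark, dark, bright, bright]
print(detect_scene_cuts(frames))  # [2]
```

A false result in this setting is either a cut reported inside a single scene (for example during fast motion) or a real cut that stays below the threshold.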

Change Log

A dataset has been created to test the algorithm
Metric value peaks have been made more distinctive, so the algorithm detects them more accurately
Thresholds have been adjusted for both versions of the algorithm
An adaptive threshold has been implemented for the slow version
A new, more accurate version of the algorithm has been implemented
Downsampling has been added for this new version
Detailed descriptions of all three versions have been added
A CLI option for the scene detection speed mode has been added
Unit tests have been updated for the new version
av-scenechange has been updated to match the new version of rav1e
- A CLI option for the scene detection speed mode has been added
- A CLI option for the file to write results to has been added
- Speed measurement has been added
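
The adaptive threshold mentioned in the change log can be illustrated with a small sketch: instead of comparing each score to a fixed constant, the score is compared to a multiple of the average of recent scores. Everything here (`detect_cuts_adaptive`, `window`, `scale`, `floor`) is hypothetical and only shows the general technique, not the implementation that was merged.

```python
from collections import deque


def detect_cuts_adaptive(scores, window=5, scale=3.0, floor=10.0):
    """Flag index i as a cut when scores[i] clearly exceeds recent scores.

    scores[i] is a dissimilarity score for frame i against the previous frame.
    The threshold is max(floor, scale * mean of the last `window` non-cut scores).
    All parameter values are illustrative assumptions.
    """
    recent = deque(maxlen=window)
    cuts = []
    for i, score in enumerate(scores):
        if recent:
            threshold = max(floor, scale * sum(recent) / len(recent))
            if score > threshold:
                cuts.append(i)
                continue  # keep the spike out of the running history
        recent.append(score)
    return cuts


calm = [2, 3, 2, 40, 3, 2]             # static scene with one cut
busy = [30, 35, 32, 40, 33, 150, 31]   # high-motion scene with one cut
print(detect_cuts_adaptive(calm))  # [3]
print(detect_cuts_adaptive(busy))  # [5]
```

With a fixed threshold of 10, almost every frame of the high-motion sequence would be reported as a cut; scaling the threshold to recent activity avoids that.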

Brief summary of the new versions of the algorithm

Version	F score on BBC Planet Earth	F score on open source videos
New fast version	0.7441	0.6652
Old fast version	0.6543	0.5951
New medium version	0.7802	0.7032
New slow version	0.9217	0.7504
Old slow version	0.7024	0.5628
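
For readers unfamiliar with the metric, the sketch below shows one common way to compute an F score for scene cut detection, assuming it is the F1 score (the harmonic mean of precision and recall) over detected versus annotated cut positions. The matching rule and the `tolerance` parameter are assumptions; the evaluation used for the table above may differ.

```python
def f_score(detected, ground_truth, tolerance=0):
    """F1 score for detected cut frame indices against annotated ones.

    A detection counts as a true positive if it lies within `tolerance`
    frames of an annotated cut that has not already been matched.
    """
    truth = set(ground_truth)
    matched = set()
    true_positives = 0
    for d in detected:
        candidates = [t for t in truth - matched if abs(t - d) <= tolerance]
        if candidates:
            matched.add(min(candidates, key=lambda t: abs(t - d)))
            true_positives += 1
    false_positives = len(detected) - true_positives
    false_negatives = len(truth) - true_positives
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# 2 correct detections, 2 false alarms, 1 missed cut:
# precision = 2/4, recall = 2/3, F1 = 4/7
print(round(f_score([10, 50, 90, 120], [10, 50, 100]), 4))  # 0.5714
```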

Development process

To test the algorithm fairly, I needed a large, representative dataset. I found the BBC Planet Earth dataset, but I still needed sequences with higher resolutions and different themes (all of the BBC videos were documentaries at 388x280 resolution). I downloaded and manually annotated 20 videos from Vimeo. A fuller description of the final dataset can be found here.
After collecting the data, I calculated the results of the current solution. They can be found here.
Detailed analysis of the current algorithm. I made charts and visualizations for different algorithm options and thresholds. They can be found here and here. From these I drew several conclusions on how to improve the current solution.
Improving the current solution by adjusting thresholds and updating metric values. A detailed description is here and here. I made a pull request with these changes.
New metrics development. I experimented with motion vectors and color histograms as the basis for a new dissimilarity metric. For the histogram-based metric I also experimented with distance functions. I tried to implement edge change ratio but abandoned it because it turned out to be too slow. I focused on the histogram-based metric since it was the most accurate. I experimented with a block-based approach, with combining it with previous versions of the algorithm, and with shifting by motion vectors. Results can be found here and here.
After the third version was ready, I added it to the repo, provided a CLI option for users to choose the version manually, and updated the unit tests.
A detailed description of the final result, along with unsuccessful ideas and possible improvements, can be read here.
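
As an illustration of the histogram-based dissimilarity metric and the distance functions mentioned in the metric experiments above, here is a minimal sketch. It uses a single grayscale channel rather than color, and the bin count and the two distance functions (L1 and chi-square) are assumptions picked as common choices; the report does not state which ones were actually tried or kept.

```python
def histogram(frame, bins=16, max_value=255):
    """Normalized intensity histogram of a grayscale frame (list of rows)."""
    counts = [0] * bins
    n = 0
    for row in frame:
        for p in row:
            counts[min(p * bins // (max_value + 1), bins - 1)] += 1
            n += 1
    return [c / n for c in counts]


def l1_distance(h1, h2):
    """Sum of absolute bin differences; ranges from 0 to 2."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def chi_square_distance(h1, h2):
    """Chi-square distance; empty bin pairs are skipped to avoid 0/0."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


dark = [[10, 12], [11, 9]]
bright = [[200, 210], [205, 198]]
moved = [[12, 10], [9, 11]]  # same pixels as `dark`, rearranged

print(l1_distance(histogram(dark), histogram(bright)))  # large: different scenes
print(l1_distance(histogram(dark), histogram(moved)))   # 0.0: same content, moved
```

Because a histogram ignores pixel positions, motion within a scene changes it very little, which is one reason histogram metrics produce fewer false cuts on high-motion footage. A block-based variant would compute one histogram per image block to recover some spatial information.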

Code

Pull requests:

#2765
#162

Blog posts

All posts can be found here

Acknowledgement

I'd like to thank my mentor Luca Barbato for always monitoring my progress, responding immediately, and guiding me whenever I needed help, and the whole Xiph team!