This week I finished the analysis of the new metric based on color histograms. It can be read here.
I experimented with combining this new metric with the current one. Here are the F scores on the BBC Planet Earth dataset for different versions:
- Current rav1e version:
- Improved threshold (version from my latest PR):
- Version with color histogram based metric:
- Union of the latter two versions (a frame is considered a scene change if either of the above-mentioned metrics flags it):
- Version with the color histogram metric using a block-based approach, where each frame is divided into blocks (more details at the bottom):
When I say the intersection of algorithms, I mean an algorithm that marks the frame as scene change only if both algorithms have marked it.
Here is the picture that explains why I chose the union rather than the intersection. It should also be taken into account that the recall of the algorithms is higher than their accuracy:
The numbers represent the number of frames considered scene changes by the two versions of the algorithm and by the ground truth.
Each number corresponds to one colored area.
It can be seen that the ground truth contains around 90% of the intersection of these versions.
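To make the union/intersection comparison concrete, here is a minimal sketch of how two detectors could be combined and scored. It assumes that each detector outputs a set of frame indices, that a detection counts only if it matches a ground-truth frame exactly, and that "F score" means F1. The function names are hypothetical and are not taken from rav1e.

```rust
use std::collections::BTreeSet;

/// Frames flagged by at least one detector (union).
pub fn union(a: &BTreeSet<u64>, b: &BTreeSet<u64>) -> BTreeSet<u64> {
    a.union(b).copied().collect()
}

/// Frames flagged by both detectors (intersection).
pub fn intersection(a: &BTreeSet<u64>, b: &BTreeSet<u64>) -> BTreeSet<u64> {
    a.intersection(b).copied().collect()
}

/// Returns (precision, recall, F1) of the detected scene changes
/// against the ground truth, using exact frame matches (an assumption).
pub fn f_score(detected: &BTreeSet<u64>, truth: &BTreeSet<u64>) -> (f64, f64, f64) {
    let true_positives = detected.intersection(truth).count() as f64;
    let precision = if detected.is_empty() {
        0.0
    } else {
        true_positives / detected.len() as f64
    };
    let recall = if truth.is_empty() {
        0.0
    } else {
        true_positives / truth.len() as f64
    };
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    (precision, recall, f1)
}
```

With this setup, the union can only keep or raise recall compared to either detector alone, while the intersection can only keep or lower it. Which of the two gives the better F score depends on how the detections overlap with the ground truth.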
I improved the histogram-based metric by dividing frames into blocks. The results for it can be seen below along with the regular histogram-based approach. The improved version is marked in the legend as "with blocks".
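The idea of a block-based histogram comparison can be sketched as follows. This is only an illustration under several assumptions, not the code used in the experiments: it works on a single 8-bit plane instead of full color, uses 32 bins, measures half the L1 distance between normalized histograms, and averages that distance over all blocks. The names `BINS`, `block_histogram` and `block_histogram_distance` are hypothetical.

```rust
/// Number of histogram bins (assumed); 8-bit samples are grouped 8 values per bin.
const BINS: usize = 32;

/// Normalized histogram of one block of an 8-bit plane stored row-major.
fn block_histogram(
    plane: &[u8], stride: usize, x0: usize, y0: usize, w: usize, h: usize,
) -> [f64; BINS] {
    let mut hist = [0.0f64; BINS];
    for y in y0..y0 + h {
        for x in x0..x0 + w {
            let bin = plane[y * stride + x] as usize * BINS / 256;
            hist[bin] += 1.0;
        }
    }
    let total = (w * h) as f64;
    for v in hist.iter_mut() {
        *v /= total;
    }
    hist
}

/// Mean per-block histogram distance between two frames, in the range [0, 1].
/// Blocks on the right and bottom edges are clipped to the frame size.
pub fn block_histogram_distance(
    prev: &[u8], cur: &[u8], width: usize, height: usize, block: usize,
) -> f64 {
    assert!(block > 0);
    assert_eq!(prev.len(), width * height);
    assert_eq!(cur.len(), width * height);
    let mut sum = 0.0;
    let mut blocks = 0usize;
    for y0 in (0..height).step_by(block) {
        for x0 in (0..width).step_by(block) {
            let w = block.min(width - x0);
            let h = block.min(height - y0);
            let hp = block_histogram(prev, width, x0, y0, w, h);
            let hc = block_histogram(cur, width, x0, y0, w, h);
            // Half the L1 distance between two normalized histograms lies in [0, 1].
            let l1: f64 = hp.iter().zip(hc.iter()).map(|(a, b)| (a - b).abs()).sum();
            sum += l1 / 2.0;
            blocks += 1;
        }
    }
    if blocks == 0 { 0.0 } else { sum / blocks as f64 }
}
```

The benefit of splitting into blocks shows up when content moves within the frame: a whole-frame histogram stays the same if a dark half and a bright half swap places, while per-block histograms change. In this sketch, passing a `block` at least as large as the frame reduces it to the regular whole-frame comparison.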
Results on BBC dataset:
Results on open-source videos:
However, the calculation speed dropped to 0.75x that of the current version on the BBC dataset (resolution 360x288) and to 0.56x on the open-source videos (resolution 1280x720).
I checked whether it is worth combining this metric with the current one. The average increase in the F score is about 0.01-0.02, which does not justify the considerably larger decrease in speed.
Also, I implemented a block-based histogram approach that takes motion vectors into account. The results will be published here soon.