We already discovered that ffmpeg uses the Neon architecture extension in its SIMD code. Continuing from the previous blog post, this post gives recommendations on how ffmpeg could be extended to support SVE2 (Scalable Vector Extension version 2).
SVE2 is a superset of SVE and Neon. In terms of data-level parallelism, SVE2 allows for more function domains. SVE2 inherits SVE's concepts, vector registers, and operation principles, and hardware implementers can choose a vector length of up to 2048 bits.
Compared with traditional fixed-width SIMD, SVE2 has several advantages:
- An SVE2 program can be built once and run on a variety of hardware with different vector lengths
- SVE2 gives the compiler and the programmer more vectorization options
- SVE2 expands the instruction set, so more kinds of applications can benefit from it
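To make the "build once, run on any vector length" point concrete, here is a minimal sketch of a vector-length-agnostic loop written with the Arm C Language Extensions for SVE. The function name `add_sat_u8` is hypothetical and is not taken from ffmpeg; it only stands in for the kind of per-pixel work a media library does. The scalar branch is an assumption added so that the same file also builds on machines without SVE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h> /* SVE intrinsics; SVE2 targets also define this macro */
#endif

/* Hypothetical helper (not part of ffmpeg): dst[i] = min(a[i] + b[i], 255). */
void add_sat_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n == 0 || (dst && a && b));
#if defined(__ARM_FEATURE_SVE)
    /* svcntb() returns the number of bytes in one vector on this CPU,
     * so the same binary steps by 16 bytes on 128-bit hardware and by
     * 64 bytes on 512-bit hardware. */
    for (uint64_t i = 0; i < n; i += svcntb()) {
        /* Predicate with one active lane for every index below n;
         * this handles the tail without a separate scalar loop. */
        svbool_t pg = svwhilelt_b8_u64(i, (uint64_t)n);
        svuint8_t va = svld1_u8(pg, a + i);
        svuint8_t vb = svld1_u8(pg, b + i);
        /* Saturating add, then store only the active lanes. */
        svst1_u8(pg, dst + i, svqadd_u8(va, vb));
    }
#else
    /* Scalar fallback for builds without SVE. */
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
#endif
}
```

Nothing in the SVE branch hard-codes a vector width, which is the main difference from a Neon version, where the 128-bit width and a separate tail loop are baked into the source.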
ffmpeg mainly handles heavy workloads such as decoding, encoding, and transcoding, so SVE2 would give it more vectorization options and more flexibility across hardware.
I found that the libswscale library, which implements color conversion and scaling routines, could be improved by using SVE2 instead of traditional SIMD instructions. Scaling takes a considerable amount of time, and SVE2 could speed this process up.
The Neon code in ffmpeg could be changed to take advantage of SVE2, because SVE2 lets vector code adapt to different vector lengths at execution time. SVE2 has 32 scalable vector registers, Z0-Z31, and each implementation chooses a register width from 128 bits up to 2048 bits in 128-bit increments.
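The register-width rule is simple arithmetic, so a small sketch may help. The helper names below are hypothetical (they are not part of ffmpeg or of any Arm API); they only encode the "128 to 2048 bits, in steps of 128" rule and show how many elements fit in one Z register at a given width.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants for illustration only. */
enum { SVE_VL_MIN_BITS = 128, SVE_VL_MAX_BITS = 2048, SVE_VL_STEP_BITS = 128 };

/* True if `bits` is a vector length the rule above permits. */
bool sve_vl_is_valid(unsigned bits)
{
    return bits >= SVE_VL_MIN_BITS && bits <= SVE_VL_MAX_BITS &&
           bits % SVE_VL_STEP_BITS == 0;
}

/* Number of elements of width `elem_bits` that fit in one vector
 * of `vl_bits` bits. */
unsigned sve_lanes(unsigned vl_bits, unsigned elem_bits)
{
    assert(elem_bits != 0);
    return vl_bits / elem_bits;
}
```

For example, a 512-bit implementation holds 64 8-bit pixels per register, four times as many as a 128-bit Neon register.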
Because a single SVE2 build runs on hardware with different vector lengths, ffmpeg would not need a separately built variant for each vector width, and users could get a better experience on hardware with wider vectors.
In conclusion, ffmpeg could take advantage of SVE2 to maximize its flexibility and to use vector lengths of up to 2048 bits.