DEV Community



Who else can match such speed? A performance leap prompted by a single issue

Recently, I open-sourced a small project about audio feature extraction and analysis. As someone in the AI audio field, I felt that I lacked a deep understanding of audio features during my research, so I created this project as a way to learn and practice.

Although it was only a small project for learning and practice, I was confident in it: most of the core algorithms are implemented in C and wrapped in Python, so I expected it to be faster than libraries written purely in Python. I also ran a simple performance comparison against other related Python libraries, and my library was indeed faster. However, I didn't expect to hit a snag later on!!!

Two weeks ago, I received an issue from a user saying 'Speed is slow, am I missing something?' When I took a closer look, I was shocked to find that my library was the slowest. I quickly ran it on my own computer, and it was even worse than the results given by the user. This was a big blow!!! The relevant issue can be found here:

After careful analysis, I found that the sample size I had used for testing was too small. With large sample sizes the library was slow, mainly because of matrix multiplication. After optimizing that, it was much faster than the other libraries, but there was still a performance gap compared to torchaudio, PyTorch's official audio library.
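To illustrate how a small test sample can hide this kind of slowdown, here is a minimal sketch of a benchmark that times the same workload at several sample sizes. It is not audioflux's code or the benchmark I actually ran: `make_matrix`, `matmul`, `bench`, and `project_frames` are hypothetical names, and the naive pure-Python matrix product only stands in for a filterbank-style projection.

```python
import time


def make_matrix(rows, cols):
    """Build a deterministic rows x cols matrix as a list of lists."""
    return [[(r * cols + c) % 7 * 0.5 for c in range(cols)] for r in range(rows)]


def matmul(a, b):
    """Naive matrix product, standing in for a filterbank projection."""
    b_cols = list(zip(*b))  # transpose once so each column is a tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def bench(fn, sizes, repeats=3):
    """Return {size: best wall-clock seconds} for fn(size) over several runs."""
    results = {}
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(n)
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


def project_frames(n_frames, n_bins=64, n_bands=16):
    """Hypothetical workload: project n_frames spectra onto n_bands filters."""
    filterbank = make_matrix(n_bands, n_bins)
    spectra = make_matrix(n_bins, n_frames)
    return matmul(filterbank, spectra)


if __name__ == "__main__":
    # Time the same workload at small, medium, and large sample sizes.
    for n, seconds in bench(project_frames, [10, 100, 1000]).items():
        print(f"{n:>5} frames: {seconds * 1e3:.2f} ms")
```

Taking the best of several runs reduces noise from other processes. The point of sweeping the sizes is that a test covering only the smallest one says little about how the matrix multiplication behaves on large inputs.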

At first I accepted my fate, since torchaudio is simply superior. Then, over a week of hard work, I tried various optimizations, including OpenBLAS, Eigen, MKL, FFTW, SIMD, and parallel computing. I tested the performance with different sample sizes, CPUs, and system platforms. The results are shown in the following figure:

The graphs show the benchmark results on Linux/AMD and macOS/Intel, respectively.

Here is the detailed benchmark report:


  • On Linux/AMD processors, audioflux is slightly faster than torchaudio, but on Linux/Intel it is slightly slower.
  • On macOS with large sample sizes, audioflux is faster than torchaudio, and Intel is significantly faster than M1; with small sample sizes, torchaudio is faster than audioflux.

After all these arduous optimizations, audioflux is much faster than its previous version and than other related libraries. I have done everything I can to optimize it, but it still cannot consistently beat torchaudio. I hope everyone can give me a thumbs up and follow me, and I look forward to outperforming torchaudio in the future!!!

If you are interested, please give us a star.
Project address:
