DEV Community

van

Mel spectrogram and MFCC

Algorithm flow

  • Read the 220 Hz sample audio
import audioflux as af

audio_path = af.utils.sample_path('220')
audio_arr, sr = af.read(audio_path)
  • Extract the mel spectrogram and convert it to dB
low_fre = 0
spec_arr, fre_band_arr = af.mel_spectrogram(audio_arr, samplate=sr, low_fre=low_fre)
spec_dB_arr = af.utils.power_to_db(spec_arr)
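
To make the two calls above less of a black box, here is a minimal NumPy sketch of the ideas behind them: mapping frequencies onto the mel scale and converting power to decibels. It assumes the HTK-style mel formula and a fixed floor of 1e-10; audioflux may use a different mel variant and different clamping. The helper names (`hz_to_mel`, `mel_to_hz`, `mel_band_centers`, `power_to_db_sketch`) are hypothetical and not part of audioflux.

```python
import numpy as np

def hz_to_mel(freq_hz):
    """HTK-style mel formula (an assumption; audioflux may use another variant)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_band_centers(low_hz, high_hz, num_bands):
    """Band centre frequencies, evenly spaced on the mel axis."""
    mel_points = np.linspace(hz_to_mel(low_hz), hz_to_mel(high_hz), num_bands + 2)
    return mel_to_hz(mel_points[1:-1])  # drop the two outer edge points

def power_to_db_sketch(power, floor=1e-10):
    """10 * log10(power), with a floor so that zero power does not give -inf."""
    return 10.0 * np.log10(np.maximum(np.asarray(power, dtype=float), floor))

# Hypothetical settings: 128 bands between 0 Hz and 16 kHz.
centers = mel_band_centers(0.0, 16000.0, 128)
```

The centres are packed closely at low frequencies and spread out at high frequencies, which is why a mel spectrogram gives more resolution to the range where pitch differences are easier to hear.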
  • Plot the mel spectrogram
import matplotlib.pyplot as plt
from audioflux.display import fill_spec
import numpy as np

# calculate x/y-coords
audio_len = audio_arr.shape[0]
x_coords = np.linspace(0, audio_len/sr, spec_arr.shape[1] + 1)
y_coords = np.insert(fre_band_arr, 0, low_fre)

fig, ax = plt.subplots()
img = fill_spec(spec_dB_arr, axes=ax,
                x_coords=x_coords,
                y_coords=y_coords,
                x_axis='time', y_axis='log',
                title='Mel Spectrogram')
fig.colorbar(img, ax=ax, format="%+2.0f dB")
plt.show()  # needed when running as a script; optional in a notebook

mel spectrogram
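
A note on the `+ 1` in the x-coordinates and the `np.insert` for the y-coordinates in the plotting code: they look like they supply cell edges rather than cell centres, so a grid of `rows x cols` cells needs `rows + 1` and `cols + 1` coordinates. That reading assumes `fill_spec` draws cells the way a flat-shaded mesh plot does, which is not confirmed here. A small NumPy sketch with hypothetical helper names and made-up sizes:

```python
import numpy as np

def time_edges(num_samples, sample_rate, num_frames):
    """Edges of num_frames equal-width time cells covering the whole clip."""
    return np.linspace(0.0, num_samples / sample_rate, num_frames + 1)

def freq_edges(band_freqs, low_freq):
    """Prepend the lower bound so num_bands cells get num_bands + 1 edges."""
    return np.insert(np.asarray(band_freqs, dtype=float), 0, low_freq)

# Made-up example: 2 seconds at 32 kHz, 63 frames, 4 frequency bands.
x_edges = time_edges(64000, 32000, 63)
y_edges = freq_edges([100.0, 200.0, 400.0, 800.0], 0.0)
```

Here `x_edges` has 64 entries running from 0 to 2 seconds, and `y_edges` has 5 entries starting at the lower frequency bound.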

  • Extract the MFCC data
cc_arr, _ = af.mfcc(audio_arr, samplate=sr)
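
For intuition about what an MFCC is, the sketch below computes cepstral coefficients from a mel power spectrogram with plain NumPy: take the log of each mel band, then apply a DCT-II across the bands of every frame. This is the textbook recipe, not necessarily what `af.mfcc` does internally; the orthonormal DCT scaling, the 13 coefficients, the 1e-10 floor, and the names `dct2_matrix` and `mfcc_sketch` are all assumptions made for illustration.

```python
import numpy as np

def dct2_matrix(num_coeffs, num_bands):
    """Orthonormal DCT-II basis with shape (num_coeffs, num_bands)."""
    n = np.arange(num_bands)
    k = np.arange(num_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * num_bands))
    basis *= np.sqrt(2.0 / num_bands)
    basis[0] *= np.sqrt(0.5)  # extra scaling for the k = 0 row
    return basis

def mfcc_sketch(mel_power, num_coeffs=13, floor=1e-10):
    """mel_power: (num_bands, num_frames) -> (num_coeffs, num_frames)."""
    log_mel = 10.0 * np.log10(np.maximum(mel_power, floor))
    return dct2_matrix(num_coeffs, mel_power.shape[0]) @ log_mel

# Synthetic input standing in for a real mel spectrogram.
rng = np.random.default_rng(0)
mel_power = rng.random((128, 20)) + 1e-3
coeffs = mfcc_sketch(mel_power)  # shape (13, 20)
```

The first coefficient tracks the overall level of the frame, while the higher ones describe progressively finer ripples in the spectral envelope. A perfectly flat spectrum therefore has all coefficients after the first equal to zero.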
  • Plot the MFCC
# calculate x-coords
audio_len = audio_arr.shape[0]
x_coords = np.linspace(0, audio_len/sr, cc_arr.shape[1] + 1)

fig, ax = plt.subplots()
img = fill_spec(cc_arr, axes=ax,
                x_coords=x_coords, x_axis='time',
                title='MFCC')
fig.colorbar(img, ax=ax)
plt.show()  # needed when running as a script; optional in a notebook

mfcc
