How Shazam Works in a Nutshell

#machinelearning #ai #algorithms #shazam

Time, as it grows old, teaches all things. -Aeschylus

Shazam is an application that identifies music, including the songs used in movies, advertising, and television shows, from a short sample of audio. In this article, I will explain the technology Shazam uses to make its audio discovery software work.

How does Shazam Work?

Shazam identifies songs using two things: an audio/acoustic fingerprint and a spectrogram. Let's explain each of these terms.

What is an audio/acoustic fingerprint?

An audio/acoustic fingerprint is a condensed digital summary generated from an audio signal. An audio signal is a representation of sound, typically using either a changing electrical voltage for analog signals or a series of binary numbers for digital signals.

In the case of Shazam, the audio signals are digital, so each one is a series of binary numbers. The fingerprint generated from those numbers can be used to identify an audio sample or to quickly locate similar items in an audio database (in case you aren't aware, an audio database is a database for audio).
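To make "a series of binary numbers" concrete, here is a minimal sketch of how a sound can be stored digitally. The sample rate, the 16-bit range, and the `sample_tone` helper are assumptions chosen for illustration, not details of Shazam's own pipeline.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this illustration)

def sample_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return a pure tone as a list of signed 16-bit integer samples."""
    count = round(duration_s * sample_rate)
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(count)
    ]

# 10 milliseconds of a 440 Hz tone becomes a plain list of integers.
samples = sample_tone(440, 0.01)
print(len(samples), "samples, first few:", samples[:5])
```

Every later step (spectrogram, peaks, fingerprint) starts from a list of numbers like this one.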

What is a Spectrogram?

A spectrogram is a graphical representation of audio. Each piece of audio is split into segments over time, and from these segments a graph is generated that plots three dimensions of the audio: frequency vs. intensity vs. time.

To search for a sound efficiently, you need to describe it efficiently, and a spectrogram is the way to do this.
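The idea of splitting audio into segments and measuring the intensity of each frequency can be sketched in a few lines. This is a simplified illustration using a naive discrete Fourier transform from the standard library; the `spectrogram` function, the window size, and the hop size are assumptions for the example, and a real system would use a fast FFT library instead.

```python
import cmath
import math

def spectrogram(samples, window_size=64, hop=32):
    """Return one list per time segment holding the intensity of each frequency bin."""
    frames = []
    for start in range(0, len(samples) - window_size + 1, hop):
        window = samples[start:start + window_size]
        frame = []
        for k in range(window_size // 2):  # keep only the non-redundant bins
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / window_size)
                for n, x in enumerate(window)
            )
            frame.append(abs(total))
        frames.append(frame)
    return frames

# A 1000 Hz tone sampled at 8000 Hz; each bin is 8000 / 64 = 125 Hz wide.
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(256)]
spec = spectrogram(tone)
loudest = [frame.index(max(frame)) for frame in spec]
print(len(spec), "time segments, loudest bin in each:", loudest)
```

Each inner list is one time segment, each position in it is a frequency, and each value is an intensity, which gives the three dimensions described above.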

How does all this work in Shazam?

We've explained the technologies used in Shazam as individual concepts. Now let us see how they work together to make up Shazam.

When you ask Shazam for information about a song, such as its name or artist, you give it an audio stream of the song via a microphone or some other audio input device. Shazam represents the audio stream as a spectrogram, and the Shazam algorithm then picks out the peak points in the spectrogram. Peak points are the points of highest intensity, which makes them the points least affected by background noise.
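Peak picking can be sketched as finding the cells of the spectrogram that are louder than all of their neighbours. The `find_peaks` function, the 3x3 neighbourhood, and the `min_intensity` threshold below are assumptions made for illustration, not Shazam's actual parameters.

```python
def find_peaks(spec, min_intensity=1.0):
    """Return (time_index, freq_bin) pairs that are the maximum of their 3x3 neighbourhood."""
    peaks = []
    for t, frame in enumerate(spec):
        for f, value in enumerate(frame):
            if value < min_intensity:
                continue  # too quiet to be a reliable landmark
            neighbours = [
                spec[tt][ff]
                for tt in range(max(0, t - 1), min(len(spec), t + 2))
                for ff in range(max(0, f - 1), min(len(frame), f + 2))
                if (tt, ff) != (t, f)
            ]
            if all(value > n for n in neighbours):
                peaks.append((t, f))
    return peaks

# A toy spectrogram: rows are time segments, columns are frequency bins.
toy_spec = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 5.0, 0.3, 0.1],
    [0.1, 0.3, 0.2, 4.0],
    [0.0, 0.1, 0.1, 0.2],
]
peaks = find_peaks(toy_spec)
print(peaks)
```

Only the two loud cells survive; the low-level values around them, which stand in for background noise, are discarded.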

The Shazam algorithm then creates an audio fingerprint from the peak points and searches the audio database for a song with a matching fingerprint. When it finds a match, it returns the result to the user.
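One way to turn peaks into a searchable fingerprint is to hash pairs of nearby peaks and look those hashes up in an index. The sketch below is an illustration of that idea under stated assumptions: the function names, the `fan_out` value, the toy songs, and the voting scheme are all hypothetical and should not be read as Shazam's actual implementation.

```python
from collections import Counter, defaultdict

def fingerprints(peaks, fan_out=3):
    """Pair each peak with the next few peaks; yield ((f1, f2, time_gap), t1)."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            yield (f1, f2, t2 - t1), t1

def build_index(songs):
    """Map every fingerprint key to the songs and times where it occurs."""
    index = defaultdict(list)
    for name, peaks in songs.items():
        for key, t in fingerprints(peaks):
            index[key].append((name, t))
    return index

def identify(index, sample_peaks):
    """Return the song whose fingerprints line up best with the sample, or None."""
    votes = Counter()
    for key, t_sample in fingerprints(sample_peaks):
        for name, t_song in index.get(key, []):
            # Real matches share one consistent time offset into the song.
            votes[(name, t_song - t_sample)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name

# Toy database: each song is a list of (time_index, freq_bin) peaks.
songs = {
    "song_a": [(0, 10), (1, 20), (2, 15), (3, 30), (4, 25), (5, 12)],
    "song_b": [(0, 11), (1, 22), (2, 33), (3, 44), (4, 5), (5, 16)],
}
index = build_index(songs)

# The sample is song_a from time 2 onwards, as a listener's phone might hear it.
sample = [(0, 15), (1, 30), (2, 25), (3, 12)]
print(identify(index, sample))
```

Because the keys store the gap between peaks rather than absolute times, the sample can start anywhere in the song and still match.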

How Shazam updates its audio database

Going through this article, you might have inferred that a core piece of technology behind the success of Shazam is its extensive audio database. Put simply, without an up-to-date audio database, Shazam can't meet the demands of its users, and this would lead to a loss in revenue. So how does Shazam keep its audio database updated?

They do this through industry partnerships with companies that document music. These companies supply Shazam with music data, and Shazam uses that data to keep its audio database up to date.
