DEV Community

@Jonath-z

Build a Shazam clone with Node.js

I. Introduction

Shazam is the popular music recognition application created by Chris Barton and Philip Inghelbrecht, both students at the University of California. Shazam identifies songs using an audio fingerprint based on a time-frequency graph called a spectrogram. It uses a smartphone or computer's built-in microphone to gather a brief sample of the audio being played. Shazam stores a catalogue of audio fingerprints in a database. The user tags a song for 10 seconds and the application creates an audio fingerprint. Shazam works by analyzing the captured sound and seeking a match based on an acoustic fingerprint in a database of millions of songs. If it finds a match, it sends information such as the artist, song title, and album back to the user. Some implementations of Shazam also include relevant links to services such as iTunes, Apple Music, Spotify, YouTube, or Groove Music.
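To make the lookup idea concrete, here is a toy sketch (not Shazam's real algorithm; real fingerprints are hashes derived from spectrogram peaks, and matching is approximate rather than exact): a catalogue maps fingerprint hashes to song metadata, and the hash of a captured sample is looked up in it.

```javascript
// Toy illustration only: real fingerprints come from spectrogram peaks,
// not simple strings, and real matching tolerates noise.
const catalogue = new Map([
  ['a1b2c3', { title: 'Song A', artist: 'Artist A' }],
  ['d4e5f6', { title: 'Song B', artist: 'Artist B' }]
]);

// Look up a captured sample's fingerprint in the catalogue
function identify(fingerprint) {
  return catalogue.get(fingerprint) || null;
}

console.log(identify('d4e5f6')); // → { title: 'Song B', artist: 'Artist B' }
console.log(identify('zzzzzz')); // → null (no match found)
```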

I.1. Prerequisites

To follow this article, you need to have Node.js installed and to understand the basics: how to set up and configure a simple server, real-time communication with socket.io, HTTP requests, and JavaScript fundamentals. Here we are, so let's start and clone the Shazam app. The API returns all the data we need about the song, which can be used and displayed however we want, but in this article we are only going to fetch that data.

II. Server side

II.1. Setting up our server

Start by creating a new Node.js project. Create a folder with the name of your choice and run the following commands from the terminal.

 npm init -y
 npm install axios ejs express socket.io

npm init -y initializes a new Node.js application, whereas the second command installs express, ejs, axios, and socket.io (http is a built-in Node.js module, so it does not need to be installed). With that done, let’s go ahead and create our server. Create an index.js file and write the following code in it.

const express = require('express');
const path = require('path');
const axios = require('axios');
const app = express();
const server = require('http').createServer(app);
const io = require('socket.io')(server, { cors: { origin: "*" } });

app.use(express.json());
app.use('/static', express.static(path.join(__dirname, './src')));
app.set('view engine', 'ejs');
app.set('socketio', io);
app.set("views", path.join(__dirname, "views"));

// render the recording page (record.ejs in the views folder)
app.get('/', (req, res) => res.render('record'));

function socket() {
  io.on('connection', (socket) => {
    socket.on('song', (blob) => {
      console.log(blob);
      // the blob sent by the client arrives on the server as binary data
      const buffer = Buffer.from(blob);
      const base64 = buffer.toString('base64');

      const options = {
        method: 'POST',
        url: 'https://shazam.p.rapidapi.com/songs/detect',
        headers: {
          'content-type': 'text/plain',
          'x-rapidapi-key': `${process.env.SHAZAM_RAPID_API_KEY}`,
          'x-rapidapi-host': 'shazam.p.rapidapi.com'
        },
        data: `${base64}`
      };

      axios.request(options).then(function (response) {
        console.log(response.data);
        const matches = response.data.matches.length;
        console.log(matches);
      });
    });
  });
}

socket();

const port = process.env.PORT || 7070;
server.listen(port, () => {
  console.log(`server is running on ${port}`);
});

In the code above we initialize our server with express and socket.io; this package allows real-time communication between the server and the client without refreshing the browser.

const app = express();
const server = require('http').createServer(app);
const io = require('socket.io')(server, { cors: { origin: "*" } });

Then we set up the middlewares; you can find the documentation about middleware in the Express.js docs.

app.use(express.json());
app.use('/static', express.static(path.join(__dirname, './src')));
app.set('view engine', 'ejs');
app.set('socketio', io);
app.set("views", path.join(__dirname, "views"));

After that, in the function called socket, we create a connection between the server and the client.

function socket() {
  io.on('connection', (socket) => {
    ...
  });
}

Then we catch the event called “song” sent from the client with the data “blob”. As we said, the Shazam API from RapidAPI accepts only “base64” as the body, which means we convert the blob (received on the server as binary data) on these lines:

const buffer = Buffer.from(blob);
const base64 = buffer.toString('base64');
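As a quick sanity check in plain Node.js (separate from the app), here is the base64 round trip on a few raw bytes; the sample bytes spell "RIFF", the first marker of a WAV header:

```javascript
// Simulate a few raw audio bytes (in the app these come from the recorded WAV blob)
const rawBytes = Buffer.from([0x52, 0x49, 0x46, 0x46]); // the ASCII string "RIFF"

// Encode to base64, the only body format the Shazam endpoint accepts
const base64 = rawBytes.toString('base64');
console.log(base64); // → "UklGRg=="

// Decoding restores the original bytes exactly
const decoded = Buffer.from(base64, 'base64');
console.log(decoded.equals(rawBytes)); // → true
```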

Here we are; let's make the request to the endpoint. Before that, go to RapidAPI and subscribe to the Shazam API to get the API key. That done, let's make the request:

const options = {
  method: 'POST',
  url: 'https://shazam.p.rapidapi.com/songs/detect',
  headers: {
    'content-type': 'text/plain',
    'x-rapidapi-key': `${process.env.SHAZAM_RAPID_API_KEY}`,
    'x-rapidapi-host': 'shazam.p.rapidapi.com'
  },
  data: `${base64}`
};
axios.request(options).then(function (response) {
  console.log(response.data);
  const matches = response.data.matches.length;
  console.log(matches);
});
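When a match is found, the response carries more than the match count. As a hedged sketch, here is a hypothetical helper for inspecting the body; the track.title and track.subtitle field names are assumptions about the response shape, not something confirmed in this article, so inspect response.data yourself:

```javascript
// Hypothetical helper: summarize a detect response.
// The "track" field names are assumptions; check the actual API response.
function summarize(data) {
  const matches = Array.isArray(data.matches) ? data.matches.length : 0;
  if (matches === 0) return { matched: false };
  return {
    matched: true,
    title: data.track && data.track.title,
    artist: data.track && data.track.subtitle
  };
}

// Example with a mock response body
console.log(summarize({ matches: [{}], track: { title: 'Song', subtitle: 'Artist' } }));
// → { matched: true, title: 'Song', artist: 'Artist' }
console.log(summarize({ matches: [] })); // → { matched: false }
```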

III. Client side

The Shazam API from RapidAPI accepts only a base64 string as the body of the request. That means we are going to record a piece of a song, get its blob, then convert that blob to a base64 string. Note that the recorded audio must be mono (a single channel).

III.1. Create a basic "ejs" template in the "views" folder, named "record.ejs"

<!DOCTYPE html>
<html lang="en">
  <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <script src="https://cdn.socket.io/3.1.3/socket.io.min.js" integrity="sha384-cPwlPLvBTa3sKAgddT6krw0cJat7egBga3DJepJyrLl4Q9/5WLra3rrnMcyTyOnh"           crossorigin="anonymous"></script>
          <meta name="viewport" content="width=device-width, initial-scale=1.0">
          <title>Shazam</title>
 </head>
  <body class="main-body">
           <button id="record">Record</button>
  <script src="/static/record.js"></script>
 </body>
</html>

The code above, as you can see, is a simple ejs template that contains a button called record. After that, let's create in the src folder the JavaScript file called record.js, which is imported in record.ejs by the script tag.

// socket.io client, loaded from the CDN script tag in record.ejs;
// with no URL it connects back to the server that served this page
const socket = io();

var gumStream; // stream from getUserMedia()
var rec; // Recorder.js object
var input; // MediaStreamAudioSourceNode we'll be recording

// shim for AudioContext when it's not available
const AudioContext = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContext();
const record = document.getElementById('record');

record.addEventListener('click', startRecording);

function startRecording() {
  console.log('recording started');
  var constraints = {
    audio: true,
    video: false
  };

  navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
    /* assign to gumStream for later use */
    gumStream = stream;
    /* use the stream */
    input = audioContext.createMediaStreamSource(stream);
    /* Create the Recorder object (from Recorder.js, which must also be
       loaded on the page) and configure it to record mono sound (1 channel);
       recording 2 channels would double the file size */
    rec = new Recorder(input, {
      numChannels: 1
    });
    // start the recording process
    rec.record();
    // stop automatically after 3 seconds
    setTimeout(stopRecording, 3000);
  }).catch(function (err) {
    console.log(err);
  });
}

function stopRecording() {
  // tell the recorder to stop the recording
  rec.stop();
  // stop microphone access
  gumStream.getAudioTracks()[0].stop();
  // create the WAV blob and pass it on to createDownloadLink
  rec.exportWAV(createDownloadLink);
}

function createDownloadLink(blob) {
  // send the blob to the server
  socket.emit('song', blob);
}

Let's explain what happens in the code above: we created global variables that are going to store our media stream.

var gumStream;
//stream from getUserMedia() 
var rec;
//Recorder.js object 
var input;
//MediaStreamAudioSourceNode we'll be recording

To work with the Web Audio API, we created the audio context, which gives us access to all the functionalities of the audio API.

const AudioContext = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContext;

Another way to create a context is:

const audioContext = new AudioContext();

Now we have our principal variables, and we also have access to all the functionalities of the audio API. The recording is launched by the “click” event on the record button.

record.addEventListener('click', startRecording);

Then we created the “startRecording” function.

function startRecording() {
       ...
}

First, in this function we define which kind of media stream we need to record; in this case we need only audio.

function startRecording() {
    
    var constraints = {
    audio: true,
    video: false
       }
}

Then navigator.mediaDevices.getUserMedia(constraints) prompts the user for permission to use a media input, which produces a media stream with the tracks specified in the constraints object. Note that this method returns a “Promise” that resolves to a MediaStream object. If the user denies permission, or no matching media is available, the promise is rejected with NotAllowedError or NotFoundError respectively.

Once the promise resolves, we get the stream, which is stored in var gumStream.

function startRecording() {
    
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
    ...
      }
}

By default the sound recorded by the media device has two channels. To manipulate our stream, we used the createMediaStreamSource() method of the audioContext. A new MediaStreamAudioSourceNode object, representing the audio node whose media is obtained from the specified source stream, is stored in the input variable.

function startRecording() {
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        input = audioContext.createMediaStreamSource(stream);
        ...
    });
}

Then we created a Recorder object configured to record mono sound (1 channel).

function startRecording() {
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        input = audioContext.createMediaStreamSource(stream);
        rec = new Recorder(input, {
            numChannels: 1
        });
        ...
    });
}

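To see concretely why one channel halves the amount of data, here is a toy illustration (not Recorder.js code) of downmixing stereo to mono by averaging the left and right samples:

```javascript
// Toy illustration: two channels of PCM sample values are merged into one
// by averaging, so the mono result holds half as many samples overall.
function downmixToMono(left, right) {
  return left.map((sample, i) => (sample + right[i]) / 2);
}

console.log(downmixToMono([200, 400], [600, 0])); // → [ 400, 200 ]
```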

After that, we are now able to start the recording with rec.record(), but we don’t need to record the whole song, just the few seconds that are useful to the Shazam API; let’s take 3 seconds. To handle that, we used setTimeout(stopRecording, 3000). stopRecording() stops the recording process with rec.stop() and stops access to the recording media device (in this case the microphone). Inside this function we also created the WAV (Waveform Audio File Format, an audio file format standard for storing an audio bitstream) blob and passed it to the createDownloadLink() function, which sends it to the server using socket.io.

function stopRecording() {
//tell the recorder to stop the recording 
rec.stop();
//stop microphone access
gumStream.getAudioTracks()[0].stop();
//create the wav blob and pass it on to createDownloadLink 
rec.exportWAV(createDownloadLink);
}
function createDownloadLink(blob) {
// send the blob to the server
socket.emit('song', blob);
}
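For illustration, on the server you could sanity-check that what arrived really is WAV data before encoding it. This looksLikeWav helper is hypothetical (not part of the article's code), based on the fact that a WAV file starts with the ASCII markers "RIFF" (bytes 0-3) and "WAVE" (bytes 8-11):

```javascript
// Hypothetical sanity check: inspect the RIFF/WAVE markers of a WAV header
function looksLikeWav(buffer) {
  return buffer.slice(0, 4).toString('ascii') === 'RIFF' &&
         buffer.slice(8, 12).toString('ascii') === 'WAVE';
}

// A minimal fake 12-byte header for demonstration
const header = Buffer.concat([
  Buffer.from('RIFF', 'ascii'),
  Buffer.alloc(4),              // chunk size placeholder
  Buffer.from('WAVE', 'ascii')
]);
console.log(looksLikeWav(header)); // → true
```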

Conclusion

Thanks for reading.
You can find the full code here.
