Vikram Vaswani

Originally published at docs.rev.ai

Integrate Rev AI's Topic Extraction API with a Node.js Application

By Vikram Vaswani, Developer Advocate

This tutorial was originally published at https://docs.rev.ai/resources/tutorials/integrate-topic-extraction-api-nodejs/ on Jun 13, 2022.

Introduction

Topic extraction attempts to detect the topics or subjects of a document. It is useful in a number of different scenarios, including:

  • Auto-generated agendas for meetings and phone calls
  • Automated classification or keyword indexing for digital media libraries
  • Automated tagging for Customer Service (CS) complaints or support tickets

Rev AI offers a Topic Extraction API that identifies important keywords and corresponding topics in transcribed speech. For application developers, it provides a fast and accurate way to retrieve and rank the core subjects in a transcribed conversation and then take further actions based on this information.

This tutorial explains how to integrate the Rev AI Topic Extraction API into your Node.js application.

Assumptions

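This tutorial assumes that you have a Rev AI access token, a working Node.js development environment, and a transcript (plaintext or JSON) to analyze.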

NOTE: The Topic Extraction API is under active development. Always refer to the API documentation for the most up-to-date information.

Step 1: Install Axios

The Topic Extraction API is a REST API and, as such, you will need an HTTP client to interact with it. This tutorial uses Axios, a popular Promise-based HTTP client for Node.js.

Begin by installing Axios into your application directory:

npm install axios

Within your application code, initialize Axios as below:

const axios = require('axios');
const token = '<REVAI_ACCESS_TOKEN>';

// create a client
const http = axios.create({
  baseURL: 'https://api.rev.ai/topic_extraction/v1beta/',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json'
  }
});

Here, the Axios HTTP client is initialized with the base endpoint for the Topic Extraction API, which is https://api.rev.ai/topic_extraction/v1beta/.

Every request to the API must be in JSON format and must include an Authorization header containing your API access token. The code shown above also attaches these required headers to the client.
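
Hard-coding the token is fine for a quick test, but a common alternative is to read it from an environment variable. The sketch below assumes a variable named REVAI_ACCESS_TOKEN (a name chosen here for illustration, not one mandated by Rev AI) and a hypothetical helper, buildClientConfig(), that returns the same options object passed to axios.create() above.

```javascript
// Hypothetical helper: builds the options object for axios.create().
// The environment variable name REVAI_ACCESS_TOKEN is an assumption of this sketch.
const buildClientConfig = (token = process.env.REVAI_ACCESS_TOKEN) => {
  if (!token) {
    throw new Error('Missing Rev AI access token');
  }
  return {
    baseURL: 'https://api.rev.ai/topic_extraction/v1beta/',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    }
  };
};

// Usage (requires axios to be installed):
// const http = require('axios').create(buildClientConfig());
```

Failing early when the token is missing tends to be easier to debug than an authorization error returned later by the API.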

Step 2: Submit transcript for topic extraction

To perform topic extraction on a transcript, you must begin by submitting an HTTP POST request containing the transcript content, in either plaintext or JSON, to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs.

The code listings below perform this operation using the HTTP client initialized in Step 1, for both plaintext and JSON transcripts:

const submitTopicExtractionJobText = async (textData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      text: textData
    }))
    .then(response => response.data)
    .catch(console.error);
};

const submitTopicExtractionJobJson = async (jsonData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      json: jsonData
    }))
    .then(response => response.data)
    .catch(console.error);
};

If you were to inspect the return value of the functions shown above, here is an example of what you would see:

{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  status: 'in_progress',
  type: 'topic_extraction'
}

The API response contains a job identifier (id field). This job identifier will be required to check the job status and obtain the job result.
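
Because the functions above pass request errors to console.error, a failed submission makes them resolve to undefined rather than throw. A small guard makes that case explicit before the identifier is used in later steps. This is a sketch only; getJobId() is a hypothetical helper name.

```javascript
// Hypothetical helper: extracts the job identifier from a submission response.
// Throws if the submission failed (response is undefined) or carries no usable id.
const getJobId = (job) => {
  if (!job || typeof job.id !== 'string' || job.id.length === 0) {
    throw new Error('Topic extraction job was not created');
  }
  return job.id;
};

// Example using the response shape shown above
const jobId = getJobId({
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  status: 'in_progress',
  type: 'topic_extraction'
});
console.log(jobId); // W6DvsEjteqwV
```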

Learn more about submitting a topic extraction job in the API reference guide.

Step 3: Check job status

Topic extraction jobs usually complete within 10-20 seconds. To check the status of the job, you must submit an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs/<ID>, where <ID> is a placeholder for the job identifier.

The code listing below demonstrates this operation:

const getTopicExtractionJobStatus = async (jobId) => {
  return await http.get(`jobs/${jobId}`)
    .then(response => response.data)
    .catch(console.error);
};

Here is an example of the API response to the previous request after the job has completed:

{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  completed_on: '2022-04-13T09:16:07.17Z',
  word_count: 13,
  status: 'completed',
  type: 'topic_extraction'
}

Learn more about retrieving the status of a topic extraction job in the API reference guide.
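
When acting on the status field, it can help to separate states in which the job is still running from states in which it has finished. The sketch below assumes the API reports 'in_progress', 'completed', and 'failed'. Only the first two appear in this tutorial, so treat 'failed' as an assumption to verify against the API reference. isJobFinished() is a hypothetical helper name.

```javascript
// Statuses assumed for this sketch; 'failed' does not appear in the tutorial output.
const FINISHED_STATUSES = new Set(['completed', 'failed']);

// Hypothetical helper: true once the job no longer needs to be polled.
// Also tolerates an undefined response from a failed status request.
const isJobFinished = (job) => Boolean(job) && FINISHED_STATUSES.has(job.status);

console.log(isJobFinished({ status: 'in_progress' })); // false
console.log(isJobFinished({ status: 'completed' }));   // true
```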

Step 4: Retrieve topic extraction report

Once the topic extraction job's status changes to completed, you can retrieve the results by submitting an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs/<ID>/result, where <ID> is a placeholder for the job identifier.

The code listing below demonstrates this operation:

const getTopicExtractionJobResult = async (jobId) => {
  return await http.get(`jobs/${jobId}/result`,
    { headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' } })
    .then(response => response.data)
    .catch(console.error);
};

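If the job status is completed, the return value of the above function is a JSON-encoded topic extraction report listing each detected topic, its score, and the sentences that support it. If the job status is not completed, the API responds with an error instead; because of the .catch(console.error) handler, the function logs that error and resolves to undefined.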

Here is an example of the topic extraction report returned from a completed job:

{
  "topics": [
    {
      "topic_name": "incredible team",
      "score": 0.9,
      "informants": [
        {
          "content": "We have 17 folks and, uh, I think we have an incredible team and I just want to talk about some things that we've done that I think have helped us get there.",
          "ts": 71.4,
          "end_ts": 78.39
        },
        {
          "content": "Um, it's sort of the overall thesis for this one.",
          "ts": 78.96,
          "end_ts": 81.51
        },
        {
          "content": "One thing that's worth keeping in mind is that recruiting is a lot of work.",
          "ts": 81.51,
          "end_ts": 84
        },
        {
          "content": "Some people think that you can raise money and spend a few weeks building your team and then move on to more",
          "ts": 84.21,
          "end_ts": 88.47
        }
      ]
    },
    {
      ...
    }
  ]
}
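
Each topic in the report carries its supporting sentences in the informants array, with ts and end_ts marking where each sentence starts and ends (assumed here to be numeric offsets into the audio). As a rough sketch, a hypothetical summarizeTopics() helper can reduce the report to one summary object per topic:

```javascript
// Hypothetical helper: reduces a report to one summary object per topic.
// Assumes ts and end_ts are numbers, as in the example report above.
const summarizeTopics = (report) =>
  (report.topics || []).map(({ topic_name, score, informants = [] }) => ({
    topic: topic_name,
    score,
    sentences: informants.length,
    start: informants.length ? Math.min(...informants.map(i => i.ts)) : null,
    end: informants.length ? Math.max(...informants.map(i => i.end_ts)) : null
  }));

// Sample data taken from the example report, with sentence content shortened
const sample = {
  topics: [{
    topic_name: 'incredible team',
    score: 0.9,
    informants: [
      { content: '(shortened)', ts: 71.4, end_ts: 78.39 },
      { content: '(shortened)', ts: 84.21, end_ts: 88.47 }
    ]
  }]
};

// One entry: topic 'incredible team', score 0.9, 2 sentences, spanning 71.4 to 88.47
console.log(summarizeTopics(sample));
```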

It's also possible to filter the result set so that it returns only topics scoring above a certain value, by adding a threshold query parameter to the request.
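
One way to pass this parameter with Axios is through the params request option, which Axios serializes into the query string. The sketch below builds the request options for a given threshold. buildResultOptions() is a hypothetical helper, and the accepted range of threshold values should be checked in the API reference.

```javascript
// Hypothetical helper: request options for the jobs/<ID>/result endpoint.
// The Accept header matches the one used in getTopicExtractionJobResult().
const buildResultOptions = (threshold) => {
  const options = {
    headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' }
  };
  if (typeof threshold === 'number') {
    options.params = { threshold }; // Axios appends this as ?threshold=<value>
  }
  return options;
};

// Usage with the client from Step 1:
// http.get(`jobs/${jobId}/result`, buildResultOptions(0.8));
console.log(buildResultOptions(0.8).params); // { threshold: 0.8 }
```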

Learn more about obtaining a topic extraction report in the API reference guide.

Step 5: Create and test a simple application

Using the code samples shown previously, it's possible to create a simple application that accepts a JSON transcript and returns a list of topics detected in it, as shown below:

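const main = async (jsonData) => {
  // submit the transcript; the helper resolves to undefined if the request failed
  const job = await submitTopicExtractionJobJson(jsonData);
  if (!job) {
    throw new Error('Job submission failed');
  }
  console.log(`Job submitted with id: ${job.id}`);

  // poll the job status every 15 seconds until it is no longer in progress
  await new Promise((resolve, reject) => {
    const interval = setInterval(() => {
      getTopicExtractionJobStatus(job.id)
        .then(r => {
          console.log(`Job status: ${r.status}`);
          if (r.status !== 'in_progress') {
            clearInterval(interval);
            resolve(r);
          }
        })
        .catch(e => {
          clearInterval(interval);
          reject(e);
        });
    }, 15000);
  });

  // retrieve and print the topic extraction report
  const jobResult = await getTopicExtractionJobResult(job.id);
  console.log(jobResult);
};

// extract topics from example Rev AI JSON transcript
// (plain axios is used so the API token is not attached to this request)
axios.get('https://www.rev.ai/FTC_Sample_1_Transcript.json')
  .then(response => main(response.data))
  .catch(console.error);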

This example application begins by fetching Rev AI's example JSON transcript and passing it to the main() function as input to be analyzed. The main() function submits this data to the Topic Extraction API using the submitTopicExtractionJobJson() function. It then uses setInterval() to poll the API every 15 seconds for the status of the job, with the first check taking place 15 seconds after submission. Once the job status is no longer in_progress, it calls getTopicExtractionJobResult() to retrieve the job result and prints it to the console.
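
The same polling pattern can be written as a reusable helper with an upper bound on the number of attempts, so that a job which never finishes does not keep the loop running forever. This is a minimal sketch: pollUntil(), its parameters, and the stand-in status function are all hypothetical and not part of the Rev AI API.

```javascript
// Hypothetical helper: calls check() until isDone(result) is true,
// waiting intervalMs between calls and giving up after maxAttempts.
const pollUntil = async (check, isDone, { intervalMs = 15000, maxAttempts = 20 } = {}) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await check();
    if (isDone(result)) {
      return result;
    }
    if (attempt < maxAttempts) {
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
};

// Stand-in for getTopicExtractionJobStatus(): reports completion on the third call
let calls = 0;
const fakeStatus = async () => ({ status: ++calls < 3 ? 'in_progress' : 'completed' });

pollUntil(fakeStatus, r => r && r.status !== 'in_progress', { intervalMs: 10 })
  .then(r => console.log(`Job status: ${r.status} after ${calls} checks`));
// Job status: completed after 3 checks
```

Unlike the setInterval() version, this helper performs its first check immediately and never has two status requests in flight at once.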

Here is an example of the output returned by the code above:

Job submitted with id: xgKIzeODYYba
Job status: completed
{
  topics: [
    { topic_name: 'quick overview', score: 0.9, informants: [Array] },
    { topic_name: 'concert tickets', score: 0.9, informants: [Array] },
    { topic_name: 'dividends', score: 0.9, informants: [Array] },
    { topic_name: 'quick background', score: 0.6, informants: [Array] }
  ]
}

NOTE: The code listing above polls the API repeatedly to check the status of the topic extraction job. This is presented only for illustrative purposes and is strongly discouraged in production. For production scenarios, use webhooks to receive an asynchronous notification once the topic extraction job completes.
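
As a rough sketch of the receiving side, the listing below uses Node's built-in http module to accept a POST notification and extract the job details. The payload shape ({ job: { id, status } }), the /webhook path, and the port are assumptions made for illustration. Consult the Rev AI webhook documentation for the actual payload and for how to register the URL when submitting a job.

```javascript
// Named nodeHttp to avoid clashing with the Axios client `http` from Step 1
const nodeHttp = require('http');

// Hypothetical parser: assumes a body like { "job": { "id": "...", "status": "..." } }
const parseNotification = (rawBody) => {
  const payload = JSON.parse(rawBody);
  const job = payload.job || payload;
  return { id: job.id, status: job.status };
};

// Creates (but does not start) a server that handles POST /webhook
const createWebhookServer = (onJob) => nodeHttp.createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/webhook') {
    res.writeHead(404).end();
    return;
  }
  let body = '';
  req.on('data', chunk => { body += chunk; });
  req.on('end', () => {
    try {
      onJob(parseNotification(body));
      res.writeHead(200).end();
    } catch (err) {
      // malformed JSON or a failing handler
      res.writeHead(400).end();
    }
  });
});

// Usage: createWebhookServer(job => console.log(job)).listen(3000);
```

In the onJob callback, a completed job could then be passed to getTopicExtractionJobResult() to fetch the report without any polling.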

Next steps

Learn more about the topics discussed in this tutorial by visiting the following links:
