Vikram Vaswani

Originally published at docs.rev.ai

Get Started with Speech Recognition in PHP

By Vikram Vaswani, Developer Advocate

This tutorial was originally published at https://docs.rev.ai/resources/tutorials/get-started-php/ on May 31, 2022.

Introduction

Rev AI offers a suite of speech-to-text APIs to help developers build automatic speech recognition (ASR) into their applications. These APIs cover a variety of use cases, including live and pre-recorded audio transcription, language identification, sentiment analysis and topic extraction.

To help developers integrate these APIs into their applications, Rev AI also offers SDKs for Node, Java and Python. However, because most of these APIs are REST APIs, it's also easy to use them with other languages...including one of my most frequently-used ones, PHP.

In this tutorial, I'll introduce you to the basics of using Rev AI's Asynchronous Speech-to-Text API using PHP and Guzzle. If you've ever wondered if you could add speech recognition to your PHP application, but didn't know where to start, this tutorial will give you all the information you need to start making requests to, and handling responses from, Rev AI's ASR APIs.

Assumptions

This tutorial assumes that:

Step 1: Install Guzzle

The Asynchronous Speech-to-Text API is a REST API and, as such, you will need an HTTP client to interact with it. This tutorial uses Guzzle 7.x, a popular PHP HTTP client.

Begin by installing Guzzle into your application directory with Composer:

composer require guzzlehttp/guzzle:^7.0

Within your application code, initialize Guzzle as below. Replace the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI access token:

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';

$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

Here, the Guzzle HTTP client is initialized with the base endpoint for the Asynchronous Speech-to-Text API, which is https://api.rev.ai/speechtotext/v1/.

Every request to the API must be in JSON format and must include an Authorization header containing your API access token. The code shown above also attaches this required header to the client.
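Hardcoding the token is convenient for a quick test, but you may prefer to keep it out of your source code. The following is a minimal sketch that reads the token from an environment variable and builds the headers array. The variable name REVAI_ACCESS_TOKEN and the helper name buildAuthHeaders() are assumptions made for this illustration; they are not defined by Rev AI or Guzzle.

```php
<?php

/**
 * Build the default headers for the Guzzle client from an environment variable.
 * The variable name REVAI_ACCESS_TOKEN is an assumption made for this sketch.
 */
function buildAuthHeaders(string $envVar = 'REVAI_ACCESS_TOKEN'): array
{
    $token = getenv($envVar);

    // getenv() returns false when the variable is not set
    if ($token === false || trim($token) === '') {
        throw new RuntimeException("Environment variable $envVar is not set");
    }

    return ['Authorization' => 'Bearer ' . trim($token)];
}
```

The returned array can be passed as the 'headers' option when creating the Guzzle client, in place of the inline array used in this tutorial.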

Step 2: Submit a file for transcription

To generate a transcript from an audio file, you must begin by submitting an HTTP POST request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs.

The following example demonstrates how to submit a remote audio file for transcription.

To use this example, replace the <URL> placeholder with the public URL to the file you wish to transcribe and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$fileUrl = '<URL>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send POST request and get response body
$response = $client->request(
    'POST',
    'jobs',
    ['json' => ['media_url' => $fileUrl]]
)
->getBody()
->getContents();

// decode response JSON and print
print_r(json_decode($response));

This example makes a POST request to the API, passing it the URL to the audio file to be transcribed as a JSON document. The response body is then received, parsed and decoded into a PHP object and printed to the console.

To run this example, save it as a file, such as example.php, and then execute php example.php.

Here is an example of the script output, representing the API response:

stdClass Object
(
    [id] => sTfRgVlLCYkt
    [created_on] => 2022-04-06T13:35:40.6Z
    [name] => FTC_Sample_1.mp3
    [media_url] => https://www.rev.ai/FTC_Sample_1.mp3
    [status] => in_progress
    [type] => async
    [language] => en
)

The API response contains a job identifier (id field). This job identifier will be required to check the job status and obtain the job result.
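Because the later steps depend on the job identifier, it can help to fail early if the response does not contain one. The following sketch shows one way to do this; the helper name extractJobId() is hypothetical, and the sample body only mirrors a subset of the fields shown in the output above.

```php
<?php

/**
 * Decode a job submission response and return its job identifier.
 */
function extractJobId(string $responseBody): string
{
    // JSON_THROW_ON_ERROR raises a JsonException on malformed JSON (PHP 7.3+)
    $job = json_decode($responseBody, false, 512, JSON_THROW_ON_ERROR);

    if (!is_object($job) || !isset($job->id) || !is_string($job->id)) {
        throw new UnexpectedValueException('Response does not contain a job id');
    }

    return $job->id;
}

// sample body mirroring some of the fields shown in the output above
$body = '{"id":"sTfRgVlLCYkt","status":"in_progress","type":"async","language":"en"}';
echo extractJobId($body) . PHP_EOL; // prints "sTfRgVlLCYkt"
```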

It is also possible to use a local audio file and submit it to the API as multipart/form-data.

The following example demonstrates how to submit a local audio file for transcription.

To use this example, replace the <FILEPATH> placeholder with the path to the file you wish to transcribe and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$file = '<FILEPATH>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send POST request and get response body
$response = $client->request(
    'POST',
    'jobs',
    ['multipart' => [['name' => 'media','contents' => fopen($file, 'r')]]]
)
->getBody()
->getContents();

// decode response JSON and print
print_r(json_decode($response));

Learn more about submitting an asynchronous transcription job in the API reference guide.
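The local-file example passes the result of fopen() directly to Guzzle, so a wrong path only surfaces as a warning followed by a failed request. One possible refinement, sketched below, is to check the file first. The helper name openMediaFile() is hypothetical and not part of Guzzle or the Rev AI API.

```php
<?php

/**
 * Open a local media file for upload, failing early with a clear message.
 *
 * @return resource Stream suitable for the multipart 'contents' option
 */
function openMediaFile(string $path)
{
    if (!is_file($path) || !is_readable($path)) {
        throw new InvalidArgumentException("Cannot read media file: $path");
    }

    // 'rb' opens the file for reading in binary mode
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Failed to open media file: $path");
    }

    return $handle;
}
```

With this helper, the multipart entry becomes ['name' => 'media', 'contents' => openMediaFile($file)].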

Step 3: Check transcription status

To check the status of the transcription job, you must submit an HTTP GET request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs/<ID>, where <ID> is a placeholder for the job identifier.

The following example demonstrates how to check the status of an asynchronous transcription job.

To use this example, replace the <ID> placeholder with the job identifier and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$jobId = '<ID>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send GET request and get response body
$response = $client->request(
    'GET',
    "jobs/$jobId"
)
->getBody()
->getContents();

// decode response JSON and print
print_r(json_decode($response));

Here is an example of the script output after the job has completed:

stdClass Object
(
    [id] => sTfRgVlLCYkt
    [created_on] => 2022-04-06T13:35:40.6Z
    [completed_on] => 2022-04-06T13:36:16.275Z
    [name] => FTC_Sample_1.mp3
    [media_url] => https://www.rev.ai/FTC_Sample_1.mp3
    [status] => transcribed
    [duration_seconds] => 107
    [type] => async
    [language] => en
)

Learn more about retrieving the status of an asynchronous transcription job in the API reference guide.
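Application code usually needs to decide what to do next based on the status field. The sketch below maps a status string to a next action. The in_progress and transcribed values appear in the outputs above; the failed value and the action names are assumptions for this illustration and should be checked against the API reference.

```php
<?php

/**
 * Map a job status string to the next action for the caller.
 * 'failed' is an assumed status value; verify it against the API reference.
 */
function nextActionForStatus(string $status): string
{
    switch ($status) {
        case 'in_progress':
            return 'wait';
        case 'transcribed':
            return 'fetch_transcript';
        case 'failed':
            return 'report_failure';
        default:
            return 'unknown';
    }
}

echo nextActionForStatus('transcribed') . PHP_EOL; // prints "fetch_transcript"
```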

Step 4: Retrieve the transcript

Once the job's status changes to transcribed, you can retrieve the results by submitting an HTTP GET request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs/<ID>/transcript, where <ID> is a placeholder for the job identifier.

The following example demonstrates how to retrieve the results of an asynchronous transcription job.

To use this example, replace the <ID> placeholder with the job identifier and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$jobId = '<ID>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send GET request and get response body
$response = $client->request(
    'GET',
    "jobs/$jobId/transcript",
    ['headers' => ['Accept' => 'application/vnd.rev.transcript.v1.0+json']]
)
->getBody()
->getContents();

// decode response JSON and print
print_r(json_decode($response));

If the job status is transcribed, the API returns a JSON-encoded response containing the transcript, which the script decodes and prints. If the job status is not transcribed, the API returns an error instead.

Here is an example of the transcript returned from a successful job, represented as a PHP object:

stdClass Object
(
    [monologues] => Array
        (
            [0] => stdClass Object
                (
                    [speaker] => 0
                    [elements] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [type] => text
                                    [value] => Hi
                                    [ts] => 0.27
                                    [end_ts] => 0.48
                                    [confidence] => 1
                                )
                            [1] => stdClass Object
                                (
                                    [type] => punct
                                    [value] => ,
                                )
                            [2] => stdClass Object
                                (
                                    [type] => punct
                                    [value] =>
                                )
                            ...
                        )
                )
        )
)

Learn more about obtaining a transcript in the API reference guide.
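The transcript object nests each word and punctuation mark inside monologues and elements, so a common follow-up is to flatten it into readable text. The following is a minimal sketch based only on the structure shown above; the helper name transcriptToText() and the "Speaker N:" prefix are choices made for this illustration.

```php
<?php

/**
 * Join the value of every element in each monologue into one line of text.
 */
function transcriptToText(stdClass $transcript): string
{
    $lines = [];

    foreach ($transcript->monologues ?? [] as $monologue) {
        $text = '';
        foreach ($monologue->elements ?? [] as $element) {
            // both 'text' and 'punct' elements carry their content in 'value'
            $text .= $element->value ?? '';
        }
        $lines[] = sprintf('Speaker %d: %s', $monologue->speaker ?? 0, trim($text));
    }

    return implode(PHP_EOL, $lines);
}

// the first three elements from the example transcript above
$sample = json_decode(
    '{"monologues":[{"speaker":0,"elements":['
    . '{"type":"text","value":"Hi","ts":0.27,"end_ts":0.48,"confidence":1},'
    . '{"type":"punct","value":","},'
    . '{"type":"punct","value":" "}]}]}'
);
echo transcriptToText($sample) . PHP_EOL; // prints "Speaker 0: Hi,"
```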

Step 5: Create and test a simple application

Using the code samples shown previously, it's possible to create a custom Rev AI API client class encapsulating these functions:

<?php

require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

class RevAiApiClient
{
    /**
     * @var Client GuzzleHttp client object
     */
    private $client;

    /**
     * Construct API client with default base path
     * and authorization
     *
     * @param string $token Rev AI access token
     */
    public function __construct($token)
    {
        if (empty($token)) {
            throw new Exception('Access token missing');
        }

        $this->client = new Client([
            'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
            'headers' => ['Authorization' => "Bearer $token"],
        ]);
    }


    /**
     * Submit a remote audio file for transcription
     *
     * @param string $fileUrl URL to remote file
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function submitAsychronousJobRemote($fileUrl)
    {
        return json_decode(
            $this->client->request(
                'POST',
                'jobs',
                ['json' => ['media_url' => $fileUrl]]
            )->getBody()->getContents()
        );
    }

    /**
     * Submit a local audio file for transcription
     *
     * @param string $file Path to local file
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function submitAsychronousJobLocal($file)
    {
        return json_decode(
            $this->client->request(
                'POST',
                'jobs',
                ['multipart' => [['name' => 'media','contents' => fopen($file, 'r')]]]
            )->getBody()->getContents()
        );
    }

    /**
     * Get transcription job status
     *
     * @param string $id Transcription job ID
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function getAsychronousJobStatus($id)
    {
        return json_decode(
            $this->client->request(
                'GET',
                "jobs/$id"
            )->getBody()->getContents()
        );
    }

    /**
     * Get transcription job result
     *
     * @param string $id Transcription job ID
     *
     * @return stdClass Rev AI Transcript API endpoint response object
     */
    public function getAsychronousJobResult($id)
    {
        return json_decode(
            $this->client->request(
                'GET',
                "jobs/$id/transcript",
                ['headers' => ['Accept' => 'application/vnd.rev.transcript.v1.0+json']]
            )->getBody()->getContents()
        );
    }
}

Save the above client as RevAiApiClient.php.

You can now use this client in a simple application that accepts a local audio file and returns its transcript, as shown below:

<?php
require __DIR__ . '/RevAiApiClient.php';

$token = '<REVAI_ACCESS_TOKEN>';
$file = '<FILEPATH>';

// initialize the Rev AI API client
$client = new RevAiApiClient($token);

// submit a local file for transcription
$jobSubmissionResponse = $client->submitAsychronousJobLocal($file);

// get the job ID and status
$jobId = $jobSubmissionResponse->id;
$jobStatus = $jobSubmissionResponse->status;
echo "Job submitted with id: $jobId" . PHP_EOL;

// check the job status periodically
while ($jobStatus == 'in_progress') {
    // wait before each check so the script continues promptly once the job finishes
    sleep(30);
    $jobStatus = $client->getAsychronousJobStatus($jobId)->status;
    echo "Job status: $jobStatus" . PHP_EOL;
}

// retrieve and print the transcript
if ($jobStatus == 'transcribed') {
    print_r($client->getAsychronousJobResult($jobId));
}

This example application begins by creating an instance of the RevAiApiClient class defined previously, passing the Rev AI access token to the constructor. It submits a local file for transcription using the submitAsychronousJobLocal() method, then uses the getAsychronousJobStatus() method to poll the API every 30 seconds for the job status. Once the status is no longer in_progress, the loop ends; if the final status is transcribed, the application uses the getAsychronousJobResult() method to retrieve the transcript and prints it to the console.

Here is an example of the output generated by the example application:

Job submitted with id: RWviMy7nISeS
Job status: in_progress
Job status: transcribed
stdClass Object
(
    [monologues] => Array
        (
            [0] => stdClass Object
                (
                    [speaker] => 0
                    [elements] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [type] => text
                                    [value] => 1, 2, 3
                                    [ts] => 0.03
                                    [end_ts] => 2.31
                                    [confidence] => 0.95
                                )
                            [1] => stdClass Object
                                (
                                    [type] => punct
                                    [value] => .
                                )
                        )
                )
        )
)

NOTE: The example above polls the API repeatedly to check the status of the transcription job. This is shown for illustrative purposes only and is strongly discouraged in production. For production scenarios, use webhooks to receive a notification asynchronously once the job completes.
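As a rough sketch of the webhook approach, the receiving script needs to read the notification body and extract the job identifier and status. The example below assumes the notification is a JSON document shaped like {"job": {"id": "...", "status": "..."}}; that shape, and the helper name parseJobNotification(), are assumptions for this illustration. The payload format and the option used to register a webhook URL should be confirmed in the API reference.

```php
<?php

/**
 * Parse a webhook notification body into [jobId, status].
 * Assumes a payload shaped like {"job": {"id": "...", "status": "..."}}.
 */
function parseJobNotification(string $rawBody): array
{
    $payload = json_decode($rawBody, true);

    if (!is_array($payload) || !isset($payload['job']['id'], $payload['job']['status'])) {
        throw new UnexpectedValueException('Unrecognized webhook payload');
    }

    return [$payload['job']['id'], $payload['job']['status']];
}

// in a real endpoint, the body would come from file_get_contents('php://input')
$example = '{"job":{"id":"RWviMy7nISeS","status":"transcribed"}}';
[$jobId, $status] = parseJobNotification($example);
echo "Job $jobId is now $status" . PHP_EOL; // prints "Job RWviMy7nISeS is now transcribed"
```

Once the status is known, the handler can retrieve the transcript in the same way as the polling example, without any sleep() calls.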

Next steps

Learn more about the topics discussed in this tutorial by visiting the following links:
