DEV Community

Muhammad Ahmad

Convert Video to text in Ruby

Setup on Google Cloud Console

  • Create a project (e.g. My-Project), or select an existing one.
  • Start your free trial on Google Cloud Platform. A credit card is required to activate the trial.
  • Enable the Cloud Speech-to-Text API for that project.
  • Create a service account: IAM -> Service Accounts -> Create Service Account.
  • Download a private key for the service account as JSON.
  • Create a bucket: Google Cloud Storage -> Browse.
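
The downloaded key is a JSON file that also records the project ID, so a script can read the ID from the key instead of hard-coding it. A minimal sketch, assuming the usual service-account key layout with `type` and `project_id` fields; the helper name `project_id_from_key` is hypothetical:

```ruby
require "json"

# Hypothetical helper: reads the project ID out of a service-account key file.
# Assumes the usual key layout with "type" and "project_id" fields.
def project_id_from_key(path)
  key = JSON.parse(File.read(path))
  unless key["type"] == "service_account"
    raise ArgumentError, "#{path} does not look like a service-account key"
  end
  key.fetch("project_id")
end
```

It could be used as `project_id = project_id_from_key("file_name.json")`.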

Installations on Local System

Run these commands from a terminal.

  • Install Google Cloud Storage gem

    gem install google-cloud-storage

  • Install Google Cloud Speech gem

    gem install google-cloud-speech

  • Install ffmpeg
    sudo apt-get install ffmpeg (Ubuntu)
    brew install ffmpeg (Mac)
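
Since the script shells out to ffmpeg, it can help to confirm the binary is reachable before doing any work. A small sketch, assuming ffmpeg is expected somewhere on `PATH`; `executable_on_path?` is a hypothetical helper, not part of any gem:

```ruby
# Hypothetical helper: returns true if an executable called `name`
# exists in one of the directories listed in `path` (defaults to ENV["PATH"]).
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end
```

At the top of the script, a line such as `abort "ffmpeg not found" unless executable_on_path?("ffmpeg")` would stop early with a readable message.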

Ruby Code

require "google/cloud/speech"
require "google/cloud/storage"

#Google cloud project id
project_id = "google_cloud_project_id"
#Downloaded key file
key_file   = "file_name.json"

Convert video to audio

Convert the video file to an audio file using ffmpeg.

What is FLAC?
FLAC stands for Free Lossless Audio Codec, an audio format similar to MP3, but lossless, meaning that audio is compressed in FLAC without any loss in quality.

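We will run two commands: the first extracts the audio track as FLAC, and the second downmixes it to a single channel (-ac 1), because Speech-to-Text expects mono audio by default.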

system "ffmpeg -i video.mp4 audio_temp.flac"
system "ffmpeg -i audio_temp.flac -ac 1 audio_final.flac"
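
The extraction and the mono downmix can also be done in a single ffmpeg call, and passing the arguments as an array avoids shell-quoting problems with unusual file names. A sketch under those assumptions: `ffmpeg_command` and `convert_to_mono_flac` are hypothetical names, and `-vn` (skip the video stream) and `-y` (overwrite the output file) are standard ffmpeg options that the commands above do not use:

```ruby
# Builds the ffmpeg argument list for extracting mono FLAC audio from a video.
# The output format is inferred by ffmpeg from the .flac extension.
def ffmpeg_command(input, output)
  ["ffmpeg", "-y", "-i", input, "-vn", "-ac", "1", output]
end

# Runs the conversion and raises if ffmpeg is missing or exits with an error.
# `system` returns nil when the command cannot be run and false on failure.
def convert_to_mono_flac(input, output)
  ok = system(*ffmpeg_command(input, output))
  raise "ffmpeg failed for #{input}" unless ok
  output
end
```

With this in place, `convert_to_mono_flac("video.mp4", "audio_final.flac")` would produce the same file the upload step expects.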

Upload audio to Google Storage

First, access the project through the Storage API. Then create a new file in the bucket, which will be a copy of audio_final.flac.

Note: Here I'm using the first bucket on Google Cloud Storage. If you have more than one bucket, you can select whichever one you want.

storage = Google::Cloud::Storage.new project: project_id, keyfile: key_file
bucket_name = storage.buckets.first.name
puts bucket_name
bucket  = storage.bucket bucket_name
local_file_path = 'audio_final.flac'
file = bucket.create_file local_file_path, 'audio_cloud.flac'
puts "Uploaded #{file.name}"

Transcribe Audio to Text

Now we'll convert the audio that we uploaded to Cloud Storage into text.
Reference the file as gs://bucket-name/file-name. To transcribe another language, change language_code; for example, use "de-DE" for a German-language video.

speech = Google::Cloud::Speech.new
# Point at the file uploaded above instead of a hard-coded bucket
storage_path = "gs://#{bucket_name}/#{file.name}"

config = { encoding: :FLAC,
        language_code: "en-US" }
audio = { uri: storage_path }
operation = speech.long_running_recognize config, audio

audio_text = ''
puts "Operation started"
if !operation.nil?
    operation.wait_until_done!
    raise operation.results.message if operation.error?
    results = operation.response.results
    results.each do |result|
        audio_text << result.alternatives.first.transcript
        puts "Transcription: #{result.alternatives.first.transcript}"
    end
end

puts audio_text
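
Depending on how each segment's transcript is padded, appending them directly can leave missing or doubled spaces between segments. A small sketch of joining them with single spaces instead; `join_transcripts` is a hypothetical helper, and the `Struct` stand-ins only mimic the `alternatives` / `transcript` shape used in the loop above so the example runs without calling the API:

```ruby
# Stand-ins that mimic the shape of the recognition results (assumption:
# each result exposes `alternatives`, and each alternative a `transcript`).
Alternative = Struct.new(:transcript)
Result = Struct.new(:alternatives)

# Joins the top alternative of every result into one space-separated string,
# skipping results that have no alternatives or only blank text.
def join_transcripts(results)
  results
    .map { |result| result.alternatives.first }
    .compact
    .map { |alternative| alternative.transcript.strip }
    .reject(&:empty?)
    .join(" ")
end

sample = [
  Result.new([Alternative.new("hello world ")]),
  Result.new([]),
  Result.new([Alternative.new(" second segment")])
]
puts join_transcripts(sample)
```

In the script, `audio_text = join_transcripts(operation.response.results)` could replace the manual `<<` accumulation.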

Before running this script, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the downloaded key file.

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/file_name.json"
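
If the variable is missing or points at the wrong file, the failure only shows up once the API is called. A hedged sketch of checking it up front; `credentials_path` is a hypothetical helper, and the `env` parameter exists only so the check can be exercised without touching the real environment:

```ruby
# Hypothetical helper: returns the key file path from the environment,
# raising a descriptive error if the variable is unset or the file is missing.
def credentials_path(env = ENV)
  path = env["GOOGLE_APPLICATION_CREDENTIALS"]
  raise "GOOGLE_APPLICATION_CREDENTIALS is not set" if path.nil? || path.empty?
  raise "Key file not found: #{path}" unless File.file?(path)
  path
end
```

Calling `credentials_path` at the start of the script would fail fast with a clear message.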

Now run this script.

ruby translate_video_to_text.rb

You can find the complete code on GitHub.

Top comments (1)

niconisaw

Hello, I am using your code as a reference to get the transcription of a Japanese video. Unfortunately, I encountered some problems. Kindly visit this StackOverflow question I posted. stackoverflow.com/questions/678317...