Siddhesh Shankar

Posted on Sep 26, 2020

Build a Text to Speech Service with Python Flask Framework

#python #webdev #html #computerscience

Text-to-speech technology reads digital text aloud. It can take words on computers, smartphones, tablets, and other devices and convert them into audio. Python comes with a lot of handy and easily accessible libraries, and I am going to show you how to deliver text-to-speech with Python using pyttsx3, a text-to-speech conversion library. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3. The library is easy to use: it converts the text you enter into audio.


Steps:

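The application invokes the pyttsx3.init() factory function to get a reference to a pyttsx3.Engine instance.
getProperty('rate') returns the current speaking rate in words per minute; setProperty('rate', value) changes it.
Similarly, getProperty('volume') returns the current volume and setProperty('volume', value) sets it (min=0 and max=1).
voice is the string identifier of the active voice; getProperty('voices') returns the list of available voices.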
runAndWait() blocks while processing all currently queued commands and invokes callbacks for engine notifications appropriately. It returns when all commands queued before the call have been emptied from the queue.

## Code for converting Text to Speech

```python
import pyttsx3


def text_to_speech(text, gender):
    """
    Function to convert text to speech
    :param text: text to speak
    :param gender: 'Male' or 'Female'
    :return: None
    """
    voice_dict = {'Male': 0, 'Female': 1}
    code = voice_dict[gender]

    engine = pyttsx3.init()

    # Setting up voice rate (words per minute)
    engine.setProperty('rate', 125)

    # Setting up volume level between 0 and 1
    engine.setProperty('volume', 0.8)

    # Change voices: 0 for male and 1 for female
    # (which installed voice sits at each index varies by platform)
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[code].id)

    engine.say(text)
    engine.runAndWait()
```

1. Create a text that you want to convert into audio.
2. Select voice assistant.

```python
text = 'Hello ! My name is Siddhesh.'
gender = 'Male'  # Voice assistant
text_to_speech(text, gender)
```
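
The function above assumes that at least two voices are installed and that index 0 is male and index 1 is female, which depends on the speech driver of the platform. A minimal sketch of a more defensive lookup could look like the following. `pick_voice_index` and its fall-back-to-index-0 behaviour are hypothetical additions for illustration; they are not part of pyttsx3 or of the original code.

```python
def pick_voice_index(gender, voice_count, voice_dict=None):
    """Return a safe index into the list from engine.getProperty('voices').

    Hypothetical helper: falls back to index 0 when the requested
    gender is unknown or the machine has too few voices installed.
    """
    if voice_count < 1:
        raise ValueError("no voices are installed")
    if voice_dict is None:
        # Same mapping as in text_to_speech()
        voice_dict = {'Male': 0, 'Female': 1}
    index = voice_dict.get(gender, 0)
    # Guard against an IndexError on systems with a single voice
    return index if index < voice_count else 0


# Example: only one voice installed, 'Female' requested
print(pick_voice_index('Female', 1))  # prints 0
```

Inside `text_to_speech`, a call such as `code = pick_voice_index(gender, len(voices))` could replace the direct dictionary lookup once `voices` has been fetched.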

## Building a Web Service

This is a basic Flask application with a single default route. It imports Flask itself, the cross_origin decorator from flask_cors, and the text_to_speech function defined above (imported here from main.py):



# Importing the necessary Libraries
from flask_cors import cross_origin
from flask import Flask, render_template, request
from main import text_to_speech

app = Flask(__name__)


@app.route('/', methods=['POST', 'GET'])
@cross_origin()
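def homepage():
    """Render the form; on POST, speak the submitted text first."""
    if request.method == 'POST':
        # Field names match the `name` attributes in frontend.html
        text = request.form['speech']
        gender = request.form['voices']
        # Blocks until pyttsx3 has finished speaking on the machine running Flask
        text_to_speech(text, gender)
    # Both GET and POST render the same template
    return render_template('frontend.html')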


if __name__ == "__main__":
    app.run(port=8000, debug=True)
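
Note that `text_to_speech` plays the audio through the speakers of the machine running Flask, so a visitor on another machine hears nothing. One possible variation, sketched here under assumptions, writes the speech to a file with pyttsx3's `save_to_file` so that the route could return it to the browser instead. The helper name `synthesize_to_file` is hypothetical, the engine is passed in as a parameter so the function can be exercised without audio hardware, and the audio format actually written may depend on the platform's speech driver.

```python
import os
import tempfile


def synthesize_to_file(engine, text, suffix=".wav"):
    """Queue `text` on a pyttsx3-style engine and write it to a temp file.

    `engine` is expected to offer save_to_file() and runAndWait(),
    as the object returned by pyttsx3.init() does.
    Returns the path of the file; the caller should delete it when done.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # the engine writes to the path itself
    engine.save_to_file(text, path)
    engine.runAndWait()  # blocks until the queued command has finished
    return path


# Assumed usage inside the Flask route (not run here):
#   engine = pyttsx3.init()
#   path = synthesize_to_file(engine, request.form['speech'])
#   return send_file(path, mimetype='audio/wav')
```

`send_file` comes from Flask; the template would then need an audio element or a download link to play the returned file, which is outside the scope of this sketch.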

## HTML Template Code



<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    <title>Browser voices</title>
    <style type="text/css">
                            * {
                      box-sizing: border-box;
                    }
                    html,
                    body {
                      min-height: 100vh;
                      margin: 0;
                      padding: 0;
                    }
                    body {
                      font-family: Helvetica, Arial, sans-serif;
                      color: #0d122b;
                      display: flex;
                      flex-direction: column;
                      padding-left: 1em;
                      padding-right: 1em;
                    }
                    h1 {
                      text-align: center;
                      font-weight: 100;
                    }
                    header {
                      border-bottom: 1px solid #0d122b;
                      margin-bottom: 2em;
                    }
                    main {
                      flex-grow: 2;
                      display: flex;
                      justify-content: space-around;
                      align-items: center;
                      background-color: #fff;
                      border-radius: 12px;
                      margin-bottom: 2em;
                    }
                    @keyframes bg-pulse {
                      0% {
                        background-color: #fff;
                      }

                      50% {
                        background-color: #c7ecee;
                      }

                      100% {
                        background-color: #fff;
                      }
                    }
                    main.speaking {
                      animation: bg-pulse 1.5s alternate ease-in-out infinite;
                    }
                    .input {
                      text-align: center;
                      width: 100%;
                    }
                    label {
                      display: block;
                      font-size: 18px;
                      margin-bottom: 1em;
                    }
                    .field {
                      margin-bottom: 2em;
                    }
                    input {
                      font-size: 24px;
                      padding: 0.5em;
                      border: 1px solid rgba(13, 18, 43, 0.25);
                      border-radius: 6px;
                      width: 75%;
                      display: inline-block;
                      transition: border-color 0.25s;
                      text-align: center;
                    }
                    input:focus,
                    select:focus {
                      border-color: rgba(13, 18, 43, 1);
                    }
                    select {
                      width: 75%;
                      font-size: 24px;
                      padding: 0.5em;
                      border: 1px solid rgba(13, 18, 43, 0.25);
                      border-radius: 6px;
                      transition: border-color 0.25s;
                    }
                    button {
                      font-size: 18px;
                      font-weight: 200;
                      padding: 1em;
                      width: 200px;
                      background: transparent;
                      border: 4px solid #f22f46;
                      border-radius: 4px;
                      transition: all 0.4s ease 0s;
                      cursor: pointer;
                      color: #f22f46;
                      margin-bottom: 2em;
                    }
                    button:hover,
                    button:focus {
                      background: #f22f46;
                      color: #fff;
                    }

                    a {
                      color: #0d122b;
                    }
                    .error {
                      color: #f22f46;
                      text-align: center;
                    }
                    footer {
                      border-top: 1px solid #0d122b;
                      text-align: center;
                    }
    </style>
  </head>
      <body>

          <header>
              <h1>Text to Speech</h1>
          </header>

        <main>
          <form class="input" id="voice-form" method="post">
                <div class="field">
                    <label for="speech">Type Something</label>
                    <input type="text" name="speech" id="speech" required />
                </div>
                <div class="field">
                    <label for="voices">Choose a Voice Assistant</label>
                      <select name="voices" id="voices">
                          <option value="Male">Male</option>
                          <option value="Female">Female</option>
                      </select>
                </div>
              <button>
                    Say it!
              </button>
          </form>
        </main>

        <footer>
            <p>
                Built by <a href=<BLOG LINK>><YOUR NAME></a>
            </p>
        </footer>

      </body>
</html>
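
The form posts two fields, `speech` and `voices`, which the Flask route reads from `request.form`. A minimal sketch of validating them on the server before calling `text_to_speech` might look like this. `parse_speech_form` and `ALLOWED_VOICES` are hypothetical names; the helper accepts any dict-like mapping, so it is not tied to Flask.

```python
ALLOWED_VOICES = ('Male', 'Female')  # the <option> values in the template


def parse_speech_form(form):
    """Return (text, gender) from submitted form data, or raise ValueError."""
    text = (form.get('speech') or '').strip()
    if not text:
        raise ValueError("the 'speech' field must not be empty")
    gender = form.get('voices')
    if gender not in ALLOWED_VOICES:
        raise ValueError("the 'voices' field must be 'Male' or 'Female'")
    return text, gender


# Example with a plain dict standing in for request.form
print(parse_speech_form({'speech': ' Hello ', 'voices': 'Female'}))
# prints ('Hello', 'Female')
```

In the route, a caught `ValueError` could be turned into an error message rendered in the template rather than an unhandled exception.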

There are several APIs available to convert text to speech in Python. gTTS can also be used for the same task: it is a Python library and CLI tool that interfaces with Google Translate's text-to-speech API.
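
As a rough sketch of the gTTS route, assuming the package is installed and the machine has internet access, the text can be written to an MP3 file as shown below. The wrapper `save_speech_mp3` and its `tts_class` parameter are hypothetical; the parameter exists only so the function can be tried with a stand-in class when gTTS is not available.

```python
def save_speech_mp3(text, path, lang='en', tts_class=None):
    """Write `text` as spoken audio to `path` using gTTS (needs network access)."""
    if tts_class is None:
        # Imported lazily so this module loads even without gTTS installed
        from gtts import gTTS
        tts_class = gTTS
    tts_class(text=text, lang=lang).save(path)
    return path


# Assumed usage (requires `pip install gTTS` and an internet connection):
#   save_speech_mp3('Hello ! My name is Siddhesh.', 'hello.mp3')
```

Unlike pyttsx3, this approach does not work offline, since the audio is produced by a remote service.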

I hope you found this post useful. It should help you build a web application with Flask that converts text to speech.


Top comments (4)

Vicky Kumar • Jun 4 '21

Audio is audible on localhost but not when deployed on herokuapp!

imvickykumar999.herokuapp.com/news

Niranjanadas M M • Jan 26 '22

When I deploy on Heroku, it shows a pywin32 and pyttsx3 error, like:
ERROR: Could not find a version that satisfies the requirement pywin32>=223 (from pypiwin32) (from versions: none)
ERROR: No matching distribution found for pywin32>=223

Can you help? How did you overcome this error? Thanks.

Abdullah Al Masum • Mar 19 '22

How did you solve the issue?

Niranjanadas M M • Mar 19 '22

Well, the Heroku deployment error was solved by launching on my Linux server and starting from a fresh virtual environment. But after the deployment, it shows the common error 'Internal Server Error', which is challenging for me. Any way out?