Noah Velasco

Posted on May 23, 2023 • Edited on Oct 12, 2024

Flutter + A.I. Text-To-Speech: A Simple Guide

#flutter #api #elevenlabs #dart

Please like and follow me on GitHub @noahvelasco!

GitHub Code

Overview

Introduction
Get ElevenLabs API Key
Flutter Project Configuration
Basic UI
API Key Setup
ElevenLabs API Code Call
Full Code
Possible Errors
Conclusion

Introduction

Hey there, fellow developers! We're going to dive into the exciting realm of text-to-speech (TTS) integration in Flutter. Multimedia experiences are key to engaging users, and TTS APIs have become our secret weapon. In this tutorial, I'll walk you through calling the ElevenLabs API to bring text-to-speech to a simple Flutter app.

Whether you're building an educational app, adding an accessibility feature, or simply enhancing your user experience, this guide will equip you with all the know-how to get started.

Get ElevenLabs API Key

First things first! Get your API key from your ElevenLabs profile and save it somewhere! Don't worry, it's free for 10,000 characters a month once you sign up. After you're done with this tutorial you're gonna want to pay them - it's REALLY good. Anyways, save the key - we will need it later!

Flutter Project Configuration

Create a new Flutter project and follow these steps. Do not skip them, since enabling certain rules and permissions is necessary to make TTS possible! Follow the steps below for your platform.

Android

Enable multidex support in the android/app/build.gradle file

defaultConfig {
   ...
   multiDexEnabled true
}

Enable Internet Connection on Android in android/app/src/main/AndroidManifest.xml

<uses-permission android:name="android.permission.INTERNET"/>

and update the application tag

<application ... android:usesCleartextTraffic="true">

iOS

Enable internet connection on iOS in ios/Runner/Info.plist

<dict>
....
<key>NSAppTransportSecurity</key>
<dict>
    <key>NSAllowsArbitraryLoads</key>
    <true/>
</dict>
...
</dict>

Basic UI

Let's code up a simple text field and a button. Once pressed, the button will call the ElevenLabs API and play the input text through the speaker. First, let's set up the front end before any API calls:

import 'package:flutter/material.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'TTS Demo',
      home: MyHomePage(),
    );
  }
}

class MyHomePage extends StatefulWidget {
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  TextEditingController _textFieldController = TextEditingController();

  @override
  void dispose() {
    _textFieldController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('EL TTS Demo'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(16.0),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.stretch,
          children: <Widget>[
            TextField(
              controller: _textFieldController,
              decoration: const InputDecoration(
                labelText: 'Enter some text',
              ),
            ),
            const SizedBox(height: 16.0),
            ElevatedButton(
              onPressed: () {
                //Eleven Labs API Call Here
              },
              child: const Icon(Icons.volume_up),
            ),
          ],
        ),
      ),
    );
  }
}

API Key Setup

Let's use the Flutter package flutter_dotenv to keep the API key out of the source code. We'll create a .env file, put our API key in it, and list the .env file in pubspec.yaml as the package instructions describe. Follow the steps below:

Add package to project

$ flutter pub add flutter_dotenv

Make the following changes

Create a .env file in the root directory
Add the ElevenLabs API key to the .env file as a line of the form EL_API_KEY=your_key_here
Add the .env file to the assets section of pubspec.yaml
Add import to code (as seen below)
Add the .env variable as a global (as seen below)
Update main method code (as seen below)

import 'package:flutter_dotenv/flutter_dotenv.dart';

String EL_API_KEY = dotenv.env['EL_API_KEY'] as String;

Future main() async {
  await dotenv.load(fileName: ".env");

  runApp(MyApp());
}
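The cast `as String` above throws a fairly opaque error if the key is missing from the .env file. As a minimal sketch of a friendlier lookup, here is a plain-Dart helper; `requireEnv` is a hypothetical name that is not part of flutter_dotenv, and the map in `main` stands in for `dotenv.env`.

```dart
// Hypothetical helper (not part of flutter_dotenv): look up a required key
// in an environment map and fail with a readable message if it is absent.
String requireEnv(Map<String, String> env, String name) {
  final value = env[name];
  if (value == null || value.trim().isEmpty) {
    throw StateError(
        'Missing "$name". Check the .env file and the pubspec.yaml assets.');
  }
  return value.trim();
}

void main() {
  // Stand-in for dotenv.env, which is also a Map<String, String>.
  final env = {'EL_API_KEY': ' abc123 '};
  print(requireEnv(env, 'EL_API_KEY')); // abc123

  try {
    requireEnv(env, 'OTHER_KEY');
  } on StateError catch (e) {
    print(e.message); // explains which key is missing
  }
}
```

In the app, something like `requireEnv(dotenv.env, 'EL_API_KEY')` could be called after `dotenv.load` has finished.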

ElevenLabs API Code Call

Now for the fun part! Since we are going to turn the text into speech using a REST API, we need a couple more packages. Follow the steps below:

Add packages to project

$ flutter pub add http
$ flutter pub add just_audio

Add the following imports

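import 'dart:convert'; // provides json.encode for the request body
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http; // the code below calls http.post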

Create an AudioPlayer object that will be responsible for playing the audio

final player = AudioPlayer(); //audio player obj that will play audio

To play the audio, we need to borrow a helper class from the just_audio package. Place the following outside main():

// Feed your own stream of bytes into the player
class MyCustomSource extends StreamAudioSource {
  final List<int> bytes;
  MyCustomSource(this.bytes);

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    start ??= 0;
    end ??= bytes.length;
    return StreamAudioResponse(
      sourceLength: bytes.length,
      contentLength: end - start,
      offset: start,
      stream: Stream.value(bytes.sublist(start, end)),
      contentType: 'audio/mpeg',
    );
  }
}
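To see what the `start` and `end` arguments do, here is a plain-Dart sketch of the same slicing logic with the audio classes left out. `sliceForRequest` is a hypothetical stand-in for the body of `request()`, and the short integer list stands in for real MP3 bytes.

```dart
// Plain-Dart stand-in for the slicing done inside MyCustomSource.request():
// a missing start means "from the beginning", a missing end means "to the end".
List<int> sliceForRequest(List<int> bytes, [int? start, int? end]) {
  start ??= 0;
  end ??= bytes.length;
  return bytes.sublist(start, end);
}

void main() {
  final audio = [10, 20, 30, 40, 50]; // pretend these are MP3 bytes
  print(sliceForRequest(audio)); // [10, 20, 30, 40, 50]
  print(sliceForRequest(audio, 2)); // [30, 40, 50]
  print(sliceForRequest(audio, 1, 3)); // [20, 30]
}
```

The player can therefore ask for the whole clip or just a range of it, and the helper class answers either request from the same in-memory byte list.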

Now we can add the REST API function 'playTextToSpeech' to the _MyHomePageState class. It sends 'text' to ElevenLabs, receives the generated audio back as bytes, and hands those bytes to our helper class 'MyCustomSource' so the player can play them.

  //For the Text To Speech
  Future<void> playTextToSpeech(String text) async {

    String voiceRachel =
        '21m00Tcm4TlvDq8ikWAM'; //Rachel voice - change if you know another Voice ID

    String url = 'https://api.elevenlabs.io/v1/text-to-speech/$voiceRachel';
    final response = await http.post(
      Uri.parse(url),
      headers: {
        'accept': 'audio/mpeg',
        'xi-api-key': EL_API_KEY,
        'Content-Type': 'application/json',
      },
      body: json.encode({
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {"stability": .15, "similarity_boost": .75}
      }),
    );

    if (response.statusCode == 200) {
      final bytes = response.bodyBytes; //get the bytes ElevenLabs sent back
      await player.setAudioSource(MyCustomSource(
          bytes)); //send the bytes to be read from the JustAudio library
      player.play(); //play the audio
    } else {
      // throw Exception('Failed to load audio');
      return;
    }
  }

If you want to tweak the way the voice sounds, you can modify the voice ID (here we are using Rachel, '21m00Tcm4TlvDq8ikWAM'), the stability, and the similarity boost. You can view the API docs to go more in depth.
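To make those knobs easier to experiment with, the request body can be built by a small function. This is only a sketch: `buildTtsBody` is a hypothetical helper, its defaults mirror the values used above, and the clamping assumes both settings are expected to lie between 0.0 and 1.0.

```dart
import 'dart:convert';

// Hypothetical helper: builds the JSON body sent by playTextToSpeech.
// Assumption: stability and similarity_boost are expected in the 0.0-1.0
// range, so out-of-range values are clamped rather than sent as-is.
String buildTtsBody(
  String text, {
  double stability = 0.15,
  double similarityBoost = 0.75,
  String modelId = 'eleven_monolingual_v1',
}) {
  return json.encode({
    'text': text,
    'model_id': modelId,
    'voice_settings': {
      'stability': stability.clamp(0.0, 1.0),
      'similarity_boost': similarityBoost.clamp(0.0, 1.0),
    },
  });
}

void main() {
  // Same text, two different stability values to compare by ear.
  print(buildTtsBody('Hello there', stability: 0.2));
  print(buildTtsBody('Hello there', stability: 0.8));
}
```

The returned string could be passed as the `body:` argument of `http.post` in place of the inline `json.encode` call.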

To make this more user friendly, we can show a linear progress indicator while the request is in progress.
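One thing to watch with a loading flag is that it should be switched off even when the request throws (for example, when there is no network). A minimal plain-Dart sketch of that idea follows; `withLoading` is a hypothetical helper, and the list in `main` stands in for a `setState` call.

```dart
// Hypothetical helper: runs [task] and reports the loading state through
// [setLoading], resetting it in `finally` so it also clears on failure.
Future<T> withLoading<T>(
  void Function(bool isLoading) setLoading,
  Future<T> Function() task,
) async {
  setLoading(true);
  try {
    return await task();
  } finally {
    setLoading(false);
  }
}

Future<void> main() async {
  final states = <bool>[]; // records every loading-state change
  try {
    await withLoading<void>(states.add, () async {
      throw Exception('network down'); // simulated failed request
    });
  } catch (e) {
    print('request failed: $e');
  }
  print(states); // [true, false]
}
```

In the widget, the first argument could be a closure that calls `setState` to update `_isLoadingVoice`, and the second the `http.post` call.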

Full Code

import 'package:flutter/material.dart';
import 'dart:convert';

import 'package:flutter_dotenv/flutter_dotenv.dart';
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http;

String EL_API_KEY = dotenv.env['EL_API_KEY'] as String;

Future main() async {
  await dotenv.load(fileName: ".env");

  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'TTS Demo',
      home: MyHomePage(),
    );
  }
}

class MyHomePage extends StatefulWidget {
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  TextEditingController _textFieldController = TextEditingController();
  final player = AudioPlayer(); //audio player obj that will play audio
  bool _isLoadingVoice = false; //for the progress indicator

  @override
  void dispose() {
    _textFieldController.dispose();
    player.dispose();
    super.dispose();
  }

  //For the Text To Speech
  Future<void> playTextToSpeech(String text) async {
    //display the loading icon while we wait for request
    setState(() {
      _isLoadingVoice = true; //progress indicator turn on now
    });

    String voiceRachel =
        '21m00Tcm4TlvDq8ikWAM'; //Rachel voice - change if you know another Voice ID

    String url = 'https://api.elevenlabs.io/v1/text-to-speech/$voiceRachel';
    final response = await http.post(
      Uri.parse(url),
      headers: {
        'accept': 'audio/mpeg',
        'xi-api-key': EL_API_KEY,
        'Content-Type': 'application/json',
      },
      body: json.encode({
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {"stability": .15, "similarity_boost": .75}
      }),
    );

    setState(() {
      _isLoadingVoice = false; //progress indicator turn off now
    });

    if (response.statusCode == 200) {
      final bytes = response.bodyBytes; //get the bytes ElevenLabs sent back
      await player.setAudioSource(MyCustomSource(
          bytes)); //send the bytes to be read from the JustAudio library
      player.play(); //play the audio
    } else {
      // throw Exception('Failed to load audio');
      return;
    }
  } //getResponse from Eleven Labs

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('EL TTS Demo'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(16.0),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.stretch,
          children: <Widget>[
            TextField(
              controller: _textFieldController,
              decoration: const InputDecoration(
                labelText: 'Enter some text',
              ),
            ),
            const SizedBox(height: 16.0),
            ElevatedButton(
              onPressed: () {
                playTextToSpeech(_textFieldController.text);
              },
              child: _isLoadingVoice
                  ? const LinearProgressIndicator()
                  : const Icon(Icons.volume_up),
            ),
          ],
        ),
      ),
    );
  }
}

// Feed your own stream of bytes into the player
class MyCustomSource extends StreamAudioSource {
  final List<int> bytes;
  MyCustomSource(this.bytes);

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    start ??= 0;
    end ??= bytes.length;
    return StreamAudioResponse(
      sourceLength: bytes.length,
      contentLength: end - start,
      offset: start,
      stream: Stream.value(bytes.sublist(start, end)),
      contentType: 'audio/mpeg',
    );
  }
}

Possible Errors

If the request fails or no audio plays, recheck the settings in the 'Flutter Project Configuration' section above.
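Since `playTextToSpeech` returns silently on any non-200 response, it can also help to log what went wrong. The sketch below maps a status code to a hint; `describeTtsFailure` is a hypothetical helper, the meanings follow general HTTP semantics, and the suggested causes are assumptions rather than documented ElevenLabs behavior.

```dart
// Hypothetical helper: turns an HTTP status code into a debugging hint.
// The hints are assumptions based on standard HTTP status meanings.
String describeTtsFailure(int statusCode) {
  switch (statusCode) {
    case 401:
      return 'Unauthorized: check the xi-api-key value loaded from .env.';
    case 422:
      return 'Unprocessable request: check the JSON body and the voice ID.';
    case 429:
      return 'Too many requests: slow down or check your character quota.';
    default:
      if (statusCode >= 500) {
        return 'Server error ($statusCode): try again later.';
      }
      return 'Request failed with status $statusCode.';
  }
}

void main() {
  for (final code in [401, 429, 503, 404]) {
    print(describeTtsFailure(code));
  }
}
```

Calling something like `debugPrint(describeTtsFailure(response.statusCode))` in the `else` branch would surface the hint during development.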

Conclusion

Hey, you did it! You've now got the superpower to integrate text-to-speech into your Flutter apps like a pro. By adding this awesome feature, you're taking your users' experience to a whole new level, making your app more accessible and engaging. Don't forget to keep exploring the endless possibilities offered by your chosen API and have fun experimenting with different customization options.