DEV Community

Cross browser speech synthesis - the hard way and the easy way

Jan Küster on December 07, 2021

When I implemented my first speech-synthesis app using the Web Speech API I was shocked how hard it was to set up and execute it with cross-browser ...
 
Balthasar Huber

Hi Jan,
thanks for this post and for creating easy-speech. I am really glad I found this article before diving deep into the Web Speech API.

On iOS, I noticed that only the following German voices are available (see this post).
However, all of them are completely broken and unusable for German, even though they are tagged with "de". I saw that you are German, so I was wondering whether you managed to make a German voice work on iOS. On macOS, Anna DE seems to work fairly well.
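
For anyone who wants to check which voices a device actually reports for a language, here is a minimal sketch built on the standard `speechSynthesis.getVoices()` call. The helper names `normalizeLang` and `voicesForLanguage` are made up for illustration and are not part of the Web Speech API or easy-speech. The underscore handling assumes that some platforms report tags like `de_DE` instead of `de-DE`.

```javascript
// Hypothetical helper: normalise a language tag so "de_DE" and "de-DE" compare equal.
function normalizeLang(tag) {
  return String(tag || '').replace(/_/g, '-').toLowerCase();
}

// Hypothetical helper: return all voices whose primary language subtag matches `lang`.
function voicesForLanguage(voices, lang) {
  const wanted = normalizeLang(lang).split('-')[0];
  return voices.filter(
    (voice) => normalizeLang(voice.lang).split('-')[0] === wanted
  );
}

// Browser-only part, guarded so the helpers can also be loaded elsewhere.
// Note: getVoices() can return an empty list until the voices have loaded.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  const german = voicesForLanguage(window.speechSynthesis.getVoices(), 'de');
  console.table(
    german.map(({ name, lang, localService }) => ({ name, lang, localService }))
  );
}
```

This only shows what the system exposes. It cannot tell whether a voice tagged "de" actually sounds right, so listening to each candidate is still necessary.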

Also, I created typings for easy-speech, so if you are interested I could make a PR.

Best,
Balthasar

Jan Küster

Hi Balthasar, sorry for the late response, smh I get no notifications using the dev.to app...

Basically, the voices are a system-level issue that we can't actually solve with JavaScript. Some browsers provide additional (remote) voices, such as Chrome with the Google voices. macOS and iOS sometimes differ even between minor updates, and you have to install additional voices at the OS level to get a decent TTS experience.
I have no iOS device available currently so I can't really tell which voice we used last time.
To compare things:
On most Linux distros you will find that only Chrome has decent German voices, basically because it uses the remote Google voices.
I really hope the vendors will improve on all this in the near future.
My own mid-term goal is to provide my own speech service voices as a fallback, but I found that I can't implement the SpeechSynthesisVoice interface and simply load a custom voice from my server.
However, any improvement from my end will directly be added to EasySpeech.
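
To illustrate the local vs. remote distinction above: each `SpeechSynthesisVoice` exposes a `localService` flag, so a page can at least choose between on-device and remote voices when both exist. The following is a rough sketch and not how EasySpeech does it; `primaryLang` and `pickVoice` are hypothetical names, and the `preferRemote` option is an assumption made for this example.

```javascript
// Hypothetical helper: primary subtag of a language tag ("de_DE" -> "de").
function primaryLang(tag) {
  return String(tag || '').replace(/_/g, '-').toLowerCase().split('-')[0];
}

// Hypothetical helper: pick one voice for `lang`.
// On-device voices are preferred unless `preferRemote` is set;
// the browser's default voice wins ties.
function pickVoice(voices, lang, { preferRemote = false } = {}) {
  const wanted = primaryLang(lang);
  const candidates = voices.filter((voice) => primaryLang(voice.lang) === wanted);
  if (candidates.length === 0) return null;
  const score = (voice) =>
    (voice.localService === !preferRemote ? 2 : 0) + (voice.default ? 1 : 0);
  return [...candidates].sort((a, b) => score(b) - score(a))[0];
}

// Browser-only usage. Some browsers only speak after a user interaction,
// so in practice this would run inside a click handler.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  const voice = pickVoice(window.speechSynthesis.getVoices(), 'de');
  if (voice) {
    const utterance = new SpeechSynthesisUtterance('Guten Tag');
    utterance.voice = voice;
    utterance.lang = voice.lang;
    window.speechSynthesis.speak(utterance);
  }
}
```

If no voice matches, `pickVoice` returns `null`, which leaves the caller free to show a hint about installing voices at the OS level.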

Regarding typings, I'm curious about your PR! Would this be a types.ts file or a TS rewrite? I have only surface-level knowledge of TypeScript, so I can't really review a full rewrite. Let's continue on GitHub regarding the TS integration.

Thanks and all the best

Balthasar Huber

No worries!
Thanks for your insights, really interesting. "Remote Google voices" sounds great tbh; maybe it isn't a thing on iOS, as Chrome and all other browsers on iOS are WebKit-based and not Chromium-based, but that's just a guess. Too bad that there isn't much we can do to improve it right now.

I created types for the exported functions and classes in a separate typings file. I will create a PR in the coming days and we can discuss further on GitHub :-).

Jan Küster

Great! Looking forward to your PR!

Balthasar Huber • Edited

Addition:
If I try this demo and this demo in Chrome on iOS, the German voices seem pretty broken.
However, in this demo they at least seem to work, even though they don't sound great. I don't know how or why this is possible.

DariBerrie

I know you posted this a while ago, but I ran into the same issue while trying to create a more accessible web app using SpeechSynthesis to help confirm certain actions.

It was a pain realizing that the voices provided in Chrome were not the same as those in Safari (let alone any other browser). After reading through the issues you faced, it was a nice surprise to see your EasySpeech package at the end! Thank you! 🚀

Jan Küster

Thank you! If you include EasySpeech in your app and encounter any issues, don't hesitate to open an issue on its GitHub page or reach out to me here on dev.to. All the best

Abel Misiocha

Great project!

SuNcO • Edited

Using your code or the API directly, the iOS voices for es-MX are very ugly 😕

Jan Küster

Thanks, the quality of the voices is unfortunately vendor-specific. You may need to install voices at the system level in order to get decent output.

Vivek • Edited

Thanks for this article!

I have been working on my own text-to-speech app since 2020. I just published v5 of the app, which I call Listen. I had already implemented many of the workarounds discussed here, but I also learnt several that I was not aware of!

I like to think Listen is world-class now, but try it out and let me know if it really is! Source code here.