Introduction
As developers, we often come across many open-source projects on GitHub, where README.md is one of the first files we see. It is the simplest way to understand what the project is about, how to use it, and other related information (a kind of documentation).
Here are some well-known facts about READMEs, from GitHub:
A README is often the first item a visitor will see when visiting your repository. README files typically include information on:
- What the project does
- Why the project is useful
- How users can get started with the project
- Where users can get help with your project
- Who maintains and contributes to the project
Moreover, having a good README helps attract many contributors. However, READMEs have always been challenging for low-vision or visually impaired developers and contributors, since most of the content is plain text, which is difficult for them to read and understand. So I developed a very simple tool called "ReadmeAloud" that can convert the raw text of any public GitHub README file to speech and also provides a way to download it as an MP3 file.
Some of the Existing Solutions
There are many solutions already on the market that make it easy to convert text into speech. Some of them are:
- Microsoft Edge browser has an inbuilt feature called Read Aloud:
Read aloud highlights each word on the webpage as it's being read. To stop listening, select the Pause button or the X to close Read aloud.
- Google Chrome extension: Read Aloud: A Text to Speech Voice Reader
- Word for Microsoft 365: Read Aloud
So you are saying it already exists; what is so special about "ReadmeAloud"?
Well, I agree with you that most of these tools are great and helpful for everyone. However, I felt a couple of things were missing, such as the ability to download the converted speech as a file (say, an MP3), and in most cases the existing tools either require a paid license or an internet connection. That's why I came up with my own little tool for low-vision developers, focused on the place where they need it most: the GitHub README file.
Architecture
Here I have focused on the use case rather than on the architecture or technology; that is the reason you see a very simple architecture (a rough startup sketch for the Web App piece follows the list):
- Azure Front Door
- Azure Web App
- Azure Text-to-Speech Cognitive API
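To give a rough idea of how the Web App piece is wired up, here is a minimal, hypothetical sketch of a Blazor Server startup that registers a SpeechService like the one shown later in the code walkthrough. The actual ReadmeAloud startup code may look different; this is only a sketch under those assumptions.

```csharp
// Hypothetical Program.cs sketch (assumption: .NET 6 minimal hosting; the real
// ReadmeAloud startup may differ). It registers the Blazor Server pieces and a
// SpeechService that reads CognitiveAPIKey/CognitiveAPIRegion from configuration.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRazorPages();
builder.Services.AddServerSideBlazor();

// SpeechService (shown in the code walkthrough below) wraps the Speech SDK calls.
builder.Services.AddScoped<SpeechService>();

var app = builder.Build();

app.UseStaticFiles();
app.UseRouting();
app.MapBlazorHub();
app.MapFallbackToPage("/_Host");

app.Run();
```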
Source Code:
This is an open-source project.
jayendranarumugam / ReadmeAloud
"ReadmeAloud" is a simple tool that can convert the raw text from any public GitHub README file to speech and also provides a way to download it as an MP3 file.
Workflow
The user provides a valid GitHub raw README.md URL, e.g., https://raw.githubusercontent.com/jayendranarumugam/DemoSecrets/master/README.md, on the site served through Azure Front Door.
Once the user clicks the Search button, Azure Front Door routes the traffic securely to the Azure Web App (Blazor), which converts the text behind the URL to speech using the Azure Cognitive Services Speech API. Once the speech is synthesized successfully, the audio bytes are used to play the audio in the browser and are also offered as an MP3 download.
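As a rough illustration of the first step, the sketch below shows one way the raw README text could be fetched before it is handed to the speech service. The class and method names here are made up for the example and are not necessarily what ReadmeAloud uses.

```csharp
// Illustrative sketch only: fetch the plain Markdown behind a
// raw.githubusercontent.com URL so it can be passed to the speech service.
// The ReadmeFetcher/GetRawReadmeAsync names are assumptions for this example.
using System.Net.Http;
using System.Threading.Tasks;

public static class ReadmeFetcher
{
    private static readonly HttpClient Http = new HttpClient();

    public static async Task<string> GetRawReadmeAsync(string rawUrl)
    {
        // e.g. https://raw.githubusercontent.com/jayendranarumugam/DemoSecrets/master/README.md
        return await Http.GetStringAsync(rawUrl);
    }
}
```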
Code Walkthrough
This is my first time coding a server-side Blazor app. I mostly reused the default Blazor boilerplate code, modifying and adding some parts for my project, e.g., SpeechService. I hope I have learned something about Blazor now. I have also listed some great GitHub repos in the References section below that helped me understand Blazor.
The entire logic of converting the text into speech is done with the Azure Cognitive Services Speech SDK, which you can find in SpeechService.cs:
```csharp
public async Task<byte[]> SynthesizeAudioAsync(string text)
{
    // Build the speech configuration from the key/region stored in app configuration.
    SpeechConfig speechConfigForAudioAsync = SpeechConfig.FromSubscription(Configuration["CognitiveAPIKey"], Configuration["CognitiveAPIRegion"]);
    // Request MP3 output so the bytes can be played and downloaded directly.
    speechConfigForAudioAsync.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3);
    // A null AudioConfig keeps the audio in memory instead of playing it on the server.
    using (var synthesizer = new SpeechSynthesizer(speechConfigForAudioAsync, null as AudioConfig))
    {
        using (var result = await synthesizer.SpeakTextAsync(text))
        {
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                return result.AudioData;
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                // The cancellation details explain why synthesis failed (e.g. invalid key or region).
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            }
            return null;
        }
    }
}
```
The `result.AudioData` property is the converted audio as a byte array (`byte[]`), which we use to play the audio in the browser and to download it as an MP3 file via JavaScript functions like `downloadFromByteArray`, `playAudio`, and `stopAudio`, which you can find in `helper.js`.
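To show how these pieces could come together in a Blazor page, here is a hedged, code-behind style sketch. The component and method names are assumptions, and the exact arguments that the helper.js functions expect may differ in the real project.

```csharp
// Illustrative sketch (not the actual ReadmeAloud page code): call the
// SpeechService and hand the resulting byte[] to the helper.js functions
// via JS interop. How the byte[] is marshalled to JavaScript depends on the
// Blazor version, so helper.js must handle the format it receives.
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components;
using Microsoft.JSInterop;

public class ReadmeAloudPage : ComponentBase
{
    [Inject] private SpeechService Speech { get; set; }
    [Inject] private IJSRuntime JS { get; set; }

    private async Task ConvertAsync(string readmeText)
    {
        byte[] audio = await Speech.SynthesizeAudioAsync(readmeText);
        if (audio == null)
        {
            return; // synthesis was cancelled or failed
        }

        // playAudio / downloadFromByteArray are defined in helper.js;
        // the argument list here is an assumption for the example.
        await JS.InvokeVoidAsync("playAudio", audio);
        await JS.InvokeVoidAsync("downloadFromByteArray", audio, "readme.mp3");
    }
}
```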
Demo
Improvements:
ReadmeAloud is more like a small prototype on top of which many features can easily be added. That being said, there are some limitations in the current design that we can improve in the future. Some of them are:
- The architecture is currently tightly coupled to the server (Blazor). We can make it more scalable by introducing a separate layer for the cognitive calls, i.e., an Azure Function (see the sketch after this list).
- Currently only public GitHub repos are supported. We can extend this to private repos as well by adding some additional authentication.
- English is the default language for both the text and the speech conversion. We can improve that easily, since Azure Cognitive Services supports a very wide variety of languages.
- I'm not a front-end guy. So feel free to contribute to ReadmeAloud with your creative ideas or UI improvements.
- If the README file is too long, the speech conversion takes more time and sometimes even times out. Currently the tool is suited for small README files. We can improve that by changing the architecture as discussed above.
- We can also improve the design with more accessibility for people with low vision, for example by providing a voice-search input for the GitHub repo details.
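For the first improvement, a rough, hypothetical sketch of that separate layer is shown below: an HTTP-triggered Azure Function that performs the synthesis and returns the MP3 bytes, so the Blazor front end no longer needs a direct Speech SDK dependency. The function name, route, and settings here are assumptions, not part of the current ReadmeAloud code; only the Speech SDK calls mirror SpeechService.cs above.

```csharp
// Hypothetical sketch of the decoupled synthesis layer (Azure Functions,
// in-process model). Everything here is an assumption except the Speech SDK
// calls, which mirror SpeechService.cs above.
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class SynthesizeFunction
{
    [FunctionName("Synthesize")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req)
    {
        // The request body carries the plain README text to convert.
        string text = await new StreamReader(req.Body).ReadToEndAsync();

        var config = SpeechConfig.FromSubscription(
            Environment.GetEnvironmentVariable("CognitiveAPIKey"),
            Environment.GetEnvironmentVariable("CognitiveAPIRegion"));
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3);

        using var synthesizer = new SpeechSynthesizer(config, null as AudioConfig);
        using var result = await synthesizer.SpeakTextAsync(text);

        return result.Reason == ResultReason.SynthesizingAudioCompleted
            ? new FileContentResult(result.AudioData, "audio/mpeg") { FileDownloadName = "readme.mp3" }
            : (IActionResult)new BadRequestObjectResult("Speech synthesis was cancelled or failed.");
    }
}
```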
Conclusion:
The whole idea of ReadmeAloud is to provide an easy and efficient way for everyone, especially visually impaired or sight-impaired friends, to understand open-source projects. Though I showcased this with a GitHub README URL, any valid URL that serves plain-text content can be used. This is just a small idea and I hope it reaches its own audience.
References:
- gpeipman / BlazorDemo — Demo application for my writings about Blazor