Serverspace.us

AI Telegram bot for COVID-19 voice recognition

A CIS student is developing a COVID-19 recognition system based on the analysis of voice recordings. The program uses acoustic features such as loudness and pitch range as its main predictors. The reported prediction accuracy is 91.7%.

About the project

The idea behind COVID-19 voice recognition here is to apply several classical machine learning algorithms and pick the best one. The candidates were k-NN, logistic regression, random forest, and a decision tree. All models were trained, and their accuracy was checked on validation data using k-fold cross-validation. The decision tree gave the best accuracy and was selected as the final model, with its best-performing parameters.
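To make the comparison procedure concrete, here is a minimal sketch of k-fold cross-validation in plain Python. It is not the student's code: the toy data is synthetic, and the two candidate models (a majority-class baseline and a 1-nearest-neighbour classifier) are stand-ins chosen only to keep the example short.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean validation accuracy of a model over k folds.

    `fit(X_train, y_train)` must return a function that predicts one label.
    """
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k


def fit_majority(X_train, y_train):
    """Baseline: always predict the most frequent training label."""
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


def fit_1nn(X_train, y_train):
    """1-nearest-neighbour classifier using squared Euclidean distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X_train]
        return y_train[dists.index(min(dists))]
    return predict


# Synthetic toy data (NOT the project's dataset): two well-separated clusters.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] + \
    [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

# Score every candidate with the same folds and keep the best one.
candidates = {"majority baseline": fit_majority, "1-NN": fit_1nn}
scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

In practice, scikit-learn's `cross_val_score` runs the same loop for its estimators, so a real comparison of k-NN, logistic regression, random forest, and decision tree would more likely be built on that.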

To get these results, the student took a dataset of audio recordings and extracted acoustic features from it. He then kept only the most significant features that were not correlated with each other, and trained the model on those.
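One common way to drop correlated features is a greedy Pearson-correlation filter. The sketch below illustrates that idea only; the post does not say which method was actually used, and the feature names, values, and the 0.9 threshold are all made up for the example.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, one value per recording.
features = {
    "pitch_range": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pitch_range_scaled": [2.1, 4.0, 6.2, 7.9, 10.0],  # near-duplicate of pitch_range
    "loudness_slope": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(features)
print(selected)
```

The filter is order-dependent: the first feature of a correlated pair survives, so features are usually sorted by importance before it runs.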

The features that matter most for this project were identified through data analysis. They describe aspects of the sound signal: pitch, loudness, spectral slope, and other spectral characteristics. The system is trained to predict COVID-19 from these signals.

Traits

The system takes these attributes and, like any decision tree, asks a series of questions about them to classify a user's voice recording as healthy or infected with COVID-19. Here are some of the traits that are used:

  • F0semitoneFrom27.5Hz_sma3nz_pctlrange0-2: the range of pitch in semitones between the 20th and 80th percentiles (the suffix 0-2 indexes the first and third of the 20/50/80 percentile set, not a percentage).
  • loudness_sma3_meanFallingSlope: the mean slope of the falling parts of the loudness contour, i.e. how quickly loudness drops over time.
  • mfcc1V_sma3nz_stddevNorm: the normalized standard deviation of the first mel-frequency cepstral coefficient (MFCC) in voiced segments, i.e. how much it varies over time.
  • F0semitoneFrom27.5Hz_sma3nz_percentile80.0: the 80th percentile of pitch in semitones.
  • SlopeV0-500_sma3nz_amean: the average spectral slope in the frequency range from 0 to 500 Hz in voiced segments.
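The statistics behind these feature names can be sketched in a few lines of Python. This is only an illustration: the pitch values are made up, the percentile interpolation and the use of the population standard deviation are assumptions rather than the exact rules of the feature extractor, and the sketch reads pctlrange0-2 as the 20th-to-80th percentile range.

```python
import math
import statistics


def hz_to_semitones(f_hz, ref_hz=27.5):
    """Convert a frequency in Hz to semitones above the 27.5 Hz reference."""
    return 12 * math.log2(f_hz / ref_hz)


def percentile(values, p):
    """p-th percentile with linear interpolation between sorted neighbours."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def stddev_norm(values):
    """Standard deviation divided by the mean (coefficient of variation)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Made-up pitch contour in Hz for the voiced frames of one recording.
f0_hz = [110.0, 123.5, 130.8, 146.8, 164.8, 174.6, 196.0, 220.0]
f0_st = [hz_to_semitones(f) for f in f0_hz]

p20 = percentile(f0_st, 20)
p80 = percentile(f0_st, 80)       # analogue of ..._percentile80.0
pctl_range = p80 - p20            # analogue of ..._pctlrange0-2
variation = stddev_norm(f0_st)    # same statistic as ..._stddevNorm, applied to pitch here
print(round(p80, 2), round(pctl_range, 2), round(variation, 3))
```

The semitone scale is logarithmic, so doubling the frequency always adds 12 semitones regardless of the speaker's base pitch.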

Further plans

Currently, the system is being evaluated only on the training and test splits of the dataset. The creator plans to conduct additional research and test the system on real people.

The student continues to develop the project and aims to make it available to the public. As a practical application of the technology, he plans to launch a Telegram bot hosted on Serverspace's cloud servers. Users will be able to send voice recordings to the bot and receive a prediction of 0 or 1, where 0 means that the person is healthy and 1 indicates possible COVID-19.
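The bot's request flow could look roughly like the sketch below. The bot has not been released, so everything here is hypothetical: `handle_update`, `download_file`, `predict`, and the reply wording are invented for illustration. The only thing taken from Telegram itself is the shape of the update, where a voice or audio attachment carries a `file_id`.

```python
# Hypothetical reply texts for the two labels described in the post.
REPLIES = {
    0: "0 (healthy)",
    1: "1 (possible COVID-19)",
}


def handle_update(update, download_file, predict):
    """Return the reply text for one Telegram update, or None if it has no audio.

    `download_file(file_id)` should return the raw audio bytes and
    `predict(audio_bytes)` should return 0 or 1. Both are supplied by the
    caller, so this function has no network or model dependencies.
    """
    message = update.get("message") or {}
    attachment = message.get("voice") or message.get("audio")
    if attachment is None:
        return None  # nothing to classify, e.g. a plain text message
    audio_bytes = download_file(attachment["file_id"])
    label = int(predict(audio_bytes))
    if label not in REPLIES:
        raise ValueError(f"unexpected model output: {label!r}")
    return REPLIES[label]


# Usage with stand-in callables instead of a real download and a real model.
update = {"message": {"voice": {"file_id": "example-file-id"}}}
reply = handle_update(update, lambda file_id: b"fake-audio", lambda audio: 1)
print(reply)
```

Passing the download and prediction steps in as functions keeps the handler easy to test without a bot token or a trained model.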

If you would like more details on how the project works, which features are used, and what inspired the author, we could write a follow-up article, "COVID-19 Recognition System by Voice Signs".

✌️Give us a sign if you like this type of content!
