Miguel A. Calles


Present like a boss with transcription and analytics based on Deepgram

Introduction

I learned about Deepgram from the hackathon on DEV; I had not heard of them before. After reviewing their docs, Deepgram seemed to make speech-to-text much easier to adopt than some other services. Other services seem complex to build on, and building even simple use cases with them would take a long time. Deepgram looks like it could shorten development of simple use cases.

My Deepgram Use-Case

Have you ever heard someone give a talk or a presentation and been impressed with their ability to present? You may have noticed they did not use filler words like "umm", "so", "and", etc.

Think about the last time you heard a presentation and the speaker used these filler words frequently. Did you find the presentation hard to follow? Did the filler words seem to get louder and louder?

Organizations like Toastmasters count how many filler words people use in their presentations. What if you could practice a presentation with an app and it would give you stats on filler words?

What if the app could give you a score on sentiment? What if it could tell you how well your presentation would be received?

Dive into Details

The app would use Deepgram's speech-to-text. Either the prerecorded or the live streaming option would work.

Initially, the app would count filler words. The app would give statistics per presentation and over time. Each time the number of filler words drops by another 10%, the presenter would get a badge. The presenter would get a special badge when they present without any filler words.
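To make the counting and badge idea concrete, here is a minimal sketch in Python. It assumes the transcript is already available as plain text. The filler list, the function names, and the reading of the badge rule (one badge per full 10% drop relative to a baseline presentation) are my own assumptions for illustration, not anything defined by Deepgram.

```python
import re
from collections import Counter

# Hypothetical filler list; "umm", "so", and "and" come from the examples above.
FILLER_WORDS = {"um", "umm", "uh", "so", "and", "like"}


def tokenize(transcript: str) -> list[str]:
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def count_fillers(transcript: str) -> Counter:
    """Count how often each filler word appears in the transcript."""
    return Counter(w for w in tokenize(transcript) if w in FILLER_WORDS)


def filler_rate(transcript: str) -> float:
    """Filler words as a fraction of all words (0.0 for an empty transcript)."""
    words = tokenize(transcript)
    if not words:
        return 0.0
    return sum(count_fillers(transcript).values()) / len(words)


def badges_earned(baseline_count: int, current_count: int) -> int:
    """One badge per full 10% drop in filler words relative to the baseline.

    Integer arithmetic avoids floating-point rounding at the 10% boundaries.
    """
    if baseline_count <= 0:
        return 0
    return max(0, (baseline_count - current_count) * 10 // baseline_count)


def earns_special_badge(current_count: int) -> bool:
    """The special badge is for a presentation with no filler words at all."""
    return current_count == 0
```

A word like "so" or "and" is not always a filler, so a real app would probably need context (for example, whether the word starts a sentence or is followed by a pause) before counting it.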

Based on this Deepgram tutorial, the app could provide statistics on total pause time per presentation and even pauses between words. The app could suggest optimal pause lengths and frequency for a good presentation.
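As a rough sketch of the pause statistics, the code below assumes the transcript comes back as a list of words, each with a start and end time in seconds. The `Word` class, the `pause_stats` function, and the 0.3 second threshold for what counts as a pause are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its timing in seconds (assumed shape)."""
    text: str
    start: float
    end: float


def pause_stats(words: list[Word], min_pause: float = 0.3) -> dict:
    """Summarize the silent gaps between consecutive words.

    A gap only counts as a pause if it lasts at least `min_pause` seconds.
    """
    gaps = [nxt.start - cur.end for cur, nxt in zip(words, words[1:])]
    pauses = [gap for gap in gaps if gap >= min_pause]
    return {
        "pause_count": len(pauses),
        "total_pause_seconds": sum(pauses),
        "longest_pause_seconds": max(pauses, default=0.0),
    }


words = [
    Word("hello", 0.0, 0.5),
    Word("everyone", 1.0, 1.5),
    Word("today", 1.6, 2.0),
    Word("we", 3.0, 3.25),
]
stats = pause_stats(words)
```

The threshold matters: set it too low and normal gaps between words look like pauses; set it too high and deliberate short pauses get missed.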

Another feature would be to give statistics similar to reading level. Once the presentation is in text form, the text could be analyzed for reading level. The app would provide the average reading level along with the level of each sentence. The assumption is that the reading level would indicate what type of audience would be able to understand the presentation. A high reading level would indicate that the presentation is meant for experts, while a lower reading level would indicate a general audience. A presentation with the lowest reading level might be most appropriate for children.
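One way to estimate reading level is the Flesch-Kincaid grade level formula, which combines average sentence length and average syllables per word. The sketch below is only an illustration: the syllable counter is a crude vowel-group heuristic, the sentence splitter is naive, and all function names are my own.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split on ., !, and ? and drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level; returns 0.0 when there is no text."""
    sentences = split_sentences(text)
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


def grade_per_sentence(text: str) -> list[tuple[str, float]]:
    """Pair each sentence with its own grade level."""
    return [(s, flesch_kincaid_grade(s)) for s in split_sentences(text)]
```

Short sentences of one-syllable words can score below zero, so the app would likely clamp or bucket the raw number before showing it to a presenter.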

The app's analytics would evolve based on presentation best practices and user feedback.

Conclusion

Deepgram's speech-to-text API seems to make transcribing and analyzing speech much easier than other platforms do. Its capabilities can be used to help individuals improve their presentation and speaking skills by providing meaningful statistics and recommendations.


Photo by Teemu Paananen on Unsplash
