Speech to text

#devchallenge #assemblyaichallenge #ai #api

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

I built a web app that streamlines audio-to-text transcription. Users submit an audio URL, and the app transcribes it into text. The transcript is displayed alongside the original audio, so users can follow along word by word as the audio plays. Clicking any word in the transcript moves playback to the point in the audio where that word is spoken. The app also shows sentiment analysis, content safety labels that flag potentially sensitive or inappropriate material, and the confidence level of each transcribed word.
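To illustrate how word-level follow-along and click-to-seek can work, here is a minimal Python sketch, not the app's actual code. It assumes each word is a dict with `text`, `start`, and `end`, with times in milliseconds, which matches the shape of the word entries AssemblyAI returns. The function names and the sample data are hypothetical.

```python
from bisect import bisect_right


def word_index_at(words, position_ms):
    """Return the index of the word to highlight at position_ms.

    Picks the last word whose start time is <= position_ms, so the
    highlight stays on the previous word during short pauses.
    Returns None if playback has not reached the first word yet.
    """
    starts = [w["start"] for w in words]  # assumed sorted by start time
    i = bisect_right(starts, position_ms) - 1
    return i if i >= 0 else None


def seek_target_seconds(words, index):
    """Return the playback position (in seconds) for a clicked word."""
    return words[index]["start"] / 1000.0


# Hypothetical sample data in the assumed word format.
sample_words = [
    {"text": "Hello", "start": 100, "end": 450},
    {"text": "world", "start": 500, "end": 900},
    {"text": "again", "start": 1200, "end": 1600},
]
```

In a browser, the same lookup would run on each playback time update to move the highlight, and the seek value would be assigned to the audio element's current time when a word is clicked.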

Demo

App Demo

Source Code

Journey

The web app uses AssemblyAI’s Universal-2 Speech-to-Text model to transcribe audio into text. When a user submits an audio URL, the app sends it to AssemblyAI through their API. Once the transcription is complete, the app uses all the data AssemblyAI returns: the transcribed text, per-word confidence levels, sentiment analysis, content moderation labels, summarization, speaker diarization, and word timestamps. The timestamps keep the transcript synchronized with the audio file, and the remaining data gives users an accurate and insightful transcription experience.
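The request flow described above could look roughly like the following Python sketch, which uses only the standard library against AssemblyAI's REST endpoint. It is an assumption about how such a request might be made, not the app's source: the helper names are hypothetical, the set of enabled features is inferred from the list above, and the model selection is left at the API default.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def build_transcript_request(audio_url):
    """Build the JSON body that enables the features the post mentions."""
    return {
        "audio_url": audio_url,
        "sentiment_analysis": True,  # per-sentence sentiment
        "content_safety": True,      # content moderation labels
        "summarization": True,       # summary of the transcript
        "speaker_labels": True,      # speaker diarization
    }


def _call(method, url, api_key, body=None):
    """Send one authenticated JSON request and return the parsed response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def transcribe(audio_url, api_key, poll_seconds=3.0):
    """Submit an audio URL, then poll until the transcript is ready."""
    job = _call("POST", f"{API_BASE}/transcript", api_key,
                build_transcript_request(audio_url))
    while job["status"] not in ("completed", "error"):
        time.sleep(poll_seconds)
        job = _call("GET", f"{API_BASE}/transcript/{job['id']}", api_key)
    if job["status"] == "error":
        raise RuntimeError(job.get("error", "transcription failed"))
    return job
```

Calling `transcribe(audio_url, api_key)` with a real key would return the completed transcript object, whose `text` and `words` fields (with millisecond timestamps and confidence values) can then drive the display.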

DEV Community
