This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.
What I Built
I built a Conversation Visualizer, a fun and interactive way to bring conversations to life!
This project animates real-time dialogues by transforming speakers into expressive avatars that react to the conversation visually. Each avatar represents a participant, showcasing their emotions and activity dynamically. Whether it’s a heated debate, a casual chat, or a storytelling session, this tool captures and visualizes the essence of the conversation in an engaging and visually appealing way.
Demo
Website Demo
You can try it with these sample audio files.
Audio Sample 1
Audio Sample 2
Audio Sample 3
Backend Code
Frontend Code
Video Demo
Journey
Universal-2, AssemblyAI’s Speech-to-Text model, was instrumental in building my Conversation Visualizer. It provided the speaker labels and the summary that drive the experience.
How I Used AssemblyAI:
1. Speaker Diarization: This feature identified individual speakers and segmented the audio into distinct utterances. Each utterance was then mapped to an animated avatar, so the avatar for the current speaker reacts as the dialogue plays.
2. Summarization: I used the "catchy" summary model with the "headline" summary type to generate a concise, captivating overview of the conversation. This summary acts as a quick takeaway for viewers, capturing the essence of the dialogue in a single headline.
3. Seamless Integration: By combining diarization with summarization, I provided users not only with a dynamic, visual representation of conversations but also a high-level overview for quick context, making the tool versatile for various use cases like podcasts, meetings, or storytelling.
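
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual backend: the request body uses the `speaker_labels`, `summarization`, `summary_model`, and `summary_type` fields of AssemblyAI's transcript API, while the avatar names and the helpers `build_transcript_request`, `assign_avatars`, and `active_speaker` are hypothetical names introduced here. It assumes each utterance carries a speaker label plus start and end times in milliseconds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical avatar names; the project's real avatar set is not shown here.
AVATARS = ["fox", "owl", "cat", "bear"]


def build_transcript_request(audio_url: str) -> dict:
    """Build a JSON body for a transcript request with diarization and a headline summary."""
    return {
        "audio_url": audio_url,
        "speaker_labels": True,       # speaker diarization
        "summarization": True,
        "summary_model": "catchy",
        "summary_type": "headline",
    }


@dataclass
class Utterance:
    speaker: str  # diarization label such as "A" or "B"
    start: int    # milliseconds
    end: int      # milliseconds
    text: str


def assign_avatars(utterances: List[Utterance]) -> Dict[str, str]:
    """Give each speaker an avatar in order of first appearance (assumed scheme)."""
    mapping: Dict[str, str] = {}
    for u in utterances:
        if u.speaker not in mapping:
            # Wrap around if there are more speakers than avatars.
            mapping[u.speaker] = AVATARS[len(mapping) % len(AVATARS)]
    return mapping


def active_speaker(utterances: List[Utterance], t_ms: int) -> Optional[str]:
    """Return the speaker talking at playback time t_ms, or None during silence."""
    for u in utterances:
        if u.start <= t_ms < u.end:
            return u.speaker
    return None
```

A frontend could call something like `active_speaker` on each playback tick to decide which avatar to animate, and show the returned summary headline above the scene.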
The combination of AssemblyAI's diarization and summarization made it possible to turn ordinary audio conversations into visually compelling and insightful narratives, enhancing functionality and creativity.