
Javel Rowe


Linode + DEV Hackathon: WavScribe

What I built

I built an app that makes it easy to transcribe audio files.

Category Submission:

SaaS Superstars: Awarded to the team whose app shows the greatest potential for success as a Software-as-a-Service (SaaS) product or money-making venture.

Integration Innovators: Awarded to the team whose app shows the best integration with other services offered by Linode, such as managed databases or object storage.

App Link

WavScribe

Screenshots

Image of the landing page

Upload complete

Transcription

Description

WavScribe is a web app that aims to make audio transcription accessible to everybody! For now, it's free for up to 1 minute of transcription. You just upload your audio file, wait a short while until your transcription job is completed, then voila! 🗣📝
Sound to words like magic!

After the upload, you get the file ID (copy and keep this!!!).
The next step is to click the status link in the nav bar.
You'll be taken to a page where you can paste the file ID and poll until your transcription appears 😅
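
The status page does the polling for you, but the same idea can be sketched in a few lines of Python. This is only an illustration, not WavScribe's actual code: `fetch_status` is a hypothetical callable you would supply (for example, a small wrapper around an HTTP request), and the `{"status": ..., "text": ...}` response shape is an assumption.

```python
import time
from typing import Callable, Optional


def poll_transcription(
    fetch_status: Callable[[str], dict],
    file_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> Optional[str]:
    """Call fetch_status(file_id) until it reports a finished transcription.

    Returns the transcription text, or None if it never completed
    within max_attempts checks.
    """
    for attempt in range(max_attempts):
        result = fetch_status(file_id)
        # Assumed (hypothetical) response shape: {"status": "...", "text": "..."}
        if result.get("status") == "done":
            return result.get("text", "")
        # Wait before the next check, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    return None
```

Passing the fetch function in as an argument keeps the loop independent of any particular endpoint, so it can be tried out with a fake function before pointing it at a real service.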

Link to Source Code

WavScribe Web App
WavScribe Python Speech2Text

Permissive License

MIT

Background

A few friends of mine started podcasts and wanted a way to generate transcripts easily. It so happened that I was playing around with AI libraries and came upon openai/whisper 😁🤫
I loved the fact that it performed well even on a CPU-only machine, so I spun up this web app to make it easy for them to get their transcriptions!

How I built it

I used Linode Object Storage to keep the audio files and generate presigned URLs. I also used a Linode Managed MySQL database to store the transcriptions and provide them to users!
Finally, I used Linode Domains to handle the DNS records for my domain. Instead of going back and forth between Linode and Porkbun, I just did everything in one place 😃 How cool is that??
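
For anyone curious what generating a presigned URL can look like: Linode Object Storage speaks the S3 API, so one common route is `boto3`. The snippet below is a rough sketch under that assumption, not the code WavScribe runs. The endpoint, bucket, and key in the usage comment are placeholders, and `make_client` and `presigned_audio_url` are names made up for this example.

```python
def make_client(endpoint_url: str, access_key: str, secret_key: str):
    """Build an S3 client pointed at an S3-compatible endpoint."""
    import boto3  # third-party: pip install boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def presigned_audio_url(client, bucket: str, key: str, expires_s: int = 3600) -> str:
    """Return a time-limited download URL for one stored audio file."""
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_s,
    )


# Usage (placeholder values, assumptions only):
# client = make_client("https://us-east-1.linodeobjects.com", "ACCESS_KEY", "SECRET_KEY")
# url = presigned_audio_url(client, "my-audio-bucket", "uploads/episode-1.wav")
```

The URL is signed locally, so anyone holding it can fetch the file until it expires without needing the storage credentials.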

For the frontend I used Next.js with Tailwind 💕
For the speech-to-text part I used openai/whisper through Hugging Face Transformers pipelines. That made it less stressful to work with and to get a demo up and running 🤗
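
Here is roughly what running Whisper through a Transformers pipeline can look like. Treat it as a minimal sketch: the `openai/whisper-tiny` checkpoint and the 30-second chunking are assumptions picked for the example (the post does not say which model size or settings WavScribe uses), and `build_asr` and `transcribe` are hypothetical helper names.

```python
def build_asr(model_name: str = "openai/whisper-tiny"):
    """Create a speech-recognition pipeline (downloads the model on first use)."""
    from transformers import pipeline  # third-party: pip install transformers torch

    return pipeline(
        "automatic-speech-recognition",
        model=model_name,
        chunk_length_s=30,  # split longer audio into 30-second chunks
    )


def transcribe(audio_path: str, asr=None) -> str:
    """Transcribe one audio file and return the plain text."""
    if asr is None:
        asr = build_asr()
    # Decoding a file path requires ffmpeg to be installed on the machine.
    result = asr(audio_path)
    return result["text"].strip()


# Usage (placeholder file name):
# print(transcribe("episode-1.wav"))
```

Building the pipeline once and passing it to `transcribe` for every file avoids reloading the model for each job.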
I learnt a lot of sysadmin stuff along the way. For example:

  • How to set up and configure NGINX with certbot
  • Using a process manager (PM2)
  • Configuring firewalls

On the backend side, I learnt how to use RabbitMQ! I've been hearing a lot about Kafka and other queue systems, so it was cool to get the chance to play around with one.
I used it to absorb the influx of requests and process them sequentially.
Finally, I learnt a bit more about Docker and passing arguments. 🚢
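
To give a feel for the "one job at a time" part of the RabbitMQ setup mentioned above, here is a small worker sketch using the `pika` client. It is an illustration under assumptions, not the project's worker: the queue name `transcription_jobs`, the JSON message body, and the `process_job` callable are all hypothetical.

```python
import json


def make_callback(process_job):
    """Wrap process_job in a pika-style on_message_callback."""

    def on_message(channel, method, properties, body):
        job = json.loads(body)  # assumed message format: a JSON object
        process_job(job)
        # Ack only after the work finishes, so an unfinished job is
        # redelivered if the worker dies partway through.
        channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message


def run_worker(process_job, host: str = "localhost", queue_name: str = "transcription_jobs"):
    """Consume jobs from the queue one at a time until interrupted."""
    import pika  # third-party: pip install pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=queue_name, durable=True)
    # prefetch_count=1: the broker hands this worker one unacked message at a time.
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(queue=queue_name, on_message_callback=make_callback(process_job))
    channel.start_consuming()
```

With a single worker and `prefetch_count=1`, uploads pile up in the queue during a burst and get transcribed in order, instead of all hitting the CPU at once.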

Additional Resources/Info

I want to add a few more features:

  • User registration so you can see your past transcriptions
  • The ability to play the audio and see the timestamped transcription
  • The ability to generate a shareable link to that transcription with timestamps
  • Payment plans to allow for longer transcriptions!
