DEV Community

dev.to staff for The DEV Team

Deepgram x DEV Hackathon Help Thread

If you're participating in the Deepgram Hackathon on DEV, we’re so excited to have you joining us! Need some help with your submission or participation? You’re in the right spot.

If you have any questions about how this contest works (ex: due dates, how to post your submission, picking a category/challenge, etc.) the DEV team is here to help you.

How to Use This Thread

If you’ve browsed Deepgram’s documentation but need some help understanding or implementing a feature, @bekahhw from the Deepgram team will be here to jump in and assist you. This thread is a great resource for you whether you’re taking part in the “Build” or “Innovative Ideas” challenge.

Comment below if you need asynchronous assistance with the Deepgram Hackathon on DEV ❤️

Note: Don’t forget that we’re granting a special profile badge to anyone (outside of the DEV and Deepgram teams) who answers a question and to anyone who ASKS a technical question about Deepgram here in the help thread. Additionally, the DEV team will be randomly selecting one person per badge category to receive $50 USD to the Forem Shop! This goes for anyone in the DEV Community – whether you plan to submit an entry or not. For more information, take a look at the official contest rules here. DEV and Deepgram will wait to answer thread questions until we’ve given the community a chance to hop in.

Open Office Hours with Deepgram Devs

Deepgram will also be hosting open “office hours” on their Twitch Stream every Friday at 6:30 PM UTC throughout the contest (March 11, 18, and 25; April 1 and 8). During these streams, Deepgram will be addressing some of the questions asked in this thread and in the live Twitch chat. If they choose a question you posted to the help thread to talk about on the live stream, you'll be entered to win some Deepgram Swag!


If you'd like to share an update on the progress you're making on your project or if you'd like to connect with other participants, please do so in the Community Discussion Thread!

Top comments (118)

sandy_codes_py profile image
Santhosh (sandy inspires)

@michaeljolley
I'm trying to send real-time audio to Deepgram and get the transcription, but I get this error while doing so.
I'm using pyaudio to send the binary data.

input_audio = stream_in.read(3200)
await ws.send(input_audio)

DEBUG:websockets.client:< CLOSE 1008 (policy violation) DATA-0000 [11 bytes]

michaeljolley profile image
Michael Jolley

Hi @sandy_codes_py, I'm not a Python pro, but I can hook you up with @tonyasims. She's an amazing pythonista and one of Deepgram's Developer Advocates.

One thing that might help, is there a GitHub repo where we can review the full block of code?

sandy_codes_py profile image
Santhosh (sandy inspires)

Please check here
I've used api reference code from the deepgram docs

michaeljolley profile image
Michael Jolley

Sweet. Thanks for that. Tonya is out of the office today, but she'll likely respond Monday.

Mental note: I should really start playing with Python more so I can help more. 🙂

sandy_codes_py profile image
Santhosh (sandy inspires) • Edited

That would do!
I did find a workaround by using her post "Live Transcription With Python and Flask", but that's not really needed here.
I just want to run it locally to do something cool.
Yes, I also have to learn about async and how it works.

michaeljolley profile image
Michael Jolley

Great! I think she wrote several posts like that: "using Flask," "using Django," "using FastAPI," and more.

And yay for learning! 🎉🎉

tonyasims profile image
Tonya Sims

Hi @sandy_codes_py ! Happy Monday! I'm sorry to hear you were having some trouble. Were you able to figure out the issue? Please let me know if you still need some help and we can work through it together.

sandy_codes_py profile image
Santhosh (sandy inspires)

Nope, I followed your Flask repo and did the same. But I really want to run that in plain Python and not on Flask. I have limited knowledge of asyncio and will be learning it soon. I've shared the error and the complete code I'm using in the same thread.

tonyasims profile image
Tonya Sims • Edited

Ok, so if I understand correctly, you followed the Flask example in the tutorial but are still having some issues? Is the issue still with PyAudio?

Also, is this the error message you received? (I want to make sure this is the correct error):

input_audio = stream_in.read(3200)
await ws.send(input_audio)

DEBUG:websockets.client:< CLOSE 1008 (policy violation) DATA-0000 [11 bytes]

sandy_codes_py profile image
Santhosh (sandy inspires) • Edited

Sorry to mislead you!
Actually, I wanted to send the audio feed from PyAudio directly to the Deepgram WebSocket (that's when the above error occurs).
But I found your Flask tutorial and used that instead, which worked on the first go.

The complete code I used can be found here.

tonyasims profile image
Tonya Sims

Oh nice! Good to hear you found a solution with the tutorial 😄.

About asyncio, yeah, I totally understand it can be very confusing. It took me a while to wrap my head around it. How do you plan on learning asyncio? Tutorials? Blog posts? Videos? Something else?

sandy_codes_py profile image
Santhosh (sandy inspires)

Gonna read through this for a while and try some hands-on stuff.
docs.python.org/3/library/asyncio....
I'll try to make a tutorial here once I get a good grasp.

tonyasims profile image
Tonya Sims

Wonderful! Make sure to let me know when it's published so I can read it :)

sandy_codes_py profile image
Santhosh (sandy inspires)

You got it!

dhravya profile image
Dhravya

I want to participate, but the post says that it's open only to 18+.

So, does that mean I'm not eligible? :(

bekahhw profile image
BekahHW

I don't think it means that you aren't eligible for badges for helping. So if you contribute in the Help and Community threads, I think that means you can still earn badges.

dhravya profile image
Dhravya

I have a couple of ideas. If I just write a post with the ideas, would I be eligible? (I could even execute them.)

denvercoder1 profile image
Jonah Lawrence

The post seems to imply that all parts of the Hackathon are 18+ to participate. It is likely there for legal reasons, and I would think that you should be able to participate if your parents approve.

graciegregory profile image
Gracie Gregory (she/her)

Hello! Unfortunately, DEV hackathons are currently only open to community members 18 years of age and older. We would absolutely love to open this up to younger community members, but we are legally unable to at this time. We revisit the topic frequently as a team and intend to open up our challenges to younger folks if and when it's possible for us to do so. I'm so sorry to bear this news. We really value having all of you as part of DEV.

mainrs profile image
mainrs

It's normal for contests to be 18+ due to prizes and legal ruling.
You can ask the dev people if it's OK if you submit but are not considered as a potential winner.

dhravya profile image
Dhravya

Yeah, I've already submitted my project. It's completely fine if I don't win, or win and don't get prizes. At least I got to learn stuff; that's all that matters.

michaeljolley profile image
Michael Jolley

That's a great attitude @dhravya! I'm REALLY looking forward to checking out your submission.

dhravya profile image
Dhravya

Hi there! Thanks a lot!
Here's the post dev.to/dhravya/deepsubtitles-gener...

arndom profile image
Nabil Alamin

Well that was fast, great job 👌👌

dhravya profile image
Dhravya

thanks a lot!

fp profile image
FrankPohl

Hi,
I would like to participate with a .NET solution using your new .NET SDK.
This should use the audio from the microphone to implement a voice dialog within the app (kind of a bot and some speech control for the app).
Unfortunately, I could not find any sample showing how to connect the microphone stream with the CreateLiveTranscriptionClient.
Can you give me a hint on how to connect the mic audio with the SendData function?
Thanks Frank

michaeljolley profile image
Michael Jolley

Hi Frank, do you have a stream of audio to send? As in, have you already captured audio from the mic?

fp profile image
FrankPohl

Hi Michael,
I can get the audio in from the mic and write it to a file with the help of the NAudio package.
There you can see the code that writes the input to a file. It is in the event handler OnDataAvailable but as a comment.
But this is the same method that has the call SendData to send data to the deepgram service.
I have posted my code in this repository github.com/FrankPohl/HealthAssista...
This is a MAUI app because besides console apps, this is the only platform with .NET 6 support.
I use this because my plan was to build an app to help handicapped people to put in data like their blood pressure with speech. Therefore I need a UI and a permanent connection from the microphone to the recognition service.

Btw., you should have mentioned in your sample code with the logging what nuget packages you have used. Took me some time to figure out why "AddSerilog" did not work.

fp profile image
FrankPohl

I can add more information about the format from the Microphone stream.
Samplerate: 48000
Encoding: IeeeFloat
Bits: 32
Channels: 2
Blockalign: 8
Bytes per Second: 384000

Must this be converted or should it be processable from the interface?

I added the file test.wav to the repository. This file was recorded from my microphone.
I also added a function to send a file for conversion to the deepgram service.
Trying to convert this raises error 400 (Bad Request).
I hope someone can come up with a solution so that I can continue to work on my contribution to the hackathon.
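The format values listed above are related by simple arithmetic, which can help when checking what is actually being sent. The following is a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of NAudio or any Deepgram SDK.

```python
def block_align(channels: int, bits_per_sample: int) -> int:
    """Bytes occupied by one sample frame (one sample for every channel)."""
    return channels * bits_per_sample // 8


def bytes_per_second(sample_rate: int, channels: int, bits_per_sample: int) -> int:
    """Raw data rate of an uncompressed audio stream."""
    return sample_rate * block_align(channels, bits_per_sample)


# Microphone format reported above: 48000 Hz, 32-bit float, 2 channels.
mic_block_align = block_align(2, 32)
mic_rate = bytes_per_second(48000, 2, 32)

# For comparison (an assumed target, not stated in the thread):
# 16-bit mono audio at 16000 Hz.
linear16_rate = bytes_per_second(16000, 1, 16)

print(mic_block_align, mic_rate, linear16_rate)
```

The first two results match the Blockalign and Bytes per Second values reported for the microphone stream.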

michaeljolley profile image
Michael Jolley

Thanks @fp. I've been playing with it yesterday and realized it was IeeeFloat. The API doesn't support that format. You'll need to convert it to something else. I noticed in your request you're telling the API you're sending Linear16. That's fine once the raw audio is converted to that.
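As a rough illustration of the conversion described here, the sketch below turns 32-bit float samples into 16-bit signed PCM (Linear16). It is written in Python rather than C# and is not the code from the repository or the gist; the function name and the clamping behaviour are assumptions made for the example.

```python
import struct


def float32_to_linear16(raw: bytes) -> bytes:
    """Convert little-endian 32-bit float samples to 16-bit signed PCM."""
    count = len(raw) // 4
    floats = struct.unpack(f"<{count}f", raw[: count * 4])
    ints = []
    for sample in floats:
        # Float audio is nominally in [-1.0, 1.0]; clamp anything outside it
        # so the scaled value always fits in a 16-bit integer.
        clamped = max(-1.0, min(1.0, sample))
        ints.append(int(clamped * 32767))
    return struct.pack(f"<{count}h", *ints)


# Four float samples, including one that is out of range.
float_bytes = struct.pack("<4f", 0.0, 0.5, -1.0, 2.0)
pcm_bytes = float32_to_linear16(float_bytes)
samples = struct.unpack("<4h", pcm_bytes)
print(samples)
```

Each sample shrinks from four bytes to two, so the converted buffer is half the size of the input. The sample rate and channel count are unchanged by this step.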

fp profile image
FrankPohl

Hello Michael,
I made a conversion for my input.
I do not get an error when I use the CreateLiveTranscriptionClient, but I do not get any result from Deepgram either, and I cannot find out what the problem is. Maybe you can have a look? I updated the GitHub project with the latest code samples.
If I try to convert a file that is PCM with a sample rate of 16000 I get an error from the SDK. The file is TestConverted-16000-Pcm-2.wav and is a recorded and converted audio sample from me.
Would be nice if you could help me with that again.

Frank

michaeljolley profile image
Michael Jolley

I'm looking at this tonight and will let you know something soon!

michaeljolley profile image
Michael Jolley

Okay. I created a quick .NET 6 Console app and have it working based on a slightly modified version of your code. You can find it at gist.github.com/MichaelJolley/b52f...

Hope that helps!

fp profile image
FrankPohl

Thanks for your help Michael.
What a shame to bother you with such a stupid error in my code.
I would like to publish that program as an example on Git, or will you do that?

michaeljolley profile image
Michael Jolley

No way @fp! It took me a while to figure it out so don't feel bad at all. I'm glad it's working for you now.

You can certainly publish it if you'd like.

rutamhere profile image
Rutam Prita Mishra • Edited

Hello @michaeljolley and DG Team 👋

I just wanted to know whether there are any examples that write the transcripts to a separate output file instead of showing the complete JSON on the console. By transcript, I mean only the part of the JSON output that contains the text sentences.

Moreover, let me know how to make sure we scan through the complete length of the audio we are passing instead of just a part of it.

Thanks in advance.

michaeljolley profile image
Michael Jolley

Hiya @21rutam! Good question. Let me make sure I give you the right answer.

First, are you using one of our SDKs? If so, which one?
Second, are you trying to save the whole payload or just the words?

rutamhere profile image
Rutam Prita Mishra • Edited

Yeah, I am using NodeJS to do the job. I want to just save the transcript part of the response payload. And do let me know if we can define the duration of the audio for scanning (Complete audio or Just for a timeframe)
CC: @michaeljolley

michaeljolley profile image
Michael Jolley

You can't define the duration. The API will try to transcribe the entire audio file every time, not just a section.

Using the Node SDK, you could send your request with the utterances feature turned on. (e.g. utterances:true)

Then, when the transcription comes back you can use the .toSRT() or .toWebVTT() functions to generate a text based version of the transcript with timestamps. Then you'd want to save it locally using fs.

Example:

const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

const deepgram = new Deepgram(DEEPGRAM_API_KEY);
const audioSource = { url: URL_OF_FILE };

deepgram.transcription.preRecorded(audioSource, {
  punctuate:  true,
  utterances: true,
  // other options are available
})
.then((response) => {
  const srtTranscript = response.toSRT();

  fs.writeFile(FILENAME_TO_SAVE, srtTranscript, function (err) {
    if (err) {
      return console.log(err);
    }
    console.log("The file was saved!");
  });
})
.catch((err) => {
  console.log(err);
});
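To give a feel for what a timestamped, subtitle-style transcript looks like, here is a small Python sketch that formats a list of utterances as SRT text. It is not the SDK's toSRT() implementation. It assumes each utterance is a dict with start and end times in seconds plus a transcript string, and the sample values are made up.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def utterances_to_srt(utterances) -> str:
    """Build numbered SRT cues, separated by blank lines."""
    blocks = []
    for index, utt in enumerate(utterances, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(utt['start'])} --> {srt_timestamp(utt['end'])}\n"
            f"{utt['transcript']}\n"
        )
    return "\n".join(blocks)


# Hypothetical utterances standing in for a real API response.
example_utterances = [
    {"start": 0.0, "end": 1.5, "transcript": "Hello there."},
    {"start": 61.25, "end": 63.0, "transcript": "Second line."},
]
srt_text = utterances_to_srt(example_utterances)
print(srt_text)
```

Each cue carries an index, a time range, and the spoken text, which is why this format suits subtitles more than a plain-text transcript.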

rutamhere profile image
Rutam Prita Mishra • Edited

That was real quick. But I don't really want it like a subtitles file. Rather I just want the transcript text to be saved to the file and not anything else. I am talking about the sentences in that transcript: part (the one marked in purple).
CC: @michaeljolley

michaeljolley profile image
Michael Jolley

You could just use the transcript property itself:

const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

const deepgram = new Deepgram(DEEPGRAM_API_KEY);
const audioSource = { url: URL_OF_FILE };

deepgram.transcription.preRecorded(audioSource, {
  punctuate:  true,
  // other options are available
})
.then((response) => {
  // Plain text of the first alternative on the first channel
  const transcript = response.results.channels[0].alternatives[0].transcript;

  fs.writeFile(FILENAME_TO_SAVE, transcript, function (err) {
    if (err) {
      return console.log(err);
    }
    console.log("The file was saved!");
  });
})
.catch((err) => {
  console.log(err);
});
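The same idea can be sketched in Python against a saved response payload. The example below assumes the results.channels[].alternatives[].transcript layout used in the snippet above. The helper name, the sample payload, and the output location are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def extract_transcript(response: dict) -> str:
    """Return the first alternative's transcript for every channel, one per line."""
    lines = []
    for channel in response.get("results", {}).get("channels", []):
        alternatives = channel.get("alternatives", [])
        if alternatives:
            lines.append(alternatives[0].get("transcript", ""))
    return "\n".join(lines)


# Trimmed stand-in for a real response payload (values are made up).
sample_response = json.loads("""
{"results": {"channels": [
  {"alternatives": [{"transcript": "Hello world.", "confidence": 0.98}]}
]}}
""")

transcript_text = extract_transcript(sample_response)

# Write only the text, not the surrounding JSON, to a file.
output_path = Path(tempfile.gettempdir()) / "transcript.txt"
output_path.write_text(transcript_text, encoding="utf-8")
print(output_path)
```

Using .get() with defaults means a response with no channels yields an empty string instead of raising, which may or may not be the behaviour you want.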
 
rutamhere profile image
Rutam Prita Mishra

Thanks a bunch @michaeljolley . You rock 🙌🚀

tqbit profile image
tq-bit

Hello guys,

I'm using the Node.js SDK to stream an audio file from a web application. I'm using Typescript and was wondering why I receive the following ts error here.

Do I really need the version property here? Could anything unexpected happen if I don't?

My current code looks something like this (and works fine if I set @ts-ignore on top of it):

import { Deepgram } from '@deepgram/sdk';

...

export default class Transcriber {

   ...

  public translateFromLocalFile = async () => {
    const streamSource = {
      stream: fs.createReadStream(this.filePath),
      mimetype: this.mimeType,
    };

    const response = await this.deepgram.transcription.preRecorded(streamSource, {
      punctuate: true,
    });

    const transscript = this.getHighestRatedTranscript(response)

    return transscript;
  };
  ...
}

...
 
michaeljolley profile image
Michael Jolley • Edited

Great find. That's a bug in our SDK. version is an optional parameter.

michaeljolley profile image
Michael Jolley

New release is on NPM now with that fix. Thanks for catching & reporting it.

tqbit profile image
tq-bit

That was quick. Thanks :-)

pythonperfection profile image
Eli Ostreicher • Edited

Having a little bit of a hard time figuring out the final part of the listen endpoint.
1) In a local environment, how would I go about sending the empty binary message to the server?
2) Upon ending, where is the final transcript JSON now?

Thank you.

michaeljolley profile image
Michael Jolley

Great question. It depends on how you're hitting the API. If you are using the Node SDK you'd use the finish function, as in:

const deepgramSocket = deepgram.transcription.live({ punctuate: true });
deepgramSocket.finish();

If you're communicating with the WebSocket directly with JavaScript you can send a new Uint8Array, as in:

socket.send(new Uint8Array(0));

When the Deepgram API receives that it will finish transcribing the audio, send a final transcript, and then close the WebSocket connection.

pythonperfection profile image
Eli Ostreicher

Appreciate the response.
What about good old Python please?

P.S. Great Twitch btw, really enjoyed it.

michaeljolley profile image
Michael Jolley

Thanks!

For the Python SDK:

deepgramLive = await deepgram.transcription.live()
await deepgramLive.finish()

Against the WebSocket without the SDK would be something like:

await socket.send(b'')
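Put together, the send loop for the raw WebSocket case might look roughly like the sketch below. It assumes only that the connection object has an async send method, as in the line above. FakeSocket is a stand-in used here so the example runs without a network connection or an API key.

```python
import asyncio


async def stream_audio(socket, chunks):
    """Send each audio chunk, then an empty binary message to signal the end."""
    for chunk in chunks:
        await socket.send(chunk)
    # Zero-length binary message: tells the server no more audio is coming.
    await socket.send(b"")


class FakeSocket:
    """Stand-in for a WebSocket connection; records what was sent."""

    def __init__(self):
        self.sent = []

    async def send(self, data):
        self.sent.append(data)


fake = FakeSocket()
asyncio.run(stream_audio(fake, [b"\x00\x01", b"\x02\x03"]))
print(fake.sent)
```

With a real connection, the code receiving messages would keep reading after the empty message is sent, since the final transcript arrives before the server closes the socket.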
 
bekahhw profile image
BekahHW • Edited

Thanks for coming to the stream today!

fp profile image
FrankPohl

@michaeljolley I created a sample that sends audio directly to Deepgram from the microphone.
I have created a repository for this sample here github.com/FrankPohl/DeepGram.NETS...
But there is one thing I do not understand. I resample the audio input to PCM with a sample rate of 16000, but in the Deepgram options 44100 is given as the sample rate. If I change that to 16000, I do not get a transcription. Why is that?

michaeljolley profile image
Michael Jolley

That's a REALLY good question @fp! That code isn't resampling, it's converting it from 32-bit to 16-bit. IeeeFloat is a 32-bit format. We're basically converting a C# float into a C# short. This blog post does a good job of describing the differences between the two.

fp profile image
FrankPohl

@michaeljolley Which code is not resampling? The code in my example on GitHub converts from float to short, that's right. But in a second step it does resampling, because I'm averaging 3 consecutive input values into one output value. I think this means that the sampling rate is reduced from 48000 to 16000. The WAV file that is written in parallel has this sampling rate and sounds all right.
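The averaging step described here can be sketched as follows. This is a Python illustration of the idea, not the C# code from the repository. It assumes mono integer samples, and the function name is made up. Averaging acts as a crude low-pass filter, so a dedicated resampler would likely give cleaner audio.

```python
def downsample_by_averaging(samples, factor=3):
    """Average each group of `factor` samples into one output sample."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete tail
    return [
        int(sum(samples[i:i + factor]) / factor)
        for i in range(0, usable, factor)
    ]


input_rate = 48000
output_rate = input_rate // 3

# Eight input samples: two full groups of three, plus two leftovers.
reduced = downsample_by_averaging([3, 6, 9, -3, -6, -9, 1, 2])
print(output_rate, reduced)
```

Every three input samples become one output sample, so 48000 samples per second become 16000.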

minsu profile image
Minsu

Hi! First of all, Love the idea of this hackathon! Thanks for hosting, Deepgram :)

I have a question about real-time transcription with Deepgram (Node.js SDK).

I tried this tutorial below
developers.deepgram.com/documentat...

I am wondering if there is a way to use 'opus stream' audio instead of 'url' here in the tutorial for transcription??

Thank you!

michaeljolley profile image
Michael Jolley

Solid question. Normally you wouldn't want to send a stream from a URL in, you'd be accessing a microphone and streaming that audio in. I haven't tried opus stream specifically, but you can certainly try it. Is there a reason you need opus stream? That seems like an unusual format for live streaming.

minsu profile image
Minsu • Edited

Thanks for the reply, Michael! Yeah, we use Opus because we are building a real-time transcription Discord bot for deaf gamers! And discord.js uses Opus as its format… My teammates gave up in the middle so I don’t think I can finish it by the 11th, but I still want to get this one done :)

michaeljolley profile image
Michael Jolley

That's amazing! So I think you may be able to stream that in, but you'd want to make sure you specify the encoding, sample rate, etc. in your request.
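When talking to the streaming endpoint directly, those settings are passed as query parameters on the WebSocket URL. The sketch below builds such a URL. The endpoint and the encoding, sample_rate, and channels parameter names reflect the Deepgram streaming API as best understood here and should be checked against the current documentation. The 48000 Hz stereo values are an assumption about Discord voice audio, not something stated in this thread.

```python
from urllib.parse import urlencode

# Assumed streaming endpoint; verify against the current API reference.
LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(encoding: str, sample_rate: int, channels: int = 1) -> str:
    """Attach the audio format settings to the endpoint as query parameters."""
    query = urlencode(
        {"encoding": encoding, "sample_rate": sample_rate, "channels": channels}
    )
    return f"{LISTEN_URL}?{query}"


# Hypothetical settings for an Opus stream; adjust to match the real audio.
opus_url = build_listen_url("opus", 48000, channels=2)
print(opus_url)
```

Whatever values are used, they need to describe the bytes that are actually sent, otherwise the audio may be decoded incorrectly.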

mhasan profile image
Mahmudul Hasan

I tried to make a live speech transcription project, but it failed due to an unknown error. Can someone please help me with that?

This shows in the browser console:
This is error

Here is my GitHub link: github.com/mhasanmeet/DEEPGRAM-liv...

michaeljolley profile image
Michael Jolley

Nice! I can help with that.

It looks like the credentials you're sending are incorrect.

image

That code looks like you're sending the API Key Id rather than the API Key. When you create an API key, be sure you're copying the key itself. You should see a screen like below. You can click that copy icon to copy the actual key. Be sure to copy it, because you won't be able to see it again for security purposes.

image
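A small guard can catch this mix-up before any request is made. The sketch below assumes the Key ID is UUID-shaped while the key itself is not. That is an assumption about how the console displays them, not something this thread confirms. It also assumes the API expects an Authorization header of the form "Token" followed by the key. The function names and sample values are made up.

```python
import uuid


def looks_like_key_id(value: str) -> bool:
    """Heuristic: treat anything that parses as a UUID as a Key ID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def auth_header(api_key: str) -> dict:
    """Build the Authorization header, refusing values that look like a Key ID."""
    if looks_like_key_id(api_key):
        raise ValueError("this looks like an API Key ID, not the API key itself")
    return {"Authorization": f"Token {api_key}"}


# Made-up values for illustration only.
fake_key_id = "123e4567-e89b-12d3-a456-426614174000"
fake_api_key = "0123456789abcdef0123456789abcdef01234567"

headers = auth_header(fake_api_key)
print(headers)
```

Passing fake_key_id to auth_header raises a ValueError, which is easier to diagnose than a rejected connection.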

mhasan profile image
Mahmudul Hasan

Thanks Michael, I'm gonna try it again. I'm really excited about this Deepgram project. I have so many ideas, and I will implement them one by one. And thanks, you guys already made some awesome demo projects. They're really helpful!

michaeljolley profile image
Michael Jolley

Awesome! I can't wait to see what you build. Be sure to reach out if I can help!

sonu0702 profile image
sonu0702 • Edited

I created a Deepgram account, but how do I
select one of the following four categories?

  1. Accessibility Advocates
  2. Analytics Ambassadors
  3. Gram Gamers
  4. Wacky Wildcards

I don't see any of these options on the Deepgram website.

michaeljolley profile image
Michael Jolley

The instructions said "Select one of following categories," but it isn't clear that you're really selecting it in your mind. You're deciding what category you want to build a project for. When you create a submission you'll use this template to create your submission post. That template has a section where you'll enter what category you're submitting your project for.

screenshot of template

sonu0702 profile image
sonu0702

Thanks, it's helpful.