Have you ever wondered how the voice recognition in Noisy Boy from the movie Real Steel works? If you are new to Real Steel and haven't watched the movie yet, watch at least the trailer and then continue here.
I was wondering whether I could do the same with Deepgram's real-time speech-to-text API. After a two-day hustle, surprise, surprise: it just worked.
I initially had the impression that it wouldn't, since there is considerable latency between the moment you say something and the moment the action happens: the audio has to be processed and the result sent back, so I expected the action to take place after 100 ms or so. But it felt instantaneous, or very close to it, at less than 3 ms. Seems like you could take on any league game in World Robot Boxing.
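To give a rough idea of the step between "text comes back" and "action takes place", here is a minimal sketch, not the actual project code. It assumes the speech-to-text service hands you a transcript string and that you map spoken phrases to key presses. The command phrases, the key names, and the `match_command` helper are all made up for illustration, and the Deepgram connection itself is left out.

```python
from typing import Optional

# Hypothetical voice commands mapped to hypothetical game keys.
# None of these come from the original project.
COMMANDS = {
    "left jab": "j",
    "right hook": "k",
    "uppercut": "u",
    "block": "b",
}


def match_command(transcript: str) -> Optional[str]:
    """Return the key for the command spoken last in the transcript, or None."""
    text = transcript.lower()
    best_pos, best_key = -1, None
    for phrase, key in COMMANDS.items():
        # rfind gives the position of the last occurrence, or -1 if absent.
        pos = text.rfind(phrase)
        if pos > best_pos:
            best_pos, best_key = pos, key
    return best_key
```

Picking the phrase that was spoken last means a transcript containing several commands triggers only the most recent one, which seemed like the safer default for a fighting game. Calling `match_command("Noisy Boy, right hook!")` returns the hypothetical key `"k"`.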
- HP Omen 15 (2018)
- 16 GB RAM DDR4
- Intel Core i7-8750H @2.20 GHz
- GTX 1060 - 6 GB GDDR5
- Live Demo on YouTube
- Make GitHub repo public (needs more fine-tuning work)