Voice commands are intriguing, especially with a speech recognition API. After getting exposure to Deepgram's real-time transcription and speech-to-text Python SDK, I thought it'd be cool to scrape a website with my voice.
The way the project works is simple:
- Speak the command "scrape" into my computer's microphone.
- That will kick off the Python scraper, which extracts links from a webpage.
Let's take a closer look at how I built this project using Python, FastAPI, and Deepgram speech-to-text.
Python Code Web Scraper Using a Voice Command With Speech-to-Text
For this voice command scraper, I used one of Python's newest web frameworks, FastAPI. I've already written a blog post about how to get up and running with FastAPI and Deepgram's live transcription using the Python SDK.
Since that FastAPI tutorial is already on Deepgram's blog and covers most of the Python code, I won't go into tremendous detail here.
Let's start with the installation.
I installed two additional Python libraries from my terminal inside of a virtual environment:
pip install beautifulsoup4
pip install requests
Then, I added the import statements to the main.py file:
from bs4 import BeautifulSoup
import requests
import re
BeautifulSoup is for web scraping.
The requests library gets the text from the page source.
The re import matches the links in a specific format.
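To make the role of re concrete, here is a small sketch of what the pattern ^https:// keeps and what it drops. The sample href values are made up for illustration and are not taken from any real page.

```python
import re

# Pattern that accepts only absolute links starting with "https://".
https_only = re.compile("^https://")

# Hypothetical href values, invented for this illustration.
sample_hrefs = [
    "https://example.com/about",
    "http://example.com/insecure",
    "/archive",
    "#top",
    "https://example.org/",
]

# The "^" anchor means the pattern can only match at the start of the string.
kept = [href for href in sample_hrefs if https_only.search(href)]
print(kept)
```

Relative paths, in-page anchors, and plain http links are all filtered out, so only fully qualified https links remain.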
The only new function in this file is scrape_links. I also defined a new list called hold_links, which holds all the links extracted from the webpage. I pass the URL to scrape to requests.get and loop through the matching tags in a BeautifulSoup object. A link from the webpage gets appended to the list each time through the loop.
hold_links = []  # module-level list, shared with the template route

def scrape_links():
    url = "https://xkcd.com/"
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "html.parser")
    # Keep only <a> tags whose href starts with "https://"
    for link in soup.find_all("a", attrs={'href': re.compile("^https://")}):
        hold_links.append(link.get('href'))
    return hold_links
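One thing to keep in mind: because hold_links lives at module level and scrape_links appends to it, saying "scrape" twice would list every link twice. Below is a minimal sketch of one way around that, assuming a hypothetical helper named replace_links that is not part of the original project. The links in the example are invented.

```python
from typing import Iterable, List

hold_links: List[str] = []


def replace_links(new_links: Iterable[str]) -> List[str]:
    """Hypothetical helper: overwrite hold_links with de-duplicated links."""
    # dict.fromkeys keeps the first occurrence of each key, in order.
    unique = list(dict.fromkeys(new_links))
    # Slice assignment mutates the existing list instead of rebinding the
    # name, so any code already holding a reference sees the update.
    hold_links[:] = unique
    return hold_links


# Simulate two voice commands in a row with made-up links.
replace_links(["https://a.example/", "https://b.example/", "https://a.example/"])
replace_links(["https://a.example/", "https://b.example/", "https://a.example/"])
print(hold_links)
```

With this approach, the list reflects only the most recent scrape, no matter how many times the command is spoken.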
Next is the get_transcript inner function.
async def process_audio(fast_socket: WebSocket):
    async def get_transcript(data: Dict) -> None:
        if 'channel' in data:
            transcript = data['channel']['alternatives'][0]['transcript']

            if transcript and transcript == 'scrape':
                scrape_links()
                await fast_socket.send_text(transcript)

    deepgram_socket = await connect_to_deepgram(get_transcript)

    return deepgram_socket
The only change here is the following lines, which check that there's a transcript and that the transcript (the voice command) is "scrape", and then call the scrape_links function:
if transcript and transcript == 'scrape':
    scrape_links()
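The exact-match check works when the transcript arrives as the bare lowercase word. If it ever comes back capitalized or with trailing punctuation (an assumption on my part, since that depends on how the transcription is configured), the comparison would fail. Here is a rough sketch of a more forgiving check, using a hypothetical helper called is_scrape_command:

```python
import string


def is_scrape_command(transcript: str) -> bool:
    """Hypothetical helper: True if the transcript is just the word 'scrape'."""
    # Drop surrounding whitespace and punctuation, then ignore case.
    cleaned = transcript.strip().strip(string.punctuation).strip().lower()
    return cleaned == "scrape"


print(is_scrape_command("Scrape."))          # capitalized with a period
print(is_scrape_command("scrape the page"))  # extra words are not a match
```

The helper still requires the whole utterance to be the single word, so a longer sentence that merely contains "scrape" would not trigger the scraper.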
Last but not least, when rendering the template, I passed in the hold_links list as a context object so the HTML page could display the links using Jinja.
@app.get("/", response_class=HTMLResponse)
def get(request: Request):
    return templates.TemplateResponse("index.html", {"request": request, "hold_links": hold_links})
In the index.html file, I added the following line to the <head></head> section to refresh the page every five seconds:
<meta http-equiv="refresh" content="5" />
The page has to reload after I speak the voice command "scrape" so that it re-renders with the extracted links.
Lastly, in the <body></body> section, I added these lines, which loop over the extracted links and render them to the HTML page, index.html:
<body>
  <p>
    {% for link in hold_links %}
      {{ link }}<br>
    {% endfor %}
  </p>
</body>
Finally, to run the FastAPI Python voice-to-text web scraper, type uvicorn main:app --reload from the terminal and navigate to http://127.0.0.1:8000/.
After speaking the word "scrape" into my computer's microphone, a list of extracted links for the specified URL appeared on the webpage.
If you found my project exciting or have questions, please feel free to Tweet me! I'm happy to help!