I'm using tkinter, a Python module, to create the graphical user interface (GUI) that wraps our application.
This project is an attempt to build a simple version of Windows Cortana in Python. Though my application may not be as robust as Cortana, it can perform some of the functions that Cortana does.
As you can see, the interface contains a button ("listen") that turns on your system's microphone to recognize your speech and process your query.
The application performs the following functions:
1)Greets the user when greeted.
2)Tells the user the time, day and month on request.
3)Opens a web browser and searches for the phrase specified by the user.
4)Opens YouTube and searches for the video specified by the user.
5)Opens Wikipedia and searches for the term specified by the user.
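The following modules are used:
1)Tkinter to create the GUI
2)Pyttsx3 to convert text to a synthetic voice
3)SpeechRecognition and PyAudio to recognize speech input from the user
4)random to pick a greeting at random
5)time to fetch the local time and date
6)Selenium to open a webdriver and automate the search engine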
Make sure the modules are installed on your machine.
This article assumes that you know the fundamentals of these modules.
To keep things simple, I created one large class and a single instance of it. Every method the application needs is defined within the class. The class contains 5 parts.
The first part is an init function which defines our GUI and wraps the button inside it.
The second part contains a method which converts text to speech.
The third part contains a method which recognizes speech from the user and returns text.
The fourth part is not just one function but a group of functions, each with its own job (such as greeting the user, searching the web, or fetching the time and date).
The fifth and last part of the class is a method that processes the input given by the user and produces the desired output.
Alright, let's get started with the code!
#Modules used
from tkinter import *
import pyttsx3
import speech_recognition as sr
import pyaudio
import random
import time
from tkinter import messagebox
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
Having imported the modules we shall start creating the class.
#Main class
class Window(Frame):
    #defining our main window
    def __init__(self, master):
        #initialise the underlying Frame before using it
        super().__init__(master)
        self.master = master
        master.title("DREAM")
        A = Label(master, text="try saying *what can you do*")
        A.pack(side="top")
        Button(master, text="listen", width=100, relief="groove", command=self.Processo_r).pack(side="bottom")

#The lines below go at the end of the file, after every method of the class
root = Tk()
#instance of the class
app = Window(root)
root.geometry("300x50")
#Runs the application until we close
root.mainloop()
This class defines a GUI window with a label saying "try saying what can you do" and a button. Note that the button has an option named "command" which is linked to a class method. This means that when the button is pressed, the "self.Processo_r" method (defined further down) gets executed.
The upcoming methods are defined inside the class.
    def speak(self, output):
        #initiating the speech engine
        engine = pyttsx3.init()
        #speaking the desired output
        engine.say(output)
        engine.runAndWait()
To convert text to speech I'm using the Pyttsx3 module. The method has one parameter, output, which holds the text to be spoken by the synthetic voice.
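As a minimal sketch of the same idea, the variant below accepts an already created engine so that it is not rebuilt on every call, and sets the speaking rate before talking. The `engine` and `rate` parameters are my own additions for illustration and are not part of the article's class; `setProperty("rate", ...)`, `say` and `runAndWait` are the regular Pyttsx3 engine calls.

```python
def speak(text, engine=None, rate=150):
    """Speak `text` aloud.

    `engine` and `rate` are illustrative parameters (assumptions, not in the
    original method). When no engine is passed, a Pyttsx3 engine is created.
    """
    if engine is None:
        # Imported lazily so the function can also be tried with a stand-in engine
        import pyttsx3
        engine = pyttsx3.init()
    # Speaking rate in words per minute
    engine.setProperty("rate", rate)
    engine.say(text)
    # Block until the queued text has been spoken
    engine.runAndWait()
    return engine
```

Calling `engine = speak("hello")` once and then passing that `engine` back in on later calls reuses the same engine object.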
    def speech_recog(self):
        #recognizer class
        r = sr.Recognizer()
        #Specifying the microphone to be activated (the index depends on your machine)
        mic = sr.Microphone(device_index=1)
        try:
            #listening to the user
            with mic as s:
                #calibrate for background noise before listening
                r.adjust_for_ambient_noise(s)
                audio = r.listen(s, timeout=5)
            #Converting the audio to text. I use the Google engine to convert the speech
            #to text, but you may use other engines such as Sphinx, IBM Speech to Text etc.
            speech = r.recognize_google(audio)
            return speech
        #Raised when the engine couldn't recognize the speech
        except sr.UnknownValueError:
            #calling the text to speech function
            self.speak("please try again, couldn't identify")
        #Raised when the microphone can't pick up any speech before the timeout
        except sr.WaitTimeoutError:
            self.speak("please try again")
I use the SpeechRecognition module to recognize speech and convert it to text. When called, this function turns on the microphone, listens to the user, converts the speech to text and returns it.
Now that I have defined three parts of the class, I can start defining the methods that carry out specific functions.
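One thing to keep in mind is that the method returns nothing when recognition fails, and the capitalization of the recognized text can vary. A small helper like the one below could tidy the text before it is compared with known phrases. The name `normalize` is hypothetical and the helper is only a sketch, not part of the original class.

```python
import string

def normalize(speech):
    """Return lowercase text without punctuation or extra spaces.

    Hypothetical helper: returns an empty string when `speech` is None,
    which is what the recognizer method yields when recognition fails.
    """
    if speech is None:
        return ""
    # Drop punctuation such as "?" or "," that the engine may add
    cleaned = speech.lower().translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace into single spaces
    return " ".join(cleaned.split())
```

With this in place, a phrase recognized as "What can you do?" and one recognized as "what can you do" compare as equal.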
    def greet(self):
        #greets the user with a random phrase from A
        A = ["Hi,nice to meet you", "hello", "Nice to meet you", "hey,nice to meet you", "good to meet you!"]
        b = random.choice(A)
        self.speak(b)
    def tell_time(self):
        localtime = time.asctime(time.localtime(time.time()))
        #characters 11 to 15 of the asctime string hold HH:MM
        a = localtime[11:16]
        self.speak(a)
This method uses the time module to get the local time of the user's device and tells it to the user when asked.
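The slice above depends on the fixed layout of the string returned by `time.asctime`. As an alternative sketch, `time.strftime` can format the hours and minutes directly. The function name `spoken_time` and its optional `t` parameter are assumptions made for this example.

```python
import time

def spoken_time(t=None):
    """Return the time as HH:MM.

    `t` is a time.struct_time; when omitted, the current local time is used.
    """
    if t is None:
        t = time.localtime()
    return time.strftime("%H:%M", t)
```

For any given `t`, the result matches `time.asctime(t)[11:16]`, so it could be passed to the speak method in the same way.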
    def tell_day(self):
        localtime = time.asctime(time.localtime(time.time()))
        #the first three characters hold the abbreviated day name
        day = localtime[0:3]
        days = {"Sun": "sunday", "Mon": "monday", "Tue": "tuesday", "Wed": "wednesday",
                "Thu": "thursday", "Fri": "friday", "Sat": "saturday"}
        self.speak("it's " + days[day])
This method uses the time module to get the day of the week and tells it to the user when asked.
    def tell_month(self):
        localtime = time.asctime(time.localtime(time.time()))
        #characters 4 to 6 hold the abbreviated month name
        m_onth = localtime[4:7]
        months = {"Jan": "january", "Feb": "february", "Mar": "march", "Apr": "april",
                  "May": "may", "Jun": "june", "Jul": "july", "Aug": "august",
                  "Sep": "september", "Oct": "october", "Nov": "november", "Dec": "december"}
        self.speak("it's " + months[m_onth])
This method uses the time module to get the month of the year and tells it to the user when asked.
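The day and month lookups can also be left to `time.strftime`, whose `%A` and `%B` codes give the full day and month names. The sketch below assumes the default locale, in which those names are in English. The function names and the optional `t` parameter are hypothetical.

```python
import time

def spoken_day(t=None):
    """Return a phrase such as "it's monday" for the given or current local time."""
    t = time.localtime() if t is None else t
    # %A is the full weekday name (English in the default locale)
    return "it's " + time.strftime("%A", t).lower()

def spoken_month(t=None):
    """Return a phrase such as "it's march" for the given or current local time."""
    t = time.localtime() if t is None else t
    # %B is the full month name (English in the default locale)
    return "it's " + time.strftime("%B", t).lower()
```

Either return value can be handed straight to the speak method.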
    def search(self, web_name):
        self.speak("Searching")
        #Make sure that you have installed the specific driver for your web browser.
        #The executable_path could be different for you.
        #Opening the driver (raw string so the backslashes are kept as they are)
        driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe")
        #Navigating to google
        driver.get('https://www.google.com/')
        #Locating the search box
        search_engine = driver.find_element_by_name("q")
        #Typing the phrase (web_name) and hitting enter to show results
        search_engine.send_keys(web_name + Keys.ENTER)
I have used Selenium to search for the phrase specified by the user. As I said before, make sure you know the fundamentals of the modules used. Using a similar algorithm with some minor changes, let's create methods to open Google Chrome, search YouTube videos and search Wikipedia articles.
    def open_chrome(self):
        self.speak("opening chrome")
        driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe")
        driver.get("https://www.google.com/")
    def play_tube(self, vid_name):
        self.speak("Searching youtube")
        #initializing driver
        driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe")
        #navigating to Youtube
        driver.get('https://www.youtube.com/')
        #Locating the Youtube search box
        search_engine = driver.find_element_by_name("search_query")
        #searching the specified video
        search_engine.send_keys(vid_name + Keys.ENTER)
I have used the same algorithm to search YouTube as I used to search Google. The main difference is that the driver navigates to Google in the former and to YouTube in the latter.
    def search_wiki(self, article):
        #initializing driver
        driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe")
        #Navigating to Wikipedia
        driver.get("https://www.wikipedia.org/")
        #Locating the wikipedia search box
        search_engine = driver.find_element_by_name("search")
        #Searching the specified phrase
        search_engine.send_keys(article + Keys.ENTER)
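Since the three search methods differ only in the site and the name of the search field, the same result can be sketched without a webdriver by building the results URL and handing it to the default browser through the standard `webbrowser` module. This swaps Selenium for a different technique and is not the method used in this article. The names `SEARCH_URLS`, `build_search_url` and `open_search`, as well as the exact result-page URLs, are assumptions made for illustration.

```python
import webbrowser
from urllib.parse import urlencode

# Assumed result-page address and query parameter for each site
SEARCH_URLS = {
    "google": ("https://www.google.com/search", "q"),
    "youtube": ("https://www.youtube.com/results", "search_query"),
    "wikipedia": ("https://en.wikipedia.org/w/index.php", "search"),
}

def build_search_url(site, phrase):
    """Return the URL of the results page for `phrase` on `site`."""
    base, param = SEARCH_URLS[site]
    # urlencode escapes spaces and special characters in the phrase
    return base + "?" + urlencode({param: phrase})

def open_search(site, phrase):
    """Open the results page in the user's default browser."""
    return webbrowser.open(build_search_url(site, phrase))
```

This avoids installing a browser driver, at the cost of not being able to interact with the page afterwards the way Selenium can.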
    def functions(self):
        self.speak("here is a list of what i can do")
        messagebox.showinfo("DREAM functions",
                            "1.Try saying 'Hi','Hello'"
                            + "\n2.Try asking 'What day is this?'"
                            + "\n3.Try asking 'What month is it?'"
                            + "\n4.Try asking 'What time is it?'"
                            + "\n5.You search in google by saying...'Search (or) Google <anything>'"
                            + "\n6.Play youtube by saying'YouTube... <video_name>'"
                            + "\n7.Search in Wikipedia by saying...'wikipedia...<anything>'"
                            + "\n8.To close say 'Bye' or 'Sleep' or 'See you later'")
    def shut(self):
        #bids the user goodbye and quits
        end_greet = random.choice(["bye", "good bye", "take care bye"])
        self.speak(end_greet)
        exit()
    def Processo_r(self):
        speech = str(self.speech_recog())
        #phrases that trigger each function
        A = ["hi", "hello", "hey", "hai", "hey dream", "hi dream", "hello dream"]
        B = ["what day is it", "what day is today", "what day is this"]
        C = ["what month is it", "what month is this"]
        D = ["what time is it", "what is the time", "time please"]
        E = ["bye", "bye dream", "shutdown", "quit"]
        #elif keeps the final else from firing after a command has already matched
        if speech == "What can you do":
            self.functions()
        elif speech in A:
            self.greet()
        elif speech == "who are you":
            self.speak("i'm dream")
            self.speak("your personal assistant")
        elif speech in B:
            self.tell_day()
        elif speech in C:
            self.tell_month()
        elif speech in D:
            self.tell_time()
        elif speech[0:6] == "Google":
            self.search(speech[7:])
        elif speech[0:7] == "YouTube":
            self.play_tube(speech[8:])
        elif speech == "open Chrome":
            self.open_chrome()
        elif speech[0:9] == "Wikipedia":
            self.search_wiki(speech[10:])
        elif speech in E:
            self.shut()
        else:
            self.speak("I am sorry couldn't perform the task you specified")
This method gets executed when we press the listen button in the interface. It calls the speech_recog function that we defined before and stores the returned text. Then it checks the text against a series of "if" conditions and gives the user the desired output.
After putting the code together, the application should be working. Make sure you are connected to the internet. You can also add new methods to the class to perform whatever else you like!
Thank you for reading :-).
If you have any queries, let me know by posting them in the discussion.
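The slicing used for the Google, YouTube and Wikipedia commands can be gathered in one place. The helper below is only a sketch: the name `parse_command` is hypothetical, and it assumes that a command word is followed by a space and then the phrase to search for.

```python
def parse_command(speech):
    """Split recognized text into a (command, argument) pair.

    For "YouTube lo fi beats" the command is "youtube" and the argument is
    "lo fi beats". Any other text is returned lowercased with an empty argument.
    """
    text = speech.strip()
    lowered = text.lower()
    for prefix in ("google", "youtube", "wikipedia"):
        # Match the bare command word or the word followed by a phrase
        if lowered == prefix or lowered.startswith(prefix + " "):
            return prefix, text[len(prefix):].strip()
    return lowered, ""
```

The processing method could then branch on the returned command instead of repeating the slice positions for each site.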