Hi, just wanted to let you know that I got your demo working on an x86_64 Linux desktop running Ubuntu 18.04.3, with a Logitech C922 USB webcam.
There were several challenges in finding all the things that needed tweaking, so I thought I'd share for others who may run into some of these issues.
1) deployment.template.json: Edit azureSpeechServicesKey to match your Azure Cognitive Services Speech service key (not BingKey, as stated in the tutorial).
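For illustration, the key ends up as a module environment variable; the exact surrounding structure depends on the template, so treat this fragment as a sketch with a placeholder value:

```json
{
  "env": {
    "azureSpeechServicesKey": {
      "value": "<your-Speech-service-key>"
    }
  }
}
```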
2) module.json: In each module's folder, edit the "repository" line to point to localhost:5000 (your local registry) instead of glovebox.
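As a sketch of that edit (the module name camera-capture is taken from step 5; your other modules get the same treatment):

```json
{
  "image": {
    "repository": "localhost:5000/camera-capture"
  }
}
```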
3) azure_text_speech.py: Edit the TOKEN_URL to point to the one that Azure provides for you when you set up your speech service. Also edit the BASE_URL to point to the text-to-speech base URL for your region. For example, I had to edit mine to point to my region:
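The actual URLs I used were trimmed above, so here is an illustrative sketch. The region westus2 is an assumption; substitute the region shown in the Azure portal for your own Speech resource, keeping Azure's documented endpoint pattern:

```python
# Illustrative only: 'westus2' is an assumed region; use your own.
REGION = "westus2"

# Token endpoint Azure provides when you set up the Speech service.
TOKEN_URL = "https://{}.api.cognitive.microsoft.com/sts/v1.0/issueToken".format(REGION)

# Region-specific text-to-speech base URL.
BASE_URL = "https://{}.tts.speech.microsoft.com/cognitiveservices/v1".format(REGION)
```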
4) text2speech.py: For whatever reason, wf.getframerate() was not returning a frame rate that my audio device would accept, causing this error:
Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2048
So I ran 'pacmd list-sinks' to find my sink's actual sample rate (48000) and hardcoded it in place of wf.getframerate().
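A minimal self-contained sketch of the workaround (the file name is hypothetical; in the demo this would be the audio returned by the text-to-speech service):

```python
import wave

# Write a tiny WAV file so the sketch is runnable on its own; in the demo,
# text2speech.py opens the audio returned by the Speech service instead.
with wave.open("tts_output.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(44100)              # rate stored in the file header
    wf.writeframes(b"\x00\x00" * 100)

with wave.open("tts_output.wav", "rb") as wf:
    reported_rate = wf.getframerate()   # what text2speech.py trusted

# Workaround: ALSA rejected the reported rate on my machine, so use the
# sink's native rate found via `pacmd list-sinks` instead.
SAMPLE_RATE = 48000  # hardcoded in place of wf.getframerate()
```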
5) predict.py: Lastly, my camera-capture module kept showing connectivity issues, which turned out to be because it kept returning this error response:
Error: Could not preprocess image for prediction. module 'tensorflow' has no attribute 'Session'
The tf.Session API was removed in TensorFlow 2.x, so to fix this, edit the line in predict.py from 'tf.Session()' to 'tf.compat.v1.Session()'.
After all was said and done, I was able to get it working.
Sure, I can share my predict.py edit. It's simply a one-line change on line 123, from tf.Session() to tf.compat.v1.Session():
```python
from urllib.request import urlopen
from datetime import datetime
import time
import tensorflow as tf
from PIL import Image
import numpy as np
import sys


class Predict():
    def __init__(self):
        self.filename = 'model.pb'
        self.labels_filename = 'labels.txt'
        self.network_input_size = 0
        self.output_layer = 'loss:0'
        self.input_node = 'Placeholder:0'
        self.graph_def = tf.compat.v1.GraphDef()
        self.labels = []
        self.graph = None
        self._initialize()

    def _initialize(self):
        print('Loading model...', end='')
        with tf.io.gfile.GFile(self.filename, 'rb') as f:
            self.graph_def.ParseFromString(f.read())
            tf.import_graph_def(self.graph_def, name='')
        self.graph = tf.compat.v1.get_default_graph()

        # Retrieving 'network_input_size' from shape of 'input_node'
        input_tensor_shape = self.graph.get_tensor_by_name(self.input_node).shape.as_list()
        assert len(input_tensor_shape) == 4
        assert input_tensor_shape[1] == input_tensor_shape[2]
        self.network_input_size = input_tensor_shape[1]

        with open(self.labels_filename, 'rt') as lf:
            self.labels = [l.strip() for l in lf.readlines()]

    def _log_msg(self, msg):
        print("{}: {}".format(time.time(), msg))

    def _resize_to_256_square(self, image):
        w, h = image.size
        new_w = int(256 / h * w)
        image.thumbnail((new_w, 256), Image.ANTIALIAS)
        return image

    def _crop_center(self, image):
        w, h = image.size
        xpos = (w - self.network_input_size) / 2
        ypos = (h - self.network_input_size) / 2
        box = (xpos, ypos, xpos + self.network_input_size, ypos + self.network_input_size)
        return image.crop(box)

    def _resize_down_to_1600_max_dim(self, image):
        w, h = image.size
        if h < 1600 and w < 1600:
            return image
        new_size = (1600 * w // h, 1600) if (h > w) else (1600, 1600 * h // w)
        self._log_msg("resize: " + str(w) + "x" + str(h) + " to " + str(new_size[0]) + "x" + str(new_size[1]))
        if max(new_size) / max(image.size) >= 0.5:
            method = Image.BILINEAR
        else:
            method = Image.BICUBIC
        return image.resize(new_size, method)

    def _convert_to_nparray(self, image):
        # RGB -> BGR
        image = np.array(image)
        return image[:, :, (2, 1, 0)]

    def _update_orientation(self, image):
        exif_orientation_tag = 0x0112
        if hasattr(image, '_getexif'):
            exif = image._getexif()
            if exif != None and exif_orientation_tag in exif:
                orientation = exif.get(exif_orientation_tag, 1)
                self._log_msg('Image has EXIF Orientation: ' + str(orientation))
                # orientation is 1 based, shift to zero based and flip/transpose based on 0-based values
                orientation -= 1
                if orientation >= 4:
                    image = image.transpose(Image.TRANSPOSE)
                if orientation == 2 or orientation == 3 or orientation == 6 or orientation == 7:
                    image = image.transpose(Image.FLIP_TOP_BOTTOM)
                if orientation == 1 or orientation == 2 or orientation == 5 or orientation == 6:
                    image = image.transpose(Image.FLIP_LEFT_RIGHT)
        return image

    def predict_url(self, imageUrl):
        self._log_msg("Predicting from url: " + imageUrl)
        with urlopen(imageUrl) as testImage:
            image = Image.open(testImage)
            return self.predict_image(image)

    def predict_image(self, image):
        try:
            if image.mode != "RGB":
                self._log_msg("Converting to RGB")
                image = image.convert("RGB")

            # Update orientation based on EXIF tags
            image = self._update_orientation(image)
            image = self._resize_down_to_1600_max_dim(image)
            image = self._resize_to_256_square(image)
            image = self._crop_center(image)
            cropped_image = self._convert_to_nparray(image)

            with self.graph.as_default():
                with tf.compat.v1.Session() as sess:
                    prob_tensor = sess.graph.get_tensor_by_name(self.output_layer)
                    predictions, = sess.run(prob_tensor, {self.input_node: [cropped_image]})

            result = []
            for p, label in zip(predictions, self.labels):
                truncated_probablity = np.float64(round(p, 8))
                if truncated_probablity > 1e-8:
                    result.append({
                        'tagName': label,
                        'probability': truncated_probablity,
                        'tagId': '',
                        'boundingBox': None})

            print('[%s]' % ', '.join(map(str, result)))

            response = {
                'id': '',
                'project': '',
                'iteration': '',
                'created': datetime.utcnow().isoformat(),
                'predictions': result}
            return response
        except Exception as e:
            self._log_msg(str(e))
            return 'Error: Could not preprocess image for prediction. ' + str(e)
```
Hi,
Thanks for sharing. I am receiving the same error as in your point 5. In my predict.py file, "tf.Session" is missing. Any help would be great.
If possible, could you share your predict.py file?