I'm working on speech recognition [SOLVED]


(Claudio Pelosi) #1

I’m working on speech recognition and I’m relying on an AGI script that uses the Google Cloud Platform, specifically its speech recognition service.
Does anyone have experience with the Google Cloud Platform API? Everything seems to work, but I believe it is not giving me permission to access the service.


#2

It’s not ‘you’ that needs permission, it’s the asterisk user. Do the ‘init’ for Google Cloud as that user.


(Claudio Pelosi) #3

What you say sounds interesting, but I’m afraid I don’t fully understand you. Here is the example I am using; this code is in “/etc/asterisk/extensions_override_freepbx.conf”.

Here is the script:
[ivr-3] ; Claudio
include => ivr-3-custom
exten => fax,1,Goto(${CUT(FAX_DEST,^,1)},${CUT(FAX_DEST,^,2)},${CUT(FAX_DEST,^,3)})

exten => s,1,Answer()
exten => s,n,Set(TIMEOUT(digit)=5)
exten => s,n,agi(googletts.agi,"Try to talk",it)
exten => s,n(record),agi(speech-recog.agi,it-IT)
exten => s,n,agi(googletts.agi,"${utterance}",it)
;; Wait for digit:
exten => s,n,WaitExten()

exten => h,1,Hangup()


#4

It’s not necessarily the code. To use the Google STT engine, the user needs an authorized ‘token’ installed in their ‘home’ directory.


(David55) #5

For information, someone has just posted something on using Google speech recognition, with Asterisk, on the Asterisk community forum:


(Claudio Pelosi) #6

Thanks for the directions. Maybe what you are telling me has to do with this?
https://cloud.google.com/storage/docs/authentication#user_accounts


#7

That is exactly what I am referring to. You have to do it for the asterisk user too.
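
A minimal sketch of what that could look like from the shell. It assumes Asterisk runs as the system user "asterisk" and that a service-account JSON key has already been downloaded from the Google Cloud console; the function name and the key path are made up for illustration, so adjust them to your setup:

```shell
# Hypothetical helper: authorize gcloud for the "asterisk" user.
# Assumption: Asterisk runs as the system user "asterisk".
# $1 is the path to a service-account JSON key (your own file).
authorize_asterisk_gcloud() {
  key_file=$1
  # -H sets HOME to the asterisk user's home directory, so gcloud stores
  # the credentials there and not in the invoking user's home.
  sudo -u asterisk -H gcloud auth activate-service-account --key-file="$key_file" &&
  # List the accounts gcloud knows about when run as the asterisk user.
  sudo -u asterisk -H gcloud auth list
}

# Usage (as root or a sudo-capable user):
#   authorize_asterisk_gcloud /path/to/service-account-key.json
```

If the second command lists the service account as active, anything Asterisk launches as that user should see the same credentials.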


(Hugh Janus) #8

IBM also offers a free speech-to-text service. Try googling: “free ibm voicemail transcription”


(Claudio Pelosi) #9

I am having difficulty implementing this. I have seen this online, but I think it is outdated by now. Does anyone use Google to recognize the voice?


#10

For Google you can use something as simple as

same => n,Record(/tmp/${prefix}number.wav,2,5)
same => n,Set(voice=${SHELL(gcloud ml speech recognize /tmp/${prefix}number.wav --language-code='en-US'|sed -n -e  '/transcript/p'|sed  -e 's/[^0-9]*//g' -e 's/.*\([0-9]\{2\}$\)/\1/'|tr -d '\n')})

almost out of the box in a custom context. If you want ‘close to realtime’ you will need some sort of EAGI script to feed ‘gcloud ml speech recognize’
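
To see what the sed/tr part of that Set() line does, you can try it outside Asterisk on canned input. A rough sketch: the function name is hypothetical, and the sample JSON is made up for illustration (it is not captured gcloud output), assuming only that the real output has a line containing "transcript":

```shell
# Hypothetical helper mirroring the pipeline in the Set() line above:
# keep only the line containing "transcript", strip every non-digit,
# reduce the result to its trailing two digits, drop the newline.
extract_digits() {
  sed -n -e '/transcript/p' \
    | sed -e 's/[^0-9]*//g' -e 's/.*\([0-9]\{2\}$\)/\1/' \
    | tr -d '\n'
}

# Made-up sample input. The "confidence" line is dropped by the first sed,
# so its digits never reach the result.
printf '%s\n' \
  '{ "results": [ { "alternatives": [ {' \
  '    "confidence": 0.94,' \
  '    "transcript": "4 2"' \
  '} ] } ] }' | extract_digits
```

Because every non-digit is stripped, this only suits prompts where the caller is expected to say a number.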


(Claudio Pelosi) #11

OK, everything works. It was me who had not configured the key on Google properly. I recommend that everyone find a good tutorial before using the Google APIs; it helped me a lot!


(D E) #12

I would love me some nice tutorial.

I want to be able to pick up the phone and, instead of getting a dial tone, get a recording saying “who do you want to reach”. I say who I’m trying to call (from maybe 20 ‘speed dials’), and the destination is dialled for me.

Is this possible?!


#13

The Vosk stuff I posted on the other thread works just fine. I use it to replace all DTMF-signalled interfaces like IVRs (and passwords; it can also ‘fingerprint’ who is speaking after a little training), chasing them robo-callers :wink:
