I'm working on speech recognition [SOLVED]

I’m working on speech recognition and I’m relying on an AGI script that uses Google Cloud Platform, specifically the speech recognize feature.
Does anyone have experience with the Google Cloud Platform API? Everything seems to work, but I believe it is not giving me permission to access the service.

It’s not ‘you’ that needs permission, it’s the asterisk user. Do the ‘init’ for that user for Google Cloud.

What you say sounds interesting, but I don’t fully understand it, which is my fault. Here is the example I am using; this code is inserted in “/etc/asterisk/extensions_override_freepbx.conf”.

Here is the script:
[ivr-3] ; Claudio
include => ivr-3-custom
exten => fax,1,Goto(${CUT(FAX_DEST,^,1)},${CUT(FAX_DEST,^,2)},${CUT(FAX_DEST,^,3)})

exten => s,1,Answer()
exten => s,n,Set(TIMEOUT(digit)=5)
exten => s,n,agi(googletts.agi,"Try to talk",it)
exten => s,n(record),agi(speech-recog.agi,it-IT)
exten => s,n,agi(googletts.agi,"${utterance}",it)
;; Wait for digit:
exten => s,n,WaitExten()

exten => h,1,Hangup()

It’s not necessarily the code. To use the Google STT engine, the user needs an authorized ‘token’ installed in its ‘home’ directory.

For information, someone has just posted something on using Google speech recognition, with Asterisk, on the Asterisk community forum:

Thanks for the directions. Maybe what you are telling me has to do with this?
https://cloud.google.com/storage/docs/authentication#user_accounts

That is exactly what I am referring to; you have to do it for the asterisk user too.
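As a rough sketch of what "doing it for the asterisk user" could look like with a service-account key: the script below runs gcloud as that user with `sudo -u ... -H`, so the credentials end up in that user's home directory (gcloud keeps them under `~/.config/gcloud`). The user name `asterisk` and the key path are assumptions, so substitute your own. By default it only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch: authorize gcloud for the account Asterisk runs as.
# ASSUMPTIONS: the service user is "asterisk"; the key path is hypothetical.
SERVICE_USER="${SERVICE_USER:-asterisk}"
KEY_FILE="${KEY_FILE:-/etc/asterisk/gcloud-key.json}"

# Dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# -H makes sudo use the target user's home directory, so the
# credentials are stored for that user and not for root or yourself.
run sudo -u "$SERVICE_USER" -H gcloud auth activate-service-account --key-file="$KEY_FILE"

# Show which account is active for that user afterwards.
run sudo -u "$SERVICE_USER" -H gcloud auth list
```

If the second command lists the service account as active when run for the asterisk user, the AGI script should be running with the same credentials.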

IBM also offers a free speech-to-text service. Try googling: “free ibm voicemail transcription”

I am having difficulty implementing this. I have seen this online, but I think it is outdated by now. Does anyone use Google to recognize voice?

For Google you can use something as simple as

same => n,Record(/tmp/${prefix}number.wav,2,5)
same => n,Set(voice=${SHELL(gcloud ml speech recognize /tmp/${prefix}number.wav --language-code='en-US'|sed -n -e  '/transcript/p'|sed  -e 's/[^0-9]*//g' -e 's/.*\([0-9]\{2\}$\)/\1/'|tr -d '\n')})

This works almost out of the box in a custom context. If you want ‘close to real time’, you will need some sort of EAGI script to feed ‘gcloud ml speech recognize’.
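To see what the filter chain inside the Set(voice=...) line does, it can be tried outside Asterisk on canned input. The sketch below wraps the same sed/tr pipeline in a function and feeds it a sample response. The JSON shape is an assumption about what `gcloud ml speech recognize` prints, and the function names are made up for illustration.

```shell
#!/bin/sh
# Same filters as in the dialplan line:
#   1. keep only lines containing "transcript"
#   2. delete every non-digit character
#   3. if at least two digits remain, keep only the last two
#   4. strip newlines so the dialplan variable gets a bare value
extract_last_two_digits() {
    sed -n -e '/transcript/p' |
    sed -e 's/[^0-9]*//g' -e 's/.*\([0-9]\{2\}$\)/\1/' |
    tr -d '\n'
}

# ASSUMPTION: a response shaped roughly like this sample.
sample_response() {
cat <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.98,
          "transcript": "my number is 1 2 3 4"
        }
      ]
    }
  ]
}
EOF
}

# Prints: 34
sample_response | extract_last_two_digits
echo
```

Note that the digits in the confidence value are ignored because that line does not contain the word "transcript", and a transcript with fewer than two digits passes through the last sed expression unchanged.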

OK, everything works. It was me who had not configured the key on Google properly. I recommend everyone find a good tutorial before using the Google APIs; it helped me a lot!

I would love me some nice tutorial.

I want to be able to pick up the phone and, instead of getting a dial tone, get a recording saying “who do you want to reach”. I say who I’m trying to call (from maybe 20 ‘speed dials’), and the destination is dialled for me.

Is this possible?!

The Vosk stuff I posted on the other thread works just fine. I use it to replace all DTMF-signalled interfaces like IVRs (and passwords; it can also ‘fingerprint’ who is speaking after a little training), chasing them robo-callers :wink:
