I finally got the chance to sit down and put together an email processing script for outgoing voicemail emails that does the following:
Checks to see if outgoing email is HTML and changes Content-Type accordingly
Converts outgoing WAV attachments to MP3
Sends the WAV attachment to Microsoft Bing speech recognition and inserts the transcribed message into the outgoing email (NOTE: this requires an Azure account with Cognitive Services)
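For anyone curious what the first step might look like, here is a minimal sketch of the HTML check using only the Python standard library. It is not the actual script from this post. The function name `fix_content_type` and the tag list in `HTML_HINT` are assumptions made for illustration, and the "looks like HTML" test is a rough heuristic.

```python
import re
from email import message_from_string
from email.message import Message

# Hypothetical heuristic: a few common tags are taken as a sign of an HTML body.
HTML_HINT = re.compile(r"<\s*(html|body|p|br|div|table)\b", re.IGNORECASE)


def fix_content_type(raw_email: str) -> Message:
    """Parse a raw email and mark its first text part as text/html if it looks like HTML."""
    msg = message_from_string(raw_email)

    # Find the part to inspect: the message itself, or the first text/* part
    # of a multipart message (voicemail emails usually carry an attachment).
    part = msg
    if msg.is_multipart():
        for candidate in msg.walk():
            if candidate.get_content_maintype() == "text":
                part = candidate
                break
        else:
            return msg  # no text part found, nothing to change

    payload = part.get_payload(decode=True) or b""
    body = payload.decode("utf-8", errors="replace")

    # Only touch the header when the body looks like HTML and is not already marked so.
    if HTML_HINT.search(body) and part.get_content_type() != "text/html":
        part.set_type("text/html")  # keeps existing parameters such as charset
    return msg
```

Feeding it a message whose body contains `<html>` should flip `text/plain` to `text/html`, while a plain-text body is left alone.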
Regarding the accuracy, it really depends on how clear the caller’s voice is. If there’s a lot of noise in the background it can have trouble. I find it works pretty well for my needs.
Implemented this today. It works great, except that the Bing Speech to Text REST API only allows up to 15 seconds of audio. It works for short voicemails but obviously not for the longer ones.
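One way to deal with the length limit is to check the duration before sending anything to the service. Below is a rough sketch using Python's standard `wave` module. The 15-second figure is the one reported above, the helper names are made up for illustration, and `wave` only reads plain PCM WAV files, so other voicemail formats (wav49/GSM, for example) would need converting first.

```python
import io
import wave

MAX_SECONDS = 15.0  # limit reported in this thread; verify against your service


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def fits_limit(wav_bytes: bytes, max_seconds: float = MAX_SECONDS) -> bool:
    """True if the recording is short enough to send in one request."""
    return wav_duration_seconds(wav_bytes) <= max_seconds


def make_silence(seconds: float, rate: int = 8000) -> bytes:
    """Build a silent mono 16-bit WAV in memory (only used to try the helpers)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```

A script could call `fits_limit()` and skip transcription, or add a note to the email, when the message is too long.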
With UCP 13, you could add a new field to the voicemail and put the transcript into the message's text file, which would then show up in UCP. Getting it from the script to that text file is the part I stopped on. It is basically just writing text to a file, though.
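The "text to file" part could be as small as the sketch below. The field name `transcript` and the function name are hypothetical, and I have not confirmed that UCP reads such a field. The sketch only appends a `key=value` line to the end of the message's text file and leaves everything else in it untouched.

```python
from pathlib import Path


def append_transcript(msg_txt_path: str, transcript: str, field: str = "transcript") -> None:
    """Append 'field=transcript' as a single line to a voicemail message text file."""
    # Collapse newlines and repeated whitespace so the value stays on one line.
    one_line = " ".join(transcript.split())

    path = Path(msg_txt_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    # Make sure the new entry starts on its own line.
    separator = "" if (not existing or existing.endswith("\n")) else "\n"

    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"{separator}{field}={one_line}\n")
```

Since the line is appended at the end of the file, it lands in whatever section comes last, so check your own message files before relying on this.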
I implemented this and am now wondering: has anyone had any luck getting the output to appear more “grammatically correct”? The text I get back is one long run-on sentence.
Supposedly, Watson should be able to handle punctuation, grammar, etc.
Hmmm… I set this up because a customer wanted it. The accuracy through Bing is actually pretty spot-on, but it stops transcribing after the first pause, which is a no-go.
The Cognitive Services Bing Speech API is being retired on November 1, 2021, so it won’t work after then anyway. Has anybody looked at translating this to the new Azure Speech Service? I am trying right now, but I have almost zero experience with Python. If I get it translated, I will post it here. The Microsoft docs say it’s easy, but it is certainly not easy for me. Will update as I progress (or don’t) this weekend.
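For anyone attempting the same port, here is a hedged starting point for calling the Azure Speech Service REST endpoint for short audio with only the standard library. The endpoint path, header names, and response fields are written from my reading of the Microsoft docs, so treat them as assumptions and check them against the current documentation. The function names, the default sample rate, and the region and key values are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_speech_request(wav_bytes: bytes, region: str, key: str,
                         language: str = "en-US",
                         sample_rate: int = 16000) -> urllib.request.Request:
    """Build (but do not send) a speech-to-text request for a PCM WAV recording."""
    query = urllib.parse.urlencode({"language": language, "format": "simple"})
    # Assumed endpoint layout for the short-audio REST API.
    url = (f"https://{region}.stt.speech.microsoft.com"
           f"/speech/recognition/conversation/cognitiveservices/v1?{query}")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": f"audio/wav; codecs=audio/pcm; samplerate={sample_rate}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")


def extract_text(response_body: str) -> str:
    """Pull the transcript out of a 'simple' format JSON response, or return ''."""
    data = json.loads(response_body)
    if data.get("RecognitionStatus") == "Success":
        return data.get("DisplayText", "")
    return ""


def transcribe(wav_bytes: bytes, region: str, key: str) -> str:
    """Send the audio and return the transcript (performs a real network call)."""
    request = build_speech_request(wav_bytes, region, key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_text(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the pieces easy to test without an Azure account. Only `transcribe()` needs real credentials.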
Works perfectly, and getting set up with IBM is trivial. Right now they are giving out a $200 credit when you set up a real account (not a free one).
Why isn’t this a standard thing in FreePBX? You could ship the script as it is without any credentials, so that it just wouldn’t work until you added them. But DANG, it works perfectly and the recognition is really spot-on!
This is so cool and really ups the “polish” level of FreePBX. Lots of hosted providers offer transcription, and now we can too!