
TTS Engine Custom - Amazon Polly - 24 languages


(Jerson Jr) #1

Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Polly includes 47 lifelike voices spread across 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

Important remark

The following procedure was performed in a test environment. It modifies the propolys-tts.agi file, after which FreePBX will show the following alert:

Module: “Text To Speech”, File: "/var/www/html/admin/modules/tts/agi-bin/propolys-tts.agi altered"

Create Custom Engine:

Sample project to demonstrate usage of the AWS SDK for Node.js:

cd /opt/
git clone https://github.com/awslabs/aws-nodejs-sample
cd aws-nodejs-sample
npm install
npm install optimist
npm install child_process

vim script.js

// Load the SDK and helper modules
var argv = require('optimist').argv;
const AWS = require('aws-sdk')
const Fs = require('fs')
var child_process = require('child_process');
// Create a Polly client
const Polly = new AWS.Polly({
    accessKeyId: "accessKeyId here",
    secretAccessKey: "secretAccessKey here",
    signatureVersion: 'v4',
    region: 'us-east-1'
})

let params = {
    'Text': argv.text,
    // 'Text': 'Tatiana',
    'OutputFormat': 'mp3',
    'SampleRate': '8000',
    'VoiceId': 'Vitoria'
}

Polly.synthesizeSpeech(params, (err, data) => {
    if (err) {
        console.log(err.code)
    } else if (data) {
        if (data.AudioStream instanceof Buffer) {
            // Save the MP3, then decode it to WAV with lame
            Fs.writeFile(argv.mp3, data.AudioStream, function(err) {
                if (err) {
                    return console.log(err)
                }
                console.log("The file was saved!")
                var output = child_process.execSync('lame --decode ' + argv.mp3 + ' ' + '-b 8000' + ' ' + argv.wav + '.wav');
            })
        }
    }
})
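
The conversion step above builds the lame command by string concatenation, so the file names pass through a shell. A minimal sketch of an alternative, assuming the same argv.mp3 and argv.wav values: build the arguments as an array and call child_process.execFileSync, which runs the program directly without a shell. The names buildLameArgs and decodeMp3 are hypothetical and not part of the original script; the lame arguments are the same ones the original command uses.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: argument list for decoding an MP3 to WAV with lame.
// wavBase is the output path without extension, as in the original script.
function buildLameArgs(mp3Path, wavBase) {
  return ['--decode', mp3Path, '-b', '8000', wavBase + '.wav'];
}

// Runs lame without a shell, so spaces or quotes in paths are passed literally.
function decodeMp3(mp3Path, wavBase) {
  return execFileSync('lame', buildLameArgs(mp3Path, wavBase));
}
```

Inside the writeFile callback, decodeMp3(argv.mp3, argv.wav) would take the place of the execSync call.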

vim /var/lib/asterisk/agi-bin/propolys-tts.agi

case 'node':

                        exec($enginebin." /opt/aws-nodejs-sample/script.js --mp3=/var/lib/asterisk/sounds/tts/$engine-tts-$hash.mp3 --text='$text' --wav=/var/lib/asterisk/sounds/tts/$engine-tts-$hash");
                        break;
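
Because the exec line wraps $text in single quotes, a prompt containing an apostrophe (for example "don't") ends the quoted string early and the command breaks. On the PHP side, escapeshellarg() is the usual tool for this. The sketch below only illustrates the quoting rule, written in JavaScript to match script.js; shellQuote is a hypothetical helper, not something the TTS module provides.

```javascript
// Hypothetical helper: wrap a string in single quotes for a POSIX shell,
// replacing each embedded ' with '\'' (close quote, escaped quote, reopen).
function shellQuote(text) {
  return "'" + String(text).replace(/'/g, "'\\''") + "'";
}

console.log(shellQuote("don't hang up")); // prints 'don'\''t hang up'
```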

Installing Lame to convert mp3 to wav on FreePBX CentOS 7:

The Amazon Polly service can return audio as mp3, pcm, or ogg. I used mp3, which then had to be converted to an 8000 Hz wav file:

yum -y install lame
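
Since Polly can also return raw PCM, one possible alternative to the lame conversion is to request 'OutputFormat': 'pcm' and wrap the bytes in a WAV header inside script.js. The sketch below assumes Polly's PCM output is 16-bit signed, mono, little-endian (as the AWS documentation describes it) at the 8000 Hz requested via SampleRate. pcmToWav is a hypothetical helper, and this path was not tested in the original post.

```javascript
// Hypothetical helper: prepend a 44-byte RIFF/WAVE header to raw PCM data.
// Assumes 16-bit signed little-endian mono samples.
function pcmToWav(pcm, sampleRate = 8000) {
  const channels = 1;
  const bytesPerSample = 2;
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);   // RIFF chunk size
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);               // fmt chunk size
  header.writeUInt16LE(1, 20);                // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * channels * bytesPerSample, 28); // byte rate
  header.writeUInt16LE(channels * bytesPerSample, 32);              // block align
  header.writeUInt16LE(8 * bytesPerSample, 34);                     // bits per sample
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);       // data chunk size
  return Buffer.concat([header, pcm]);
}
```

In the synthesizeSpeech callback, writing pcmToWav(data.AudioStream) to argv.wav + '.wav' would then replace both the mp3 write and the lame call.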

Now you can use the tts module with Amazon Polly:

Finally

Go to Applications => Text to Speech and create your TTS entry with the Polly engine.


(Andrew Nagy) #2

Instead of rebuilding sox, you should just use lame for mp3 files; you already installed it.

[root@x core]# lame
LAME 64bits version 3.99.5 (http://lame.sf.net)

usage: lame [options] <infile> [outfile]

    <infile> and/or <outfile> can be "-", which means stdin/stdout.

Try:
     "lame --help"           for general usage information
 or:
     "lame --preset help"    for information on suggested predefined settings
 or:
     "lame --longhelp"
  or "lame -?"              for a complete options list

(Jerson Jr) #3

Script modified for lame!


(Tony Lewis) #4

Why not submit a pull request to us in Git for this, so it’s part of the module for others?


(Jerson Jr) #5

You mean through http://git.freepbx.org/projects/FREEPBX/repos/tts/browse/ ?


(Chris Dolese) #6

yes - and you should follow up on this …
if you have issues or questions let me know


(Jerson Jr) #7

I did not succeed and gave up, but it works with the modifications above!

I did not change anything else. I just edited /var/www/html/admin/modules/tts/agi-bin/propolys-tts.agi by hand, since that cannot be done through the module’s web interface.


(Chris Dolese) #8

I’m going to PM you - please look for it


(William Bond) #9

I believe I am missing something here. I have followed the instructions listed, I have everything installed and created the script.js file. In regard to the propolys-tts.agi file, I have modified the following area:

    switch ($engine) {
            case 'text2wave':
                    exec($enginebin." -f ".$format['rate']." -o $tmpwavefil$
                    break;
            case 'flite':
                    exec($enginebin." -f $textfile -o $tmpwavefile");
                    break;
            case 'swift':
                    exec($enginebin." -p audio/channels=1,audio/sampling-ra$
                    break;
            case 'pico':
                    exec($enginebin." -f ".$format['rate']." -o $tmpwavefil$
                    break;
            case 'node':
                    exec($enginebin." /opt/aws-nodejs-sample/script.js --mp$
                    break;
            default:
                    debug("$engine is not a valid engine!", 1);
            break;

Lame was already installed with the most recent FreePBX Distro I am using, so I didn’t need to do that.

So my question is, how do you add it to the FreePBX GUI? When I go to Applications => Text to Speech, all I see is flite for engines. Personally, I never intend to replace things when I make changes; I try to make additions. So I do not want to remove flite or overwrite the way it works. I would like a new option for ‘AWS/Polly’ (or in this case ‘node’). Thank you!


(Jerson Jr) #10

Change:

$format = array(
"ext" => "sln",
"rate" => "8000"

To:

$format = array(
"ext" => "wav",
"rate" => "8000"

And:

case 'node':
exec($enginebin." /opt/aws-nodejs-sample/script.js --mp3=/var/lib/asterisk/sounds/tts/$engine-tts-$hash.mp3 --text='$text' --wav=/var/lib/asterisk/sounds/tts/$engine-tts-$hash");
break;


(William Bond) #11

Yes, I did have the rest of that line, it ran off the screen. So I do have the full exec line for ‘node’.


(William Bond) #12

I’m not sure I follow the first portion… what am I doing with the $format?


(Jerson Jr) #13

This code is in the first lines of propolys-tts.agi.


(William Bond) #14

OK, I figured that out; your message got chopped up a little when you edited it. So I updated the $format array, but it still does not show as an available engine. Do I need to add something to Settings >> Text to Speech Engines?


(Jerson Jr) #15

I’ve done the update, I’ll set it up again because something may have changed.


(William Bond) #16

Do I need to add something to Settings >> Text to Speech Engines? It doesn’t show up in that drop down list either though.


(Jerson Jr) #17

Yes, click on Add TTS Engine

Engine Name: Node or Polly
Engine Path: /usr/bin/node

And Submit


(Jerson Jr) #18
  • Engine Name: select Custom

(William Bond) #19

Thank you, that was what I needed. I thought it was probably simple, I just wasn’t sure exactly what to enter. I’m on my way to testing now :)


(William Bond) #20

Alright, I have my AWS account set up and entered the accessKeyId and secretAccessKey. I set up a new TTS entry and chose the new engine, then I submitted and applied the config, but nothing is played when I try to use the TTS. If I switch it back to ‘flite’ it works fine. From here, I’m not sure where I would check to see if it is actually doing what it should. I did log in to AWS, though, and it says the key has not been used yet.
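
One way to narrow this down is to check whether script.js ever wrote its output files, since an unused key suggests the script may not be reaching Polly at all. The sketch below assumes the file naming from the exec line in the first post and reports which of the expected files exist. checkOutputs and the example base path are hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: report whether the .mp3 and .wav for a given base
// path exist, and how large each one is (0 is reported for a missing file).
function checkOutputs(basePath) {
  return ['.mp3', '.wav'].map((ext) => {
    const file = basePath + ext;
    const exists = fs.existsSync(file);
    return { file, exists, bytes: exists ? fs.statSync(file).size : 0 };
  });
}

// Example base path only; the real one contains the engine name and a hash.
console.log(checkOutputs('/var/lib/asterisk/sounds/tts/example-tts-hash'));
```

If neither file appears after a test call, running the node command from the exec line by hand, with the same arguments, should show whether the script itself reports an error.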