TTS Engine Custom - Amazon Polly - 24 languages

If you are willing to offer some guidance here and there, I’m willing to put in the work. :slight_smile:

SSML Example:

<p><s>This is a test<break time=\\\"3000ms\\\"/>, <prosody rate=\\\"-15%\\\">Another test.</prosody></s></p>

:yum:

SSML won’t work in the TTS module as it is currently written (it speaks the markup; “less than p greater than less than s greater than This is a test less than break…”). I’m thinking that, at least for the polly engine, the user will be able to choose either “plain text” or “SSML” mode for entering their text in the “TTS module of the future” so that it is passed to the engine correctly depending on that selection.
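To make the "plain text" vs. "SSML" idea a bit more concrete, here is a minimal sketch of how the selected mode might be turned into the request parameters. The function name `buildPollyParams` and its arguments are made up for illustration and are not part of the TTS module. The parameter names (`Text`, `TextType`, `OutputFormat`, `SampleRate`, `VoiceId`) are the ones Polly's SynthesizeSpeech call expects.

```javascript
// Hypothetical helper (not part of the TTS module): build the parameter
// object for a Polly SynthesizeSpeech request from the user's mode choice.
function buildPollyParams(text, mode, voiceId = 'Joanna') {
  if (mode !== 'text' && mode !== 'ssml') {
    throw new Error(`Unknown mode: ${mode}`);
  }
  let body = text;
  if (mode === 'ssml' && !/^\s*<speak[\s>]/.test(text)) {
    // SSML input needs a <speak> root element; add it if the user left it out.
    body = `<speak>${text}</speak>`;
  }
  return {
    Text: body,
    TextType: mode, // 'text' is spoken literally, 'ssml' is parsed as markup
    OutputFormat: 'mp3',
    SampleRate: '8000',
    VoiceId: voiceId,
  };
}

console.log(buildPollyParams('This is a test', 'text'));
console.log(buildPollyParams('This is <break time="500ms"/> a test', 'ssml'));
```

In "text" mode the string is passed through untouched, so any angle brackets in it would be read aloud rather than interpreted.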

This works for me… You can choose to add SSML or not. This is for an automated wifi password changer that I made.

js file:
let params = {
  'Text': '<speak>' + argv.text + '</speak>',
  'TextType': 'ssml',
  'OutputFormat': 'mp3',
  'SampleRate': '8000',
  'VoiceId': 'Joanna',
};

php file:
$text = '<amazon:auto-breaths>Here is sunflower spelled out. S<break time="200ms"/>u<break time="200ms"/>n<break time="200ms"/>f<break time="200ms"/>l<break time="200ms"/>o<break time="200ms"/>w<break time="200ms"/>e<break time="200ms"/>r.</amazon:auto-breaths>';

Notes:
<say-as interpret-as="spell-out">Sunflower</say-as> seemed too fast to understand.
<prosody rate="60%"> slows the speech down, but it makes the letters sound weird when spoken.
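Typing a break tag between every letter by hand gets tedious, so here is a small sketch that builds the same kind of string for any word. `spellOutSsml` is a hypothetical helper name, and the 200 ms default simply mirrors the value used in the PHP example.

```javascript
// Escape the characters that would otherwise break the SSML markup.
function escapeXml(ch) {
  return ch.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical helper: put a fixed pause between every character of a word,
// as an alternative to <say-as interpret-as="spell-out">.
function spellOutSsml(word, pauseMs = 200) {
  const gap = `<break time="${pauseMs}ms"/>`;
  return Array.from(word.trim()).map(escapeXml).join(gap);
}

const ssml = `<speak>Here is sunflower spelled out. ${spellOutSsml('Sunflower')}.</speak>`;
console.log(ssml);
```

Raising `pauseMs` is one way to slow the spelling down without touching the prosody rate.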

This file doesn’t seem to exist, am I missing something?

I’m getting this when I try to use it:

[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [1@ext-tts:1] NoOp("SIP/vitel-inbound-0000002a", "TTS: Test") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [1@ext-tts:2] NoOp("SIP/vitel-inbound-0000002a", "Using: polly") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [1@ext-tts:3] Answer("SIP/vitel-inbound-0000002a", "") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [1@ext-tts:4] AGI("SIP/vitel-inbound-0000002a", "propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: Launched AGI Script /var/lib/asterisk/agi-bin/propolys-tts.agi
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: TTS AGI Started
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: Generated WAV file: /var/lib/asterisk/sounds/tts/polly-tts-2d2bd27c76cd34d0584fdb2a36bc7919.sln
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: TXT file: /var/lib/asterisk/sounds/tts/polly-tts-2d2bd27c76cd34d0584fdb2a36bc7919.txt
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: Text to speech wave file doesnt exist, lets create it.
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: Executing polly
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: polly is not a valid engine!
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: File was not created!
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: propolys-tts.agi,"This is a test of Polly!",polly,/usr/bin/node: TTS AGI end
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] res_agi.c: <SIP/vitel-inbound-0000002a>AGI Script propolys-tts.agi completed, returning 0
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [1@ext-tts:5] Goto("SIP/vitel-inbound-0000002a", "app-announcement-5,s,1") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx_builtins.c: Goto (app-announcement-5,s,1)
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [s@app-announcement-5:1] GotoIf("SIP/vitel-inbound-0000002a", "1?begin") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx_builtins.c: Goto (app-announcement-5,s,4)
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [s@app-announcement-5:4] NoOp("SIP/vitel-inbound-0000002a", "Playing announcement Polly TTS Test") in new stack
[2019-02-13 14:54:40] VERBOSE[23111][C-00000020] pbx.c: Executing [s@app-announcement-5:5] Playback("SIP/vitel-inbound-0000002a", ",noanswer") in new stack
[2019-02-13 14:54:40] WARNING[23111][C-00000020] file.c: File does not exist in any format
[2019-02-13 14:54:40] WARNING[23111][C-00000020] file.c: Unable to open (format (ulaw)): No such file or directory
[2019-02-13 14:54:40] WARNING[23111][C-00000020] app_playback.c: Playback failed on SIP/vitel-inbound-0000002a for ,noanswer

Tom, you don’t need to use the modified propolys-tts.agi file anymore, as the necessary changes were recently merged into the modules proper. You just need to upgrade to the Edge version of ‘tts’ and ‘ttsengines’ (the install-pollytts script I wrote should do this for you):

fwconsole ma --edge downloadinstall tts ttsengines

That said, it does look like someone on my team moved the propolys-tts.agi file into an archival directory on our file server. I have moved it back to root so you can grab the file individually again if you really want to. Again, these changes to support Polly have already been integrated into the modules and are in the Edge repo pending the normal QA process for transition to the Stable repo. Using the Edge repo instead of the hand-modified propolys-tts file will preclude the security warnings since they are properly signed (with the same needed changes to support Polly).

So, as my brain has been churning on this, I’m thinking in the new “modular” reincarnation of the TTS module, the most sensible (and rather easy) way to make the TTS elements universally available to other modules would be to hook into System Recordings.

Example: When you create a new TTS element in its module, you can select the “static” option to pre-generate the audio file and register it in System Recordings (TTS_) like you would a normal custom audio file. Then it can be used anywhere in the GUI where you would naturally select a static sound/announcement to playback to your caller. When you update the TTS element text, it regenerates the file on submission.
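A rough sketch of what the "static" pre-generation could look like follows. Everything here is an assumption for illustration: the function names are made up, the file naming only imitates the `<engine>-tts-<32 hex chars>` pattern visible in the logs in this thread, and hashing the text with MD5 is a guess at how that name could be derived. `synthesize` stands in for whatever engine call actually produces the audio.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hypothetical naming scheme modeled on the log output in this thread.
// Using an MD5 hash of the text is an assumption, not confirmed behavior.
function cachePath(dir, engine, text, ext = 'wav') {
  const hash = crypto.createHash('md5').update(text).digest('hex');
  return path.join(dir, `${engine}-tts-${hash}.${ext}`);
}

// Generate the audio only when no file exists yet for this exact text.
// `synthesize` is a caller-supplied function (text) => Buffer.
function pregenerate(dir, engine, text, synthesize) {
  const file = cachePath(dir, engine, text);
  if (!fs.existsSync(file)) {
    fs.mkdirSync(dir, { recursive: true });
    fs.writeFileSync(file, synthesize(text));
  }
  return file;
}
```

Because the file name depends on the text, editing the text of a TTS element yields a new file on the next submission, while unchanged text reuses the existing one.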

Going this route with dynamic elements won’t work, tho. I’m thinking that we’ll have to create various TTS “modules” for different dynamic elements; like a weather module, <insert another dynamic example here cuz it’s 4am lol>, etc.

That said, I’m curious if anyone else would like to assist with this new iteration of TTS. I know the folks at Sangoma don’t have it on their radar any time soon so I’d like to get the ball rolling ourselves. Looking at the System Recordings module, we may have a bit of work to do there to get a proper hook for TTS (hasn’t been touched in earnest since May '17).

I’m going to throw up a clean forum topic to get deeper into this. I’ll post link here when I get that going.

I demoed a system for the local archdiocese that had various “static” messages for closing school (Snow Day, Too Hot, Weasels got loose, The Rapture, etc.) but option 9 was “enter the text of a message and we’ll send that out to the masses”. I coded everything up so that it ran out of System messages and had the TTS program write the “variable” message to a “Close-9” message in the same space. Made the IVR work a lot easier.

I don’t see why using the System Recordings “Local” area wouldn’t work for this.

@TheWebMachine

Any traction on this?

The audio generated through FreePBX is quiet, about 12 dB quieter, but when I use the same scripts from the command line and put the audio files in manually (overwriting the audio files that FreePBX generated for TTS), they are the proper volume. What am I doing wrong on the FreePBX side? How do I increase the volume of the generated files?
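For a sense of scale, 12 dB corresponds to roughly a 4x difference in sample amplitude. The sketch below shows that arithmetic and one possible workaround: scaling the samples of an already generated file. It does not explain why the FreePBX path comes out quieter. It assumes the file is headerless signed 16-bit little-endian PCM (which is what an sln file is expected to contain), and both function names are made up for illustration.

```javascript
// Convert a gain in decibels to a linear amplitude factor.
function dbToFactor(db) {
  return Math.pow(10, db / 20);
}

// Return a new Buffer with every 16-bit little-endian sample scaled by `db`
// decibels. Results are clamped to the signed 16-bit range so loud samples
// clip instead of wrapping around. A trailing odd byte, if any, is zeroed.
function applyGainDb(pcm, db) {
  const factor = dbToFactor(db);
  const out = Buffer.alloc(pcm.length);
  for (let i = 0; i + 1 < pcm.length; i += 2) {
    const scaled = Math.round(pcm.readInt16LE(i) * factor);
    out.writeInt16LE(Math.max(-32768, Math.min(32767, scaled)), i);
  }
  return out;
}

console.log(dbToFactor(12).toFixed(2)); // prints 3.98
```

Applying this to a file would mean reading it into a Buffer, calling `applyGainDb(buffer, 12)`, and writing the result back. This must not be run on a file with a WAV or MP3 header, since the header bytes would be scaled too.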

The log for the first run of a test:
[2019-10-28 13:00:18] VERBOSE[4738][C-0000001e] pbx.c: Executing [4@ext-tts:4] AGI("PJSIP/201-0000002b", "propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node") in new stack
[2019-10-28 13:00:18] VERBOSE[4738][C-0000001e] res_agi.c: Launched AGI Script /var/lib/asterisk/agi-bin/propolys-tts.agi
[2019-10-28 13:00:18] VERBOSE[4738][C-0000001e] res_agi.c: propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node: TTS AGI Started
[2019-10-28 13:00:18] VERBOSE[4738][C-0000001e] res_agi.c: propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node: Generated WAV file: /var/lib/asterisk/sounds/tts/polly-tts-c324350ace888373f168b9ea3e366d4c.wav
[2019-10-28 13:00:18] VERBOSE[4738][C-0000001e] res_agi.c: propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node: TXT file: /var/lib/asterisk/sounds/tts/polly-tts-c324350ace888373f168b9ea3e366d4c.txt
[2019-10-28 13:00:19] VERBOSE[4738][C-0000001e] res_agi.c: propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node: Streaming the generated wave.
[2019-10-28 13:00:19] WARNING[4738][C-0000001e] format_wav.c: Does not begin with RIFF
[2019-10-28 13:00:19] WARNING[4738][C-0000001e] file.c: Unable to open format wav
[2019-10-28 13:00:19] VERBOSE[4738][C-0000001e] res_agi.c: <PJSIP/201-0000002b> Playing 'tts/polly-tts-c324350ace888373f168b9ea3e366d4c.slin' (escape_digits=#) (sample_offset 0) (language 'en')
[2019-10-28 13:00:19] WARNING[4738][C-0000001e] mp3/interface.c: Junk at the beginning of frame 49443304
[2019-10-28 13:00:22] VERBOSE[4738][C-0000001e] res_agi.c: propolys-tts.agi,"Testing, 1, 2, 3, Testing",polly,/usr/bin/node: TTS AGI end
[2019-10-28 13:00:22] VERBOSE[4738][C-0000001e] res_agi.c: <PJSIP/201-0000002b>AGI Script propolys-tts.agi completed, returning 0

I have not found a way to get wav working (yes, I even tried editing “ext” => “sln”,“rate” => “8000” to wav), but slin works just fine. My suspicion is that the problem is in the sln conversion, but again, running the same commands from the console that FreePBX runs produces the proper volume.
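One thing the warnings hint at: the value in "Junk at the beginning of frame 49443304", read as hex bytes 49 44 33 04, is the ASCII text "ID3" followed by a version byte. Together with "Does not begin with RIFF", that suggests the file saved with the .wav name actually holds MP3 data with an ID3 tag. That is an inference from the log, not something confirmed in this thread. A quick way to check a generated file is to look at its first bytes, as in this sketch (`sniffAudioType` is a made-up helper name):

```javascript
// Guess the container of an audio file from its leading bytes.
function sniffAudioType(buf) {
  if (buf.length >= 12 &&
      buf.toString('latin1', 0, 4) === 'RIFF' &&
      buf.toString('latin1', 8, 12) === 'WAVE') {
    return 'wav';
  }
  if (buf.length >= 3 && buf.toString('latin1', 0, 3) === 'ID3') {
    return 'mp3'; // MP3 data preceded by an ID3v2 tag
  }
  if (buf.length >= 2 && buf[0] === 0xff && (buf[1] & 0xe0) === 0xe0) {
    return 'mp3'; // bare MPEG audio frame sync
  }
  return 'unknown'; // may be headerless PCM such as sln, which has no signature
}

// The four bytes reported in the warning above:
console.log(sniffAudioType(Buffer.from([0x49, 0x44, 0x33, 0x04]))); // prints mp3
```

To test a real file, read it with `fs.readFileSync(path)` and pass the resulting Buffer in. If it reports mp3 for a file with a wav extension, the conversion step is probably being skipped or is writing to a different name.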

Is anyone aware of a way to allow the playback to be interrupted by a key press? We use this to play dynamic messages that precede our IVR. It would be ideal if the customer did not have to wait to hear the whole message before being able to press the applicable menu option.

Thanks in advance for any insights.

Disregard, I see it already answered in the earlier posts.

Rather than use Playback, use Background, which will play the file, but will allow for an interrupt.

You can also use “Read”:

[weather]
exten => s,1,Answer()
exten => s,n,Read(zip,custom/polly-Please-enter-a-zip-code,5, ,1,10)
exten => s,n,AGI(/var/lib/asterisk/agi-bin/custom/weather.php,${zip})
exten => s,n,Hangup()

Read(variable[,filename][,maxdigits][,option][,attempts][,timeout])
https://www.voip-info.org/asterisk-cmd-read/

Hey Guys,

Does anybody have an idea why, when a TTS file is created for the first time, the caller gets kicked out of the call before the TTS wav file is played?

When the caller calls a second time, it works right away, because the wav was created during the first call and the second call just picks up the existing file.

[2020-07-18 23:54:05] VERBOSE[17425][C-000001a4] pbx.c: Executing [s@app-announcement-16:4] NoOp("SIP/Sipgate-00000266", "Playing announcement TEST") in new stack
[2020-07-18 23:54:05] VERBOSE[17425][C-000001a4] pbx.c: Executing [s@app-announcement-16:5] Playback("SIP/Sipgate-00000266", "custom/TELL_MoinMoin_DE,noanswer") in new stack
[2020-07-18 23:54:05] VERBOSE[17425][C-000001a4] file.c: <SIP/Sipgate-00000266> Playing 'custom/TELL_MoinMoin_DE.slin' (language 'de_DE')
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx.c: Executing [s@app-announcement-16:6] Goto("SIP/Sipgate-00000266", "ext-tts,7,1") in new stack
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx_builtins.c: Goto (ext-tts,7,1)
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx.c: Executing [7@ext-tts:1] NoOp("SIP/Sipgate-00000266", "TTS: Test DE") in new stack
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx.c: Executing [7@ext-tts:2] NoOp("SIP/Sipgate-00000266", "Using: aws-polly-de") in new stack
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx.c: Executing [7@ext-tts:3] Answer("SIP/Sipgate-00000266", "") in new stack
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] pbx.c: Executing [7@ext-tts:4] AGI("SIP/Sipgate-00000266", "agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node") in new stack
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] res_agi.c: agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node: TTS AGI Started
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] res_agi.c: agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node: Generated WAV file: /var/lib/asterisk/sounds/tts/aws-polly-de-tts-85d5f4db9f01d8be55d991e19952d03e.wav
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] res_agi.c: agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node: TXT file: /var/lib/asterisk/sounds/tts/aws-polly-de-tts-85d5f4db9f01d8be55d991e19952d03e.txt
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] res_agi.c: agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node: Text to speech wave file doesnt exist, lets create it.
[2020-07-18 23:54:06] VERBOSE[17425][C-000001a4] res_agi.c: agi://127.0.0.1/propolys-tts.agi,"Hallo, ich bin es und teste gerade die Engine! Klasse, oder",aws-polly-de,/usr/bin/node: Executing aws-polly-de
[2020-07-18 23:54:14] NOTICE[1142] chan_sip.c: Peer '501' is now UNREACHABLE! Last qualify: 47
[2020-07-18 23:54:14] NOTICE[1142] chan_sip.c: Peer '502' is now UNREACHABLE! Last qualify: 46
[2020-07-18 23:54:18] VERBOSE[17425][C-000001a4] res_agi.c: <SIP/Sipgate-00000266>AGI Script agi://127.0.0.1/propolys-tts.agi completed, returning 4
[2020-07-18 23:54:18] VERBOSE[17425][C-000001a4] pbx.c: Spawn extension (ext-tts, 7, 4) exited non-zero on 'SIP/Sipgate-00000266'

It kicked me out at about 2020-07-18 23:54:15.

To me it looks like it takes about 12 seconds from requesting the wav until it is ready, and I was kicked out after about 9 to 10 seconds.

We normally have a 20 ms ping on a 35k DSL line, and for years I have run FreePBX on a Raspberry Pi 2, which is usually more than enough for this kind of request.

Thanks a lot

Jan
