TTS Polly - special characters and long texts not processed

Hello,
I have installed polly-tts with Amazon text-to-speech, but I am facing a problem:
It seems that longer texts (around 50 words) are not processed, and umlauts (ä, ü, etc.) are forwarded to the module as HTML entities, e.g. ä is converted to &auml;, which causes big problems.

Any idea how to fix it? We are a community pharmacy and need to change our messages to customers nearly every day concerning Covid-19: testing options, availability of free Covid-19 tests, etc.

Regards,
Gunther

Less than ideal, but can you break the messages up into several?

I am not sure if the FreePBX TTS module is caching the generated audio files.

So perhaps you generate these files manually to save cost and “fix” this issue?

Hi, that would be an immense effort and would lead to chaos.

The messages change nearly every day…

It seems like it is getting stored like this in the DB.

MariaDB [asterisk]> select * from tts;
+----+------+-------------------------------------------------------------------------------------+------------------------+--------+
| id | name | text                                                                                | goto                   | engine |
+----+------+-------------------------------------------------------------------------------------+------------------------+--------+
|  1 | Test | It seems that longer texts are not processed (50 words) and Umlaute (&auml;,&uuml;) | app-blackhole,hangup,1 | flite  |
+----+------+-------------------------------------------------------------------------------------+------------------------+--------+
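
If the text really is stored with HTML entities, as reported in the first post, it could in principle be decoded again before it is handed to the speech engine. A minimal Python sketch of that decoding step, assuming standard named entities such as `&auml;`; the sample string is made up for illustration and this is not something the FreePBX TTS module does today:

```python
import html

# Example of text with umlauts stored as HTML entities (illustrative sample).
stored = "Umlaute (&auml;,&uuml;)"

# html.unescape converts named and numeric entities back to Unicode characters.
decoded = html.unescape(stored)
print(decoded)
```

A fix inside the module would need the equivalent step in its own code before the text is sent to the engine.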

I guess your best route is to file a bug report at issues.freepbx.org.

If it is important enough to you, review the code, or perhaps hire someone to fix it: Browse FreePBX / tts - FreePBX GIT

If you do so, please consider contributing the code back to the project.

https://wiki.freepbx.org/display/FOP/Code+License+Agreement

Hi all. We had a hand in the creation of the first Polly TTS support for FreePBX. So, it would be prudent to review our guide here for all the caveats. You are correct that there are limits (partly in the Polly API interface; partly in the TTS module) that we have little control over.

That said, here’s the TL;DR that matters to this post:

  1. You can’t use single quotes/apostrophes (’), including in contractions (use dont instead of don’t), double quotes ("), or carriage returns/newlines, as these will break the script on the backend and/or storage into the FreePBX DB (software devs out there will recognize why quotes can be bad news without special handling).

  2. You can’t use “foreign” or accented characters as these won’t be stored in the DB correctly due to encoding differences, as you’ve already identified…these chars get stored using their “character entities” because that’s how the browser is parsing them in the form.

  3. You can’t use SSML in the FreePBX TTS module. This is a parsing limitation of the TTS module itself, as it breaks when trying to parse the field.

  4. AWS Polly has a firm 3000-char limit per API request. This is their limit, not ours. Long text will need to be broken up into multiple chained TTS elements in your dialplan.
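
Points 1, 2 and 4 above can be handled by pre-processing the message before pasting it into the TTS module. The following is a rough Python sketch, not part of FreePBX; the function names `sanitize` and `split_for_polly` are hypothetical, and stripping accents (ä becomes a) is an assumption based on the note that Polly usually copes with missing accents:

```python
import re
import unicodedata

POLLY_MAX_CHARS = 3000  # per-request limit mentioned in point 4


def sanitize(text):
    """Remove characters that the list above says break the TTS module."""
    # Drop straight and curly single/double quotes (point 1).
    text = re.sub(r"['\"\u2018\u2019\u201c\u201d]", "", text)
    # Turn newlines, carriage returns and runs of whitespace into one space.
    text = re.sub(r"\s+", " ", text).strip()
    # German sharp s has no decomposition, so replace it explicitly.
    text = text.replace("\u00df", "ss")
    # Decompose accented letters and drop the combining marks (point 2).
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_for_polly(text, limit=POLLY_MAX_CHARS):
    """Split text at spaces into chunks of at most `limit` characters.

    Assumes no single word is longer than `limit`.
    """
    chunks = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    message = "Wir haben's heute ge\u00f6ffnet,\n\"t\u00e4glich\" neue Tests."
    for part in split_for_polly(sanitize(message)):
        print(part)
```

Each chunk would then go into its own TTS element, chained one after another in the dialplan.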

In short, AWS Polly support in the TTS module is far from perfect and you may have to get a little creative in your text. Polly is pretty smart and can typically handle pronunciation of words without their appropriate apostrophes and accents, as it will recognize the word you “meant” to provide it.

The TTS module is one of those that Sangoma has allowed to languish for years now. Back when we first wanted to introduce Polly support, this was when v15 was still in early development, and we were hopeful that TTS would get a rewrite like many other modules did…but Sangoma told us flat out that they had zero interest in revamping this module [because it doesn’t make them any money].

If TTS is going to get that rewrite, it will have to be by the community for the community. Don’t expect Sangoma to pick up the ball here. I suspect they will reject (“Won’t Fix”) nearly any ticket submitted asking for improvements to TTS.

Finally, caching of TTS elements by Asterisk does happen and is very important, as AWS charges for Polly…you wouldn’t want static text constantly generating new audio files unless you want to pay for all that wasted processing, so if the text element hasn’t changed, the previously generated/downloaded audio is used. You do have the option of using AWS’ Polly page yourself to generate audio from more complex text or from SSML, and downloading that audio file to use as a System Recording in FreePBX, presuming you don’t want to change it all that often. The TTS module works fine for most needs with Polly, including supporting Asterisk Channel Variables in the text, but there will be cases where generating your own audio files with Polly for static audio elements would be preferred:

https://console.aws.amazon.com/polly/home/SynthesizeSpeech
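
For messages that change daily, the same thing can be scripted instead of using the console. Below is a minimal Python sketch using the `boto3` SDK's `synthesize_speech` call. The helper names, the file naming scheme and the German voice `Vicki` are assumptions chosen for illustration, and hashing the text to get a stable file name only mimics the "unchanged text reuses old audio" idea; it is not how the TTS module itself caches:

```python
import hashlib


def polly_request(text, voice_id="Vicki", ssml=False):
    """Build the keyword arguments for Polly's synthesize_speech call."""
    return {
        "Text": text,
        "TextType": "ssml" if ssml else "text",
        "VoiceId": voice_id,  # example German voice; pick whichever fits
        "OutputFormat": "mp3",
    }


def cache_filename(text, voice_id="Vicki"):
    """Derive a stable file name so identical text maps to the same file."""
    digest = hashlib.sha256(f"{voice_id}:{text}".encode("utf-8")).hexdigest()
    return f"polly-{digest[:16]}.mp3"


def synthesize_to_file(text, path, voice_id="Vicki", ssml=False):
    """Call Polly and write the MP3 to `path`.

    Requires the third-party boto3 package plus configured AWS
    credentials and region; calling this incurs Polly charges.
    """
    import boto3  # imported here so the helpers above work without it

    client = boto3.client("polly")
    response = client.synthesize_speech(**polly_request(text, voice_id, ssml))
    with open(path, "wb") as fh:
        fh.write(response["AudioStream"].read())


if __name__ == "__main__":
    text = "Kostenlose Covid-19-Tests sind heute verf\u00fcgbar."
    print(cache_filename(text))
```

Because the text is sent straight to Polly here, umlauts and SSML do not pass through the FreePBX form, and the resulting file can be uploaded as a System Recording.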

I hope this helps!

