Task processor queue stasis/m:channel:all-xxx reaching 500 scheduled tasks again

Hi,

FreePBX Distro SNG7-PBX16-64bit-2208-2 (16.0.23 / Asterisk 18.14), with around 20 PJSIP trunks and no local extensions in use (yet). The PBX is acting as a quasi SBC for now, routing calls between the PJSIP trunks. No AMI/ARI apps, no Stasis() apps.

I'm not using the Inbound/Outbound Routes modules either – all call routing logic lives in extensions_custom.conf, in the [ext-did-custom] context. On every call I clear and rebuild a HASH() storing the details of all trunks, just to make routing decisions easier.
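A rough sketch of what that per-call HASH() rebuild can look like in extensions_custom.conf (trunk names, keys, and the routing step here are made up for illustration, not taken from my actual config):

```
[ext-did-custom]
; rebuild a small trunk lookup table on every call (illustrative only)
exten => _X.,1,NoOp(Inbound on ${CHANNEL(endpoint)} to ${EXTEN})
 same => n,Set(HASH(trunkinfo,carrier-a)=PJSIP/carrier-a)
 same => n,Set(HASH(trunkinfo,carrier-b)=PJSIP/carrier-b)
 same => n,Set(target=${HASH(trunkinfo,carrier-a)})
 same => n,Dial(${target}/${EXTEN})
```

As far as I know, HASH() is implemented as plain channel variables under the hood, so the rebuild itself should be cheap.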

Using a trunk dial hook in order to manipulate B-leg headers.
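For context, one standard way to touch B-leg headers is a Dial b() pre-dial handler, which runs on the newly created outbound channel before the INVITE goes out (the context and header names below are made up for illustration):

```
[add-bleg-headers]
; executed on the B-leg channel before it dials out
exten => s,1,Set(PJSIP_HEADER(add,X-Routing-Hint)=some-value)
 same => n,Return()
```

invoked from Dial as, e.g., `Dial(PJSIP/${EXTEN}@my-trunk,30,b(add-bleg-headers^s^1))`.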

One of the trunks is special: our application sits at the far end of it, making and taking pure SIP calls. Calls from/to all the other trunks connect through it.

This FreePBX server is handling around 500 calls per day like this, and the queues of two stasis/m:channel:all-xxx topics shoot up to above 500 scheduled tasks right after the first couple of calls, eventually settling around 550 and 750.

I have another FreePBX unit, set up exactly the same way but handling around 5,000 calls per day, and the same stasis topics reach roughly the same queue depth.

On neither of these PBX servers do I see any other stasis topics violating their high-water marks, or even getting close – not any PJSIP or PJSIP/channel topics, nor any others. It is only these two “catch-all” topics, which (I assume) receive stasis messages forwarded from all channels.

The FreePBX servers run as VMs in Microsoft Azure. There is absolutely no “hardware” capacity saturation whatsoever – no CPU, RAM, or storage pressure of any kind.

Planning to have around 100-200 of these trunks eventually with several hundred extensions per server and a proportionally increased call volume. Might bump up the VM sizes by that time of course.

I use the heplify/Homer suite for call tracing and KPIs, and consequently disabled Call Event Logging (CEL). I'm considering disabling CDR too, though neither of these shows – or ever showed – any stasis pressure in its respective task processor queue.

So, I am wondering:

  • Are there particular Asterisk dialplan applications and/or functions that lean heavily on stasis channel message bus topics? Might I have done a “bad thing” by writing easy-to-maintain (read/learn) dialplan scripts that generate too much work inside Asterisk/Stasis? More generally: is there any way to inadvertently overload stasis topic task processor queues through “unwise” dialplan choices, within the constraints mentioned above (no Stasis() apps, and no ARI/AMI API calls whatsoever)?

  • What if I simply disregard the stasis/m:channel:all topic pressure warning? Am I correct in assuming these topics feed CEL/CDR statistics only, and that no essential call processing/routing features will degrade if the warning is left untreated?

  • Assuming I did nothing “wrong” in the dialplan, would increasing the stasis thread pool limits (e.g., raising the pool's maximum thread count from the default 50 to 100 or so) be the correct step to adapt stasis to my workload?

  • Is there a less painful way (than rebuilding) to enable Asterisk developer mode to gain access to stasis statistics – ideally in a temporary, switchable/configurable manner – to drill deeper into the root cause? Especially in a FreePBX Distro environment?
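For reference, the thread pool limits mentioned above live in stasis.conf; a sketch with illustrative values (the stock Asterisk defaults are initial_size=5, idle_timeout_sec=20, max_size=50):

```
; /etc/asterisk/stasis.conf
[threadpool]
initial_size = 5
idle_timeout_sec = 20
max_size = 100    ; raised from the default of 50
```

This only adds consumers, of course; if a single serialized topic is the bottleneck, more threads may not drain its queue any faster.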

Thanks,
Peter

---

In short, it means that the system cannot handle the requests fast enough…

If you don’t really use FreePBX for anything other than creating the trunks, I suggest writing a completely custom dialplan.
There are a lot of parts of the FreePBX dialplan that are unnecessary for your use case, so in theory, skipping all the AGIs that FreePBX invokes will probably speed things up.

With that being said, 500 calls per day should not trigger these overloads with standard hardware resources. So I suggest you actually try to find out what is happening here.
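A quick way to keep an eye on the queues while reproducing the problem (CLI commands as in stock Asterisk 16+; output omitted here since it is system-dependent):

```
# task processors matching "stasis", with queue depth and high-water marks
asterisk -rx 'core show taskprocessors like stasis'

# concurrent channel count, to correlate queue depth with load
asterisk -rx 'core show channels count'
```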

There are a number of threads on this subject, both here and on the Asterisk Community Forums.

Also, see this blog post: