PJSIP stops accepting calls and console flooded with WARNING messages

rymes · December 31, 2019, 2:25pm

We have installed a new system that has been running reasonably well until recently. It is a virtual machine that is connected to a PRI via a Sangoma Vega device. All phones are Cisco 7960Gs running SIP firmware and the phones and the Vega are all connected using PJSIP. Three times in the last week we have had to restart the machine because all calls in and out fail.

When the system experiences the issue, the asterisk console is flooded with warning messages and the Asterisk full log (/var/log/asterisk/full) grows to multiple Gigabytes in a 24-hour period (one was 14 GB!). The console is completely useless when this is occurring, due to the amount of text scrolling by, so you have to run any commands via “asterisk -rx” to be able to read the output, even with verbosity set to 0. The Warnings look like this:

[2019-12-31 07:59:22] WARNING[13592][C-00000013]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:22] WARNING[13592][C-00000013]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:22] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:22] WARNING[10968]: logger: ***: Logging resumed.  1055 messages discarded.
[2019-12-31 07:59:22] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:22] WARNING[10968]: logger: ***: Logging resumed.  1399 messages discarded.
[2019-12-31 07:59:22] WARNING[13609][C-00000014]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13592][C-00000013]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:23] WARNING[10968]: logger: ***: Logging resumed.  550 messages discarded.
[2019-12-31 07:59:23] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:23] WARNING[10968]: logger: ***: Logging resumed.  1476 messages discarded.
[2019-12-31 07:59:23] WARNING[13592][C-00000013]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:23] WARNING[10968]: logger: ***: Logging resumed.  519 messages discarded.
[2019-12-31 07:59:23] WARNING[13609][C-00000014]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13609][C-00000014]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:23] WARNING[10968]: logger: ***: Logging resumed.  663 messages discarded.
[2019-12-31 07:59:23] WARNING[13592]: logger: ***: Log queue threshold (1000) exceeded.  Discarding new messages.
[2019-12-31 07:59:23] WARNING[10968]: logger: ***: Logging resumed.  952 messages discarded.
[2019-12-31 07:59:23] WARNING[13609][C-00000014]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:channel:all-000000ac' task processor queue reached 500 scheduled tasks again.
[2019-12-31 07:59:23] WARNING[13592][C-00000013]: taskprocessor.c:1110 taskprocessor_push: The 'stasis/m:cache_pattern:1/channel:all-000000ab' task processor queue reached 500 scheduled tasks again.

I have found this blog post on the subject, and as a result I have disabled a number of modules we are not using, like Zulu, DUNDi Lookup Registry, iSymphony, Paging Pro, Sangoma Property Management, CRM, etc., but I don’t have any confidence that those changes will result in any progress, as the blog post and the log messages don’t make clear to me what is driving this. System load doesn’t seem to be an issue.

Can anyone point me to any details to what might be causing this? Which portions of FreePBX use Stasis?

rymes · December 31, 2019, 5:55pm

As a side note, I did stumble into this while searching about, but I do not see the relevant files in the filesystem anywhere, and I’m not really certain that they will do any good.

https://wiki.asterisk.org/wiki/display/AST/Performance+Tuning

partgenius · December 31, 2019, 9:55pm

Running FOP2? Or any 3rd party apps?
What do your queues look like? 30 agents in a ring all?
Disabling CEL is a start. You gotta do it from the config file too since the gui doesnt actually disable CEL.
Hows your processor usage look? HTOP screenshot.
Join empty leave empty options? I like strict for both. If transports reload or phones drop lots of people could get stuck in queue otherwise.
Set for no fowarding allowed on queues?
I like to rename the dashboardstats.php file or whatever its called because it use to update stats every minute causing a reload. Still bugs where you cant disable it from running. Every apply will just add the cron job back.

jcolp · December 31, 2019, 9:57pm

Are a ton of calls being placed? Are all your modules up to date? Is your system externally accessible?

rymes · January 2, 2020, 6:33pm

Thanks for the tips, guys. After digging around and looking at things, it seems that we had a call loop in our incoming call logic. An inbound route pointed at a call flow control that pointed to a time condition that pointed back at the call flow control during non-business hours.

Perhaps this will help someone else if they stumble across it?!

system · January 9, 2020, 6:33pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.