Crash help

dwsiemens · September 16, 2020, 6:57pm

I think I ran over the edge: the system basically won’t respond. Lots of reloads (like 5) from Apply Config were happening.

In the logs I have:

[2020-09-16 09:19:57] WARNING[15078][C-00003593] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fc’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[16500][C-0000355a] taskprocessor.c: The ‘stasis/p:channel:all-0001a8d3’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[25666] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fe’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: The ‘stasis/p:channel:all-0001a73a’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: The ‘stasis/p:channel:all-0001a8e6’ task processor queue reached 500 scheduled tasks.
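
To see how widespread warnings like these are across a full log, a small script can tally them per queue. The following is only a rough sketch: the function name `count_queue_warnings` and the idea of grouping queues by stripping the trailing hex id are assumptions made for illustration, and the regular expression is written against the message format quoted above (it accepts both straight and curly quotes).

```python
import re
from collections import Counter

# Matches taskprocessor high-water warnings in the format quoted above.
# Accepts straight or curly quotes around the queue name.
PATTERN = re.compile(
    r"taskprocessor\.c: The [‘'](?P<queue>[^’']+)[’'] task processor queue "
    r"reached (?P<limit>\d+) scheduled tasks"
)

def count_queue_warnings(lines):
    """Count warnings per queue family (queue name minus the trailing -<hex id>)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            # Hypothetical grouping: drop the per-instance hex suffix.
            family = re.sub(r"-[0-9a-f]+$", "", m.group("queue"))
            counts[family] += 1
    return counts

sample = [
    "[2020-09-16 09:19:57] WARNING[15078][C-00003593] taskprocessor.c: "
    "The 'stasis/p:channel:all-0001a8fc' task processor queue reached 500 scheduled tasks.",
    "[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: "
    "The ‘stasis/p:channel:all-0001a73a’ task processor queue reached 500 scheduled tasks.",
]
print(count_queue_warnings(sample))
```

Feeding it the lines of the full log file instead of `sample` would show whether only the `stasis/p:channel:all` queues are backing up or whether other task processors are affected too.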

So what should I tune is the next question. Load average was about 13-14.
Sar info
Time         CPU   %user  %nice  %system  %iowait  %steal  %idle
08:30:01 AM  all    8.92   0.01     5.38     5.79    0.00  79.90
08:40:01 AM  all    9.99   0.01     5.32     6.01    0.00  78.68
08:50:01 AM  all   10.01   0.01     5.26     6.46    0.00  78.25
09:00:01 AM  all   10.80   0.01     5.63     5.92    0.00  77.64
09:10:01 AM  all   12.41   0.01     6.33     5.75    0.00  75.50
09:20:01 AM  all   14.13   0.01     6.64     7.70    0.00  71.52
09:30:01 AM  all    5.60   0.00     1.57     0.01    0.00  92.81
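
For anyone wanting to work with these numbers, here is a minimal sketch that parses rows like the ones above and reports the non-idle share and I/O wait per interval. It assumes the six numeric columns follow the default `sar -u` order (%user, %nice, %system, %iowait, %steal, %idle), which the rows are consistent with since each sums to 100; the helper name `parse_sar_row` is made up for illustration.

```python
def parse_sar_row(row):
    """Split one sar CPU row into a timestamp and its percentages.

    Assumes the column order %user %nice %system %iowait %steal %idle.
    """
    parts = row.split()
    # parts: ['09:20:01', 'AM', 'all', user, nice, system, iowait, steal, idle]
    time = " ".join(parts[:2])
    user, nice, system, iowait, steal, idle = map(float, parts[3:9])
    return {
        "time": time,
        "user": user,
        "system": system,
        "iowait": iowait,
        "idle": idle,
    }

rows = [
    "09:10:01 AM all 12.41 0.01 6.33 5.75 0.00 75.50",
    "09:20:01 AM all 14.13 0.01 6.64 7.70 0.00 71.52",
    "09:30:01 AM all 5.60 0.00 1.57 0.01 0.00 92.81",
]

for r in map(parse_sar_row, rows):
    busy = round(100.0 - r["idle"], 2)  # everything that is not idle
    print(r["time"], "busy %:", busy, "iowait %:", r["iowait"])
```

Under that column assumption, the interval ending 09:20 shows the CPU still about 71% idle, with I/O wait at its highest value in the sample.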

cynjut · September 17, 2020, 2:17pm

The last time I saw this, it was this:

dwsiemens · September 18, 2020, 12:15pm

I’ve changed all the peers to be solicited instead of auto in my config.

What else?

I have also upgraded my instance size from a mx5.xl to a mx5.2xl to raise the network bandwidth. AWS has some pretty low unpublished limits on network adapters. They have them down at around 160 MB/s on a mx5.xl, and I am up in that territory with all the network file system usage.

cynjut · September 18, 2020, 1:59pm

Adding more memory for this (IIRC) doesn’t change the performance of this part of the system. The problem is just the number of operations that the system is doing, and there just aren’t enough electrons to make this happen.

Let us know if this doesn’t solve the problem. I’m sure one of us can search through the Forum archive for you and make more suggestions.

dwsiemens · September 18, 2020, 2:30pm

It wasn’t to fix the server memory or CPU side. It’s because AWS has some arbitrary limits on network throughput. The data on a mx5.xl is that it only supports about 1.2 Gbit/s, or around 155 MB/s, which the performance data shows I was hitting. So in AWS the only option is to buy more of everything to get more network.
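
As a quick sanity check on those two figures, the conversion between link speed in gigabits per second and throughput in megabytes per second is just a factor of 8 bits per byte. The sketch below assumes decimal units (1 Gbit = 1000 Mbit); the function names are made up for illustration.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert Gbit/s to MB/s, assuming decimal units and 8 bits per byte."""
    return gbit_per_s * 1000 / 8

def mbyte_to_gbit_per_s(mbyte_per_s):
    """Convert MB/s back to Gbit/s under the same assumptions."""
    return mbyte_per_s * 8 / 1000

print(round(gbit_to_mbyte_per_s(1.2), 1))   # MB/s equivalent of 1.2 Gbit/s
print(round(mbyte_to_gbit_per_s(155), 2))   # Gbit/s equivalent of 155 MB/s
```

So 1.2 Gbit/s works out to roughly 150 MB/s, which is in the same range as the 155 to 160 MB/s observed in the thread.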

nielsen · September 18, 2020, 3:28pm

I’m curious as to how many extensions you have to hit this kind of problem if you don’t mind giving out that info…

dwsiemens · September 18, 2020, 3:43pm

Just under 500 extensions.

The problem is that the stasis queue backed up, and then I had all kinds of issues with high memory usage, calls not coming in, and users not able to make calls. Then the users think the sky fell, everyone is off to have a root cause analysis, and I’m off to the electric chair of IT.
