Crash help


(Dan S) #1

I think I ran over the edge: the system basically won’t respond. Lots of reloads (about 5, from Apply Config) were happening.

In the logs I have:

[2020-09-16 09:19:57] WARNING[15078][C-00003593] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fc’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[16500][C-0000355a] taskprocessor.c: The ‘stasis/p:channel:all-0001a8d3’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[25666] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fe’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: The ‘stasis/p:channel:all-0001a73a’ task processor queue reached 500 scheduled tasks.

[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: The ‘stasis/p:channel:all-0001a8e6’ task processor queue reached 500 scheduled tasks.
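
For anyone who wants to tally these warnings across a full log, here is a minimal sketch. It assumes the line format is exactly what is pasted above; the function name and the idea of grouping by the queue name with its hex suffix stripped are illustrative choices, not part of any Asterisk tooling.

```python
import re
from collections import Counter

# Pattern built from the log lines quoted in this post. It accepts straight
# or curly quotes, since forum software often rewrites them. The call ID
# ("[C-...]") is optional because only some of the lines carry one.
WARNING_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+WARNING\[\d+\](?:\[(?P<call>C-[0-9a-fA-F]+)\])?\s+"
    r"taskprocessor\.c:\s+The\s+['‘’](?P<name>[^'‘’]+)['‘’]\s+"
    r"task processor queue reached\s+(?P<level>\d+)\s+scheduled tasks"
)


def count_taskprocessor_warnings(lines):
    """Count warnings per task processor family (hex suffix removed)."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            # "stasis/p:channel:all-0001a8fc" -> "stasis/p:channel:all"
            family = re.sub(r"-[0-9a-fA-F]+$", "", match.group("name"))
            counts[family] += 1
    return counts


# Three of the lines from the post, used as sample input.
SAMPLE = [
    "[2020-09-16 09:19:57] WARNING[15078][C-00003593] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fc’ task processor queue reached 500 scheduled tasks.",
    "[2020-09-16 09:19:57] WARNING[25666] taskprocessor.c: The ‘stasis/p:channel:all-0001a8fe’ task processor queue reached 500 scheduled tasks.",
    "[2020-09-16 09:19:57] WARNING[9091] taskprocessor.c: The ‘stasis/p:channel:all-0001a73a’ task processor queue reached 500 scheduled tasks.",
]

if __name__ == "__main__":
    for family, n in count_taskprocessor_warnings(SAMPLE).most_common():
        print(family, n)
```

Grouping by family makes it easier to see whether one kind of queue is backing up or many different ones.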

So the next question is what I should tune. Load average was about 13-14.
sar info:
Time         CPU  %user  %nice  %system  %iowait  %steal  %idle
08:30:01 AM  all   8.92   0.01     5.38     5.79    0.00  79.90
08:40:01 AM  all   9.99   0.01     5.32     6.01    0.00  78.68
08:50:01 AM  all  10.01   0.01     5.26     6.46    0.00  78.25
09:00:01 AM  all  10.80   0.01     5.63     5.92    0.00  77.64
09:10:01 AM  all  12.41   0.01     6.33     5.75    0.00  75.50
09:20:01 AM  all  14.13   0.01     6.64     7.70    0.00  71.52
09:30:01 AM  all   5.60   0.00     1.57     0.01    0.00  92.81
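
A quick way to read the sar rows above is to parse them and look at the busiest interval. This is only a sketch: it assumes the six numeric columns are the usual `sar -u` order (%user, %nice, %system, %iowait, %steal, %idle), which fits the fact that each row sums to roughly 100. The class and function names are made up for illustration.

```python
from typing import NamedTuple


class CpuSample(NamedTuple):
    time: str
    user: float
    nice: float
    system: float
    iowait: float
    steal: float
    idle: float


def parse_sar_cpu(text):
    """Parse 'HH:MM:SS AM all u n s io st idle' rows into CpuSample tuples."""
    samples = []
    for line in text.strip().splitlines():
        parts = line.split()
        # Assumed layout: time, AM/PM, "all", then six percentages.
        if len(parts) != 9 or parts[2] != "all":
            continue
        samples.append(CpuSample(f"{parts[0]} {parts[1]}", *map(float, parts[3:])))
    return samples


SAR_TEXT = """
08:30:01 AM all 8.92 0.01 5.38 5.79 0.00 79.90
08:40:01 AM all 9.99 0.01 5.32 6.01 0.00 78.68
08:50:01 AM all 10.01 0.01 5.26 6.46 0.00 78.25
09:00:01 AM all 10.80 0.01 5.63 5.92 0.00 77.64
09:10:01 AM all 12.41 0.01 6.33 5.75 0.00 75.50
09:20:01 AM all 14.13 0.01 6.64 7.70 0.00 71.52
09:30:01 AM all 5.60 0.00 1.57 0.01 0.00 92.81
"""

if __name__ == "__main__":
    samples = parse_sar_cpu(SAR_TEXT)
    busiest = min(samples, key=lambda s: s.idle)
    print(f"busiest interval: {busiest.time}, busy {100 - busiest.idle:.2f}%, "
          f"iowait {busiest.iowait:.2f}%")
```

Under that column assumption, the CPU was still more than 70% idle in the interval ending 09:20:01 AM, which is the one that covers the 09:19:57 warnings.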


(Dave Burgess) #2

The last time I saw this, it was this:


(Dan S) #3

I’ve changed all the peers to be solicited instead of auto in my config.

What else?

I have also upgraded my instance size from a mx5.xl to a mx5.2xl to raise the network bandwidth. AWS has some pretty low unpublished limits on network adapters. They cap them at around 160 MB/s on a mx5.xl, and I am up in that territory with all the network file system usage.


(Dave Burgess) #4

Adding more memory for this (IIRC) doesn’t change the performance of this part of the system. The problem is just the number of operations that the system is doing, and there just aren’t enough electrons to make this happen.

Let us know if this doesn’t solve the problem. I’m sure one of us can search through the Forum archive for you and make more suggestions.


(Dan S) #5

It wasn’t to fix the server memory or CPU side. It’s because AWS has some arbitrary limits on network throughput. The data on a mx5.xl is that it only supports about 1.2 Gbit/s, or around 155 MB/s, which the performance data shows I was hitting. So in AWS the only option is to buy more of everything to get more network.
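
The unit conversion behind those figures can be checked with a couple of lines. This is a sketch only; the 1.2 Gbit/s figure is the one quoted in the post, and the function names are illustrative.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Gigabits/s to megabytes/s in decimal units (1 Gbit = 1000 Mbit, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


def gbit_to_mibyte_per_s(gbit_per_s):
    """Gigabits/s to mebibytes/s (1 MiB = 2**20 bytes), as some monitoring tools report."""
    return gbit_per_s * 1_000_000_000 / 8 / 2**20


if __name__ == "__main__":
    limit = 1.2  # Gbit/s, the figure quoted in the post
    print(f"{limit} Gbit/s is about {gbit_to_mbyte_per_s(limit):.0f} MB/s "
          f"or {gbit_to_mibyte_per_s(limit):.0f} MiB/s")
```

So 1.2 Gbit/s works out to roughly 150 MB/s, the same ballpark as the 155 to 160 MB/s mentioned above.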


#6

I’m curious as to how many extensions you have to hit this kind of problem if you don’t mind giving out that info…


(Dan S) #7

Just under 500 extensions.

The problem is that the stasis queue backed up, and then I had all kinds of issues with high memory usage, calls not coming in, and users not able to make calls. Then the users think the sky fell, then it’s off to a root cause analysis and I’m off to the electric chair of IT.
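
To illustrate what a backed-up queue means here, the sketch below is a toy model, not Asterisk code: messages arrive at one rate and are processed at another, and whenever the arrival rate is higher the backlog grows until it crosses a threshold. The 500 threshold is taken from the warnings in the first post; the rates are hypothetical numbers chosen only for the example.

```python
def seconds_until_backlog(arrival_per_s, service_per_s, threshold=500):
    """Whole seconds until queue depth reaches threshold, or None if the consumer keeps up.

    Toy model: constant rates, queue starts empty, depth updated once per second.
    """
    if arrival_per_s <= service_per_s:
        return None  # the queue never grows in this simplified model
    depth = 0
    seconds = 0
    while depth < threshold:
        depth += arrival_per_s - service_per_s
        seconds += 1
    return seconds


if __name__ == "__main__":
    # Hypothetical rates: 1200 messages/s arriving, 1000 messages/s processed.
    print(seconds_until_backlog(1200, 1000))
    # A consumer that keeps up never reaches the threshold.
    print(seconds_until_backlog(900, 1000))
```

The point of the model is that even a modest, sustained gap between arrival and processing rates reaches a 500-task backlog within seconds, and the queued messages hold memory the whole time.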