CPU spikes on Asterisk

Hi all,

I am migrating an old Elastix system to the new FreePBX Distro (latest, 10.13.66), and I am seeing strange CPU spikes from time to time even though the system is under no load. The spikes sometimes exceed 100% CPU (more than one fully utilized core).
Could you please point me in the right direction to figure out why this is happening? I tried ps -LlFm and pstack with no luck so far. I attach the top output.
The system is currently configured with about 250 PJSIP endpoints (about 150 online).
Thank you.

top - 12:46:03 up  2:51,  2 users,  load average: 0.10, 0.14, 0.13
Tasks: 175 total,   1 running, 174 sleeping,   0 stopped,   0 zombie
Cpu0  : 32.7%us,  4.3%sy,  0.0%ni, 63.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  9.0%us,  3.3%sy,  0.0%ni, 87.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 17.9%us,  1.6%sy,  0.0%ni, 74.3%id,  0.0%wa,  0.0%hi,  0.0%si,  6.3%st
Cpu3  : 58.0%us,  2.3%sy,  0.0%ni, 39.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3922216k total,  2187576k used,  1734640k free,    93200k buffers
Swap:   786428k total,        0k used,   786428k free,   534736k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2061 asterisk  20   0 3021m  88m  16m S 99.0  2.3  17:45.63 asterisk
20974 asterisk  20   0  331m  26m 9056 S  4.3  0.7   0:00.13 php
20950 root      20   0  110m 2444 1252 S  2.7  0.1   0:00.82 htop
 1492 mysql     20   0  947m  40m 6284 S  1.7  1.1   0:36.99 mysqld
 3114 mongodb   20   0  327m  43m  20m S  0.7  1.1   0:33.33 mongod
    4 root      20   0     0    0    0 S  0.3  0.0   0:00.16 ksoftirqd/0
  860 root      20   0     0    0    0 S  0.3  0.0   0:00.39 kauditd
 1746 asterisk  20   0 4607m 484m  18m S  0.3 12.6   2:59.63 java
 2140 root      20   0  323m  16m 9100 S  0.3  0.4   0:06.46 php
20951 root      20   0 15032 1308  948 R  0.3  0.0   0:00.08 top
    1 root      20   0 19360 1500 1196 S  0.0  0.0   0:00.81 init

What is the CPU of the machine? Run tcpdump under nohup, and the next time you see high usage, check the pcap file to see which SIP messages Asterisk is exchanging. Just because no phones are ringing doesn't mean that nothing is happening at the core of Asterisk.
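A rough sketch of that capture, assuming SIP runs on the default port 5060 and using /tmp/sip.pcap as a purely hypothetical file path. The function names are made up for illustration; the second one simply counts SIP request lines in an ASCII dump, so you can see at a glance which methods dominate around a spike.

```shell
#!/bin/sh
# Assumptions: SIP on the default port 5060; /tmp/sip.pcap is a
# hypothetical capture path. Adjust both to your system.

# Start a background capture that survives logout (needs root).
start_sip_capture() {
    nohup tcpdump -i any -n -s 0 -w "${1:-/tmp/sip.pcap}" port 5060 \
        >/dev/null 2>&1 &
}

# Count SIP request methods found in an ASCII packet dump read from stdin.
# The request line may be preceded by header bytes on the same line,
# so it is matched anywhere in the line, not only at the start.
count_sip_methods() {
    grep -oE '(REGISTER|OPTIONS|INVITE|SUBSCRIBE|NOTIFY|ACK|BYE|CANCEL) sips?:[^ ]+ SIP/2\.0' \
        | awk '{ n[$1]++ } END { for (m in n) print n[m], m }' \
        | sort -rn
}
```

After a spike, something like `tcpdump -n -A -r /tmp/sip.pcap | count_sip_methods` gives a per-method count, busiest method first.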

Thank you. I have done that as well, but found only ordinary SIP traffic: OPTIONS and REGISTER messages. I managed to combine ps and pstack to get output that leads somewhere. All the threads with significant load show something like this:

#0  0x00007f140e3dc6d0 in memcpy () from /lib64/libc.so.6
#1  0x00007f14103d5eff in ?? () from /usr/lib64/libsqlite3.so.0
#2  0x00007f14103c0560 in sqlite3_step () from /usr/lib64/libsqlite3.so.0
#3  0x00000000004f98f5 in ast_db_gettree ()
#4  0x00007f13c5d5cfde in ?? () from /usr/lib64/asterisk/modules/res_sorcery_astdb.so
#5  0x00000000005ca2c1 in ast_sorcery_retrieve_by_regex ()
#6  0x00007f138eb57f6b in ast_sip_location_retrieve_aor_contacts_nolock () from /usr/lib64/asterisk/modules/res_pjsip.so
#7  0x00007f138eb5a587 in ast_sip_location_retrieve_aor_contacts () from /usr/lib64/asterisk/modules/res_pjsip.so
#8  0x00007f138eb5a827 in ast_sip_for_each_contact () from /usr/lib64/asterisk/modules/res_pjsip.so
#9  0x000000000045eb0c in ?? ()
#10 0x000000000045ee3f in __ao2_callback ()
#11 0x00007f138eb591c1 in ?? () from /usr/lib64/asterisk/modules/res_pjsip.so
#12 0x00007f138eb5c1e4 in ast_sip_cli_traverse_objects () from /usr/lib64/asterisk/modules/res_pjsip.so
#13 0x00000000004da4cc in ast_cli_command_full ()
#14 0x00000000004da630 in ast_cli_command_multiple_full ()
#15 0x0000000000453fce in ?? ()
#16 0x0000000000600bb4 in ?? ()
#17 0x00007f140f0b3aa1 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f140e43b93d in clone () from /lib64/libc.so.6

I guess it searches the DB for CONTACTS? Is that really so expensive? Sorry, I am a newbie at reading this stuff.
Any ideas?

Thank you
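For anyone wanting to repeat the ps + pstack combination described above, here is a minimal sketch. The function names and the 5% threshold are arbitrary choices for illustration. It assumes pstack prints a "Thread N (... (LWP <tid>)):" header per thread, as the gdb-based pstack does. Note that the %CPU reported by ps is an average over the thread's lifetime, so top -H may be a better way to pick out threads that are busy right now.

```shell
#!/bin/sh
# Print thread IDs whose CPU share is at or above a threshold (default 5).
# Input: lines of "TID %CPU", e.g. from: ps -L -o tid= -o pcpu= -p PID
busy_threads() {
    awk -v min="${1:-5}" '$2 + 0 >= min + 0 { print $1 }'
}

# Keep only the pstack sections whose "(LWP <tid>)" header matches one of
# the space-separated thread IDs in $1. Reads pstack output from stdin.
filter_stacks() {
    awk -v tids="$1" '
        BEGIN { n = split(tids, t, " "); for (i = 1; i <= n; i++) want[t[i]] = 1 }
        /^Thread / {
            keep = 0
            if (match($0, /LWP [0-9]+/))
                keep = (substr($0, RSTART + 4, RLENGTH - 4) in want)
        }
        keep { print }
    '
}

# Dump the stacks of the busy threads of process $1 (threshold in $2).
dump_busy_stacks() {
    tids=$(ps -L -o tid= -o pcpu= -p "$1" | busy_threads "${2:-5}" | tr '\n' ' ')
    pstack "$1" | filter_stacks "$tids"
}
```

With the asterisk PID from the top output above, the call would be `dump_busy_stacks 2061`, run as root or as the asterisk user.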

Hello Loki,

I know what causes this CPU spike and how to resolve it.
I have experienced the same on all sorts of FreePBX Distro versions.
Look at the time in your image, upper left: a few seconds past the minute.
This is caused by a group of cron jobs that execute every minute.

You can try the following:

  • Remove (not Disable) the modules: PagingPro, QueuePro Qxact Reports (all commercial modules).
    Disabling PagingPro and QueueXactReports doesn't fix the CPU spikes! Remove them instead.
  • If you have queues, look at all queues and set [Reset Queue stats] to NO
  • Disable Zulu server at the bottom of AdvancedSettings
  • Set PNP server to no in the Sysadmin module
  • Check the cron jobs for user asterisk with

crontab -e -u asterisk

Check every entry scheduled as * * * * * (which executes every minute).
Now reboot the system and check again with the top command.

You will see that the CPU spikes are gone.
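To list the every-minute entries without opening the editor, a small filter like the following may help. It is only a sketch: the function name is made up, and it matches only the literal * * * * * schedule mentioned above.

```shell
#!/bin/sh
# Print crontab lines whose five schedule fields are all "*",
# i.e. jobs that run every minute. Commented-out lines are skipped
# because they start with "#". Reads a crontab from stdin.
every_minute_jobs() {
    grep -E '^[[:space:]]*\*([[:space:]]+\*){4}[[:space:]]'
}
```

Run as root, `crontab -l -u asterisk | every_minute_jobs` prints just the candidates worth checking.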

Some cron jobs execute at exactly the same time (on the whole minute).
See my screenshot.

Let me know if my solution resolves your CPU spikes.


Hello,

Thank you very much for your suggestions. I posted the question on the Asterisk forum as well, and the guys there pointed me toward finding out why the CONTACT information for the endpoints is being fetched.

I then realized that I have Nagios fetching response-time information for the configured trunks. I disabled this monitoring and the spikes are now much less common. So, basically, it seems the code that accesses the contact information is not optimized, and the CPU time goes to querying the SQLite DB for that info. Nevertheless, I will put your suggestions into action this evening and report the results after some time.

Once again, thank you very much for your suggestions.


Hi,

I tried 4allbusiness's suggestions, but I noticed no significant change in the behavior.

Moreover, when I added 200 more extensions (now almost 500 in total), the spikes got worse: I am now getting 200+% CPU load on asterisk every 40-50 seconds. The system works fine so far, but I worry about scalability.

Any suggestions on how to optimize Asterisk's behavior?

Thank you