High CPU Load after upgrade to FreePBX 14 / Asterisk 13

After I upgraded the server to the above versions using the distro-upgrade command, the server load increased hugely. I didn’t change anything else on the server or setup at the same time.
The overload is causing the hold music and IVR menus to play slowly and break up, and it is affecting call quality too.

The system is now running as a VM on an ESXi host with 8 cores and 8GB RAM allocated to the VM.

Here is the output of top when there were around 7 live calls on the server (there was only 4GB RAM when I ran this).

Tasks: 169 total,   2 running, 167 sleeping,   0 stopped,   0 zombie
%Cpu0  : 25.2 us, 31.9 sy,  0.0 ni, 40.6 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0 st
%Cpu1  : 26.2 us, 41.9 sy,  0.0 ni, 31.2 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu2  : 28.4 us, 36.1 sy,  0.0 ni, 35.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu3  : 24.7 us, 40.3 sy,  0.0 ni, 34.7 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu4  : 31.4 us, 41.9 sy,  0.0 ni, 26.4 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu5  : 27.8 us, 37.8 sy,  0.0 ni, 34.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu6  : 23.4 us, 43.8 sy,  0.0 ni, 32.4 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu7  : 25.7 us, 43.7 sy,  0.0 ni, 30.0 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
KiB Mem :  3706348 total,   389940 free,  2120004 used,  1196404 buff/cache
KiB Swap:   786428 total,   581372 free,   205056 used.  1144172 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16267 asterisk  20   0 5294480 216468  13252 S 316.9  5.8 201:13.77 asterisk
15545 root      20   0   78128   4348   3356 S  59.3  0.1  27:00.14 asterisk
 1176 asterisk  20   0 6846884 953524   8332 S  38.4 25.7 159:12.90 java
25731 asterisk  20   0 1096072  68460   6464 S  12.9  1.8  14:04.72 node /var/www/h
30712 asterisk  20   0  419296  44304  10864 S   9.9  1.2   6:16.29 php
16633 asterisk  20   0  407644  33192  10840 S   9.3  0.9  12:30.80 php
31576 root      20   0  148128   5284   3964 S   5.3  0.1   0:43.35 sshd
13567 root      20   0   78128   4348   3356 S   4.0  0.1   0:24.08 asterisk
30690 asterisk  20   0  305220   9000   6192 R   3.3  0.2   0:00.10 php
29685 root      20   0       0      0      0 S   1.0  0.0   0:00.41 kworker/7:0
29879 root      20   0       0      0      0 S   1.0  0.0   0:00.39 kworker/6:0
    9 root      20   0       0      0      0 S   0.7  0.0   2:31.83 rcu_sched
 1026 mongodb   20   0  581136   9880   2752 S   0.7  0.3  17:26.01 mongod
22400 root      20   0       0      0      0 S   0.7  0.0   0:15.96 kworker/3:0
30588 root      20   0       0      0      0 S   0.7  0.0   0:00.04 kworker/0:1
 1050 mysql     20   0 2091148 377468   5764 S   0.3 10.2 133:49.29 mysqld
 4324 root      20   0       0      0      0 S   0.3  0.0   0:09.20 kworker/4:2
 6491 root      20   0       0      0      0 S   0.3  0.0   0:10.71 kworker/2:2
16741 asterisk  20   0 1145628 109432   6528 S   0.3  3.0   2:06.88 letschat
20027 root      20   0  160024   2360   1528 R   0.3  0.1   0:53.34 top
24375 root      20   0       0      0      0 S   0.3  0.0   0:02.74 kworker/1:2
32625 root      20   0       0      0      0 S   0.3  0.0   0:12.80 kworker/5:1
    1 root      20   0  191132   3040   1888 S   0.0  0.1   0:18.54 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.29 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:09.87 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    7 root      rt   0       0      0      0 S   0.0  0.0   0:01.20 migration/0
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
   10 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 lru-add-drain
   11 root      rt   0       0      0      0 S   0.0  0.0   0:01.52 watchdog/0
   12 root      rt   0       0      0      0 S   0.0  0.0   0:01.33 watchdog/1
   13 root      rt   0       0      0      0 S   0.0  0.0   0:01.20 migration/1
   14 root      20   0       0      0      0 S   0.0  0.0   0:00.73 ksoftirqd/1
   16 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/1:0H
   17 root      rt   0       0      0      0 S   0.0  0.0   0:01.31 watchdog/2
   18 root      rt   0       0      0      0 S   0.0  0.0   0:01.17 migration/2
   19 root      20   0       0      0      0 S   0.0  0.0   0:00.92 ksoftirqd/2

The process causing the problem is always the asterisk one.
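One thing that can be read straight off the top header above is how the load splits between user time (us) and system time (sy). As a rough sketch only, assuming the per-CPU lines are pasted in as text (the function name `average_cpu_fields` is made up for illustration), the fields can be averaged like this:

```python
import re

# Per-CPU lines copied verbatim from the top output above.
TOP_CPU_LINES = """\
%Cpu0  : 25.2 us, 31.9 sy,  0.0 ni, 40.6 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0 st
%Cpu1  : 26.2 us, 41.9 sy,  0.0 ni, 31.2 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu2  : 28.4 us, 36.1 sy,  0.0 ni, 35.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu3  : 24.7 us, 40.3 sy,  0.0 ni, 34.7 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu4  : 31.4 us, 41.9 sy,  0.0 ni, 26.4 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu5  : 27.8 us, 37.8 sy,  0.0 ni, 34.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu6  : 23.4 us, 43.8 sy,  0.0 ni, 32.4 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu7  : 25.7 us, 43.7 sy,  0.0 ni, 30.0 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
"""

# Matches pairs such as "25.2 us" or "31.9 sy".
FIELD_RE = re.compile(r"(\d+\.\d+)\s+(us|sy|ni|id|wa|hi|si|st)")


def average_cpu_fields(text):
    """Average each top CPU field (us, sy, id, ...) over all %Cpu lines."""
    totals = {}
    count = 0
    for line in text.splitlines():
        if not line.startswith("%Cpu"):
            continue
        count += 1
        for value, name in FIELD_RE.findall(line):
            totals[name] = totals.get(name, 0.0) + float(value)
    if count == 0:
        return {}
    return {name: total / count for name, total in totals.items()}


averages = average_cpu_fields(TOP_CPU_LINES)
for name in ("us", "sy", "id"):
    print(f"{name}: {averages[name]:.1f}%")
```

On the figures above this comes out at roughly 27% user, 40% system and 33% idle. System time being higher than user time suggests a lot of the work is happening in the kernel rather than in Asterisk's own code, although that alone does not say what is causing it.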

I disabled remote CDR logging, call recording and some other things and it didn’t make a difference to the load.
Even ringing from one extension to another when there is no other server activity will push CPU load to 50%, as shown in top.

I have looked at other threads regarding this and although a few other people have had this problem, no one seems to have found the cause. Please could someone help as I am not sure how to work out what the problem is.

We noticed it as well. I’ve found no solution to it. I have tons of posts about it all over the forums.

Are you running PJSIP extensions or chan_sip?

We are mainly using chan_sip but I have enabled PJSIP and set up a couple of extensions on it.

When I disable PJSIP it doesn’t affect the server load.

Also, when I enable PJSIP and phone one of those extensions it still spikes the server load. For this reason I didn’t think it was worthwhile changing everything over to PJSIP.

Does this problem still exist if you install and set up a FreePBX server from scratch?

I was wondering if this is upgrade related or hardware related.

I see my choices at the moment as

a) trying to get to the bottom of the problem
b) move the server onto a different host to see if that makes a difference
c) setting up a new server from scratch.

I really want to get to the bottom of this, as I see it as a problem that could affect lots of FreePBX users over the next couple of years and could even be the result of an underlying problem in Asterisk or FreePBX.

We are not going back to setting it up from scratch so we have not tested.

We had FreePBX 13 running Asterisk 13, then upgraded to FreePBX 14 with Asterisk 13 and noticed a huge jump in CPU usage from the main asterisk thread. UCP history on 14 also uses much more processing, but that’s a mysql thread.

We have moved hosts and hardware with no change. Throwing hardware at it delays the issue till you get to around 700 phones and a lot of call flow. We are running a dedicated server too so no noisy neighbors.

Yup, it’s going to become more and more prevalent. I hope it does so it gets fixed. We spent a ton of time trying to narrow this down, but we think it’s something core in Asterisk or something being fed into Asterisk by FreePBX. The FreePBX developers are having a hard time replicating it.

I can consistently spike the CPU by having a queue with 12 people set to ring all at once, with each of those phones having 20+ BLFs watching one another.
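To put a rough number on why that setup might be heavy, here is a back-of-the-envelope sketch, not a measurement. It assumes each of the 12 ringing phones is watched by the other 11, that every device-state change sends one NOTIFY per watcher, and that a ring produces two state changes (ringing, then back to idle or in use). The function name `notify_burst` is made up for illustration.

```python
def notify_burst(ringing_phones, watchers_per_phone, state_changes=2):
    """Estimate NOTIFY messages sent for one ring-all attempt.

    Assumption: every state change on a watched phone produces one NOTIFY
    per watcher. Real traffic depends on how the BLFs are actually set up.
    """
    return ringing_phones * watchers_per_phone * state_changes


# 12 queue members, each watched by the other 11, two state changes per ring.
print(notify_burst(12, 11))  # 264
```

Under those assumptions a single ring-all attempt sends a few hundred NOTIFY messages in a short burst, on top of the call signalling itself.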

Let’s also be clear on something here: you’re claiming that this has been narrowed down to some Asterisk/FreePBX issue, but you have been asked in other threads to provide some sort of proof, logs, debugging, etc. and you said you didn’t have any of that. You had issues and had to “fix it immediately”, so there was no time to do all that troubleshooting. It wasn’t just people like me asking for these details; Sangoma/Digium people have asked too and nothing has been provided.

So where do they need to look and what do they need to fix exactly?

Greetings. If it’s not too late I will add some info to this topic. I also have a FreePBX system running in a VM and I have noticed several similarities to Mark’s issues with high CPU and call quality. I am running the FreePBX 14 distro on a fresh install with 3 queues, iSymphony used extensively, 3 parking lots on Parking Pro and 47 chan_sip extensions (Yealink T46G phones), and we run in ‘User & Device Mode’. We also record all calls, however not until they hit an extension. The high CPU shown in the pictures below happens every time the phones ring, not just when the call initially comes in or when someone picks up. In other words, if the phone rings twice I get two spikes. I have thrown 10 cores and 16 gigs of RAM at this machine in an attempt to keep it functional for more than a week, and it seems to suffer the same RAM / swap issues over time that many others have complained about in the forums (I think this is a kernel issue?)… I rebooted last night at midnight and was using 4% swap after 9 days uptime; as you can see it’s now back down to 0% swap.

[Screenshot: htop, high CPU during call]

[Screenshot: htop, normal CPU]

[Screenshot: top, high CPU during call]

[Screenshot: top, normal CPU]

[Screenshot: memory and swap, FreePBX dashboard]

I understand these are not diagnostic logs and more information may be needed. However, I do not know enough about this software to diagnose it, so if there is some other information I can provide please let me know. This is a production PBX, so taking it down can only be done after hours.

@Dellsmash

Looking over your screenshots, a lot of your CPU and RAM is going into iSymphony. I don’t think anyone else in this thread is running iSymphony. I suggest turning off and shutting down iSymphony for a day or a week.

You’ll notice that the ‘java’ process (this is iSymphony) is using 10.6GB of your RAM.

The best advice I can give to anyone is to turn that off and after 5 days reboot Asterisk, and you will see improvements. If you don’t, report back here.

Does anyone have any instructions for trying to identify what the cause of the problem is?
I read on another thread that the E1000 virtual network card type could be a bottleneck, so I changed over to VMXNET3 and that might have helped slightly. How would I tell whether there is a hardware bottleneck?
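One possible starting point, offered as a sketch rather than an official procedure: find out which threads inside the asterisk process are burning the CPU. Interactively, `top -H -p <pid>` shows per-thread usage. The same counters are available on Linux under /proc/<pid>/task/<tid>/stat, where field 14 is user time and field 15 is system time, both in clock ticks. The function names and the sample line below are made up for illustration.

```python
import os


def parse_stat_line(line):
    """Return (thread_name, utime_ticks, stime_ticks) from a /proc stat line."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    name = line[open_paren + 1:close_paren]
    rest = line[close_paren + 1:].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return name, int(rest[11]), int(rest[12])


def thread_cpu_ticks(pid):
    """Map thread id -> (name, utime, stime) for every thread of pid (Linux only)."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as handle:
                result[int(tid)] = parse_stat_line(handle.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited between listdir() and open()
    return result


def busiest_threads(pid, limit=10):
    """Return the threads of pid with the most accumulated CPU ticks."""
    ranked = sorted(thread_cpu_ticks(pid).items(),
                    key=lambda item: item[1][1] + item[1][2],
                    reverse=True)
    return ranked[:limit]


# Made-up sample line (not from any real system) to show the parsing.
SAMPLE = "1234 (example thread) S 1 1234 1234 0 -1 4202752 100 0 0 0 7200 4800 0 0 20 0 80 0 12345"
print(parse_stat_line(SAMPLE))  # ('example thread', 7200, 4800)
```

Calling `busiest_threads(pid)` with the PID of the main asterisk process twice, a few seconds apart during a test call, and comparing the tick counts shows which threads are growing fastest and whether the growth is mostly user or system time. That will not identify the cause on its own, but it is the kind of detail that helps when asking the developers to look at it.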

Although we are using some of the swap at the moment, we have 4GB of memory free, and I don’t think there is a memory leak or increasing server load over time, unlike what some people have reported.

This is pretty frustrating as we have used Asterisk over the years on machines with nowhere near the resources of the current server with no issues at all. Asterisk has always been very efficient. The problems reported all seem to be slightly different, but I was wondering if people have always had problems like this or whether it is a new Asterisk 13 phenomenon.

It’s not. Don’t confuse the system user “asterisk” with Asterisk itself. The system user “asterisk” is the main user for all the things that run on the system. Apache, etc. are all owned and managed by the user “asterisk”.

Java is not an Asterisk thing. It’s eating up a lot of your memory. That’s a Java thing. Uninstall any and all FreePBX modules that you have no plans on using. Not going to use Zulu? Turn it off, and so on.

I realise the asterisk user can run multiple things, but the asterisk command is what is using up our resources and causing the problem. We don’t have a Java process causing a problem.

I have turned off all the modules that we are not using, including Zulu. Just phoning from one extension to another causes a big spike in CPU usage which doesn’t look normal. As I said before, we had no problems at all before the upgrade, and far fewer resources were allocated to it.

I got the screenshot confused with who posted it. So yeah.

Have you restarted Asterisk recently?

Yes, I have restarted when I was trying other things.

This should solve the problem temporarily.

No, this will not solve the problem.

My issue is there 100% of the time, I can start the server up and demonstrate the problem just by dialing a group.

I don’t have any issues that get worse over time and I don’t think I have a memory leak or anything like that.

Sorry to say we have not seen this issue in support.

@tm1000

I wish I could disable iSymphony, however we have 30+ people using it and it makes the less ‘computer savvy’ folks capable of managing their inboxes and listening to old calls with ease… We even put a link to it on our desktops using Active Directory to make sure they don’t accidentally erase it and lose access (yeah, I know, whatcha gonna do though… if it isn’t simple then it might as well not be available for some people…). Either way, is this the main cause of the CPU and sound quality problems, or is it just because it’s running in a virtual machine and needs bare metal?

Andrew, I am not using iSymphony, and I have all the same troubles as @adtopkek. I keep adding more hardware to the issue, but I’m running out of options when it comes to that… and I have a scheduled restart every morning at 5:15…
Here’s a couple of screenshots of mysql and htop. Please tell me what else I can show you; I’m happy to let you connect via VPN and poke around if that will be easier.