So, idle box here with nothing really going on: about 8 softphones connected and none of them in calls, no one on the admin interface, just me over SSH, and the load average is 0.60?
I should mention the box is virtualized with VirtualBox.
Is this right?
top - 11:24:25 up 9:49, 2 users, load average: 0.35, 0.61, 0.59
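For anyone wanting to put that number in context: a common rule of thumb is to compare the load average with the number of CPU cores, since on Linux the figure counts tasks that are runnable or in uninterruptible sleep. Below is a minimal sketch that parses the `top` header line quoted above and divides by the core count. The function names are made up for illustration, and the core count comes from whatever machine runs the script, not from the box in this thread.

```python
import os
import re


def parse_load_averages(top_header):
    """Return the 1-, 5- and 15-minute load averages from a `top` header line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", top_header
    )
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())


def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    return load / cores


# Header line copied from the post above.
header = "top - 11:24:25 up 9:49, 2 users, load average: 0.35, 0.61, 0.59"
one, five, fifteen = parse_load_averages(header)

# Assumption: the script runs on the machine being inspected.
cores = os.cpu_count() or 1
print(f"5-minute load per core: {load_per_core(five, cores):.2f}")
```

A per-core value well below 1.0 generally means the CPUs are not saturated, although it says nothing about what is producing the load.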
Check your cron jobs for one that runs at the top of every minute. One of them is running and doing a LOT of work. A quick "ps aux" from the console should steer you in the right direction.
There are regular cron jobs every 1, 5, and 15 minutes that bump the load average.
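To see which entries fire every minute, the crontab text can be filtered on its minute field. The sketch below is only an illustration: the sample entries and script paths are hypothetical and are not taken from a real FreePBX crontab. In practice the input would be the output of something like `crontab -l -u asterisk`.

```python
def every_minute_jobs(crontab_text):
    """Return the commands of crontab entries scheduled for every minute."""
    jobs = []
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (disabled entries)
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # environment assignments, @reboot-style lines, etc.
        minute, command = fields[0], fields[5]
        if minute in ("*", "*/1"):
            jobs.append(command)
    return jobs


# Hypothetical entries, for illustration only.
SAMPLE = """\
# sample crontab
MAILTO=""
* * * * * /usr/local/bin/every-minute-task
*/5 * * * * /usr/local/bin/five-minute-task
*/15 * * * * /usr/local/bin/fifteen-minute-task
0 3 * * 0 /usr/local/bin/weekly-task
"""

for command in every_minute_jobs(SAMPLE):
    print(command)
```

The filter only looks at the minute field, so an entry restricted to certain hours or days but with `*` for minutes would also be listed.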
How active is this box, and how much call flow is on it? Ours has a lot of both, and I can visibly notice jumps every time the cron jobs run, which is expected. Which FreePBX/Asterisk versions?
The cron jobs? We did look into those and they are a thing, but even running them manually over and over doesn't seem to cause the random CPU jumps we are getting every few seconds.
We have uninstalled most modules we aren’t using but we are using about 75% of the available ones.
It's more the massive call flow than the number of extensions, though we do have ~1900 extensions, 1160 of which are in use. While the reload is being processed I don't hear any sound issues until it finishes and Asterisk switches over to the new dialplan, but that's just momentary; we expect that and it's not that bad. Then for a few minutes load spikes and call quality goes to shit AFTER the reload is done.
We had a slightly smaller number before moving from FreePBX 13 -> 14, and Asterisk was handling the load like a champ on lower-spec hardware.
Once a minute our cron jobs run, and that does cause some spike. We have looked into it, and by itself it doesn't seem to cause any issues, so we haven't been really worried about it. The cron jobs cause a 30-50% spike in the CPU, but we will jump from 50% usage to 200% usage in between the cron jobs running.
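For readers puzzled by figures above 100%: per-process CPU usage as `top` reports it by default is relative to a single core, so 200% means roughly two cores' worth of CPU time. A rough way to watch one process between cron runs is to sample its CPU ticks from `/proc/<pid>/stat` on Linux. This is a sketch under that assumption, and the function names are illustrative.

```python
import os
import time


def cpu_ticks(pid):
    """Return user + system CPU time of a process, in clock ticks (Linux)."""
    with open(f"/proc/{pid}/stat") as handle:
        data = handle.read()
    # The command name sits in parentheses and may contain spaces,
    # so split after the last ')' before indexing the remaining fields.
    rest = data.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15 of the file
    return utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over the interval, where 100.0 equals one fully busy core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample(pid, interval_s=1.0):
    """Measure the CPU usage of `pid` over `interval_s` seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    before = cpu_ticks(pid)
    time.sleep(interval_s)
    after = cpu_ticks(pid)
    return cpu_percent(before, after, interval_s, ticks_per_s)


# Worked example: 200 ticks consumed in 1 second at 100 ticks per second.
print(cpu_percent(100, 300, 1.0, 100))  # 200.0
```

Calling `sample(pid)` in a loop with the Asterisk PID and a timestamp on each reading would show whether the large jumps line up with the minute boundary or fall in between.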
We only seem to start having call quality issues once the CPU approaches 250% or above, thanks to our renicing of Asterisk. If you haven't reniced Asterisk, I would highly suggest it.
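For reference, renicing comes down to a `setpriority` call, which is also what the `renice` command uses. A minimal sketch follows; the thread does not say which niceness value was chosen, so any value passed in is an assumption, and lowering niceness (raising priority) normally requires root.

```python
import os


def renice(pid, niceness):
    """Set the niceness of `pid` and return the value now in effect.

    A pid of 0 means the calling process. Negative values raise the
    priority and normally require root privileges.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)


# Harmless demonstration: re-apply the current value to this process.
current = os.getpriority(os.PRIO_PROCESS, 0)
print(renice(0, current))
```

One caveat worth checking on your own system: on Linux the nice value is a per-thread attribute, so renicing a running multi-threaded daemon by its PID may not reach threads that already exist.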
Yes. It goes to shit after the fwconsole reload is finished, and it lasts a couple of minutes. Any issues while the reload is running have been minor since we reniced Asterisk. The moment the reload is finished I hear a small noise, but I find that to be expected when switching to a new dialplan.
The point is: WHAT is being run, and is it critical? (I doubt that it is; it is probably just for the FreePBX GUI.) If it is not critical, relegate that cron job to run out of hours only, or just on Sunday morning.
Mar 30 16:55:01 voicelync3 crontab[23572]: (asterisk) LIST (asterisk)
Mar 30 16:55:01 voicelync3 crontab[23574]: (asterisk) LIST (asterisk)
Mar 30 16:55:01 voicelync3 crontab[23577]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23602]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23604]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23605]: (asterisk) REPLACE (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23607]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23609]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23610]: (asterisk) REPLACE (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23612]: (asterisk) LIST (asterisk)
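A log excerpt like this is easier to read when the actions are tallied per second. In cron's logging, LIST is written when a crontab is listed (`crontab -l`) and REPLACE when a new crontab is installed. The sketch below counts both from the lines quoted above; the regular expression is written for this exact syslog format and may need adjusting for others.

```python
import re
from collections import Counter

LOG_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+crontab\[\d+\]:\s+"
    r"\((?P<user>[^)]+)\)\s+(?P<action>[A-Z]+)\b"
)


def count_crontab_actions(lines):
    """Count crontab log entries per (timestamp, action) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match.group("ts"), match.group("action"))] += 1
    return counts


# Lines copied from the log excerpt above.
LOG = """\
Mar 30 16:55:01 voicelync3 crontab[23572]: (asterisk) LIST (asterisk)
Mar 30 16:55:01 voicelync3 crontab[23574]: (asterisk) LIST (asterisk)
Mar 30 16:55:01 voicelync3 crontab[23577]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23602]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23604]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23605]: (asterisk) REPLACE (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23607]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23609]: (asterisk) LIST (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23610]: (asterisk) REPLACE (asterisk)
Mar 30 16:55:03 voicelync3 crontab[23612]: (asterisk) LIST (asterisk)
"""

counts = count_crontab_actions(LOG.splitlines())
for (timestamp, action), count in sorted(counts.items()):
    print(timestamp, action, count)
```

For this excerpt the tally is three LIST entries at 16:55:01, then five LIST and two REPLACE entries at 16:55:03, so the asterisk user's crontab is being read and rewritten several times within a couple of seconds. The excerpt alone does not show which process triggers that.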
Everything else we use we have commented out (disabled) and tested, without any impact on call quality after a reload. It causes fewer spikes in the CPU but does nothing for the major 200% spikes, which seem to be caused by ???.
I know it is not commented out right now. We spent a day looking into cron as a source of our issues, testing and retesting. We found that yes, it caused increased load when the tasks ran, but the increases never resulted in call quality problems, even when we spammed the cron jobs over and over to try to break it.
Disabling all of them did not help after the reload so we accepted that we just have harmless spikes from cron jobs every minute.
Reloading 1900 extensions in FreePBX will never be quick. I believe that was improved in newer versions (what are you using?). I suggest you refrain from reloading during working hours.