FreePBX getting slower by the day

If we start the server on Monday, the GUI and the server itself become progressively slower through the week until they are unusable. The only way to “solve” this issue is to run “fwconsole restart” once a week.

Server info:
FreePBX 15.0.17.55
Current Asterisk Version: 16.20.0

Install (and enable) sysstat and you can get an hourly progressive report on your system’s resource usage: CPU, memory, disk I/O and network.

It’s not a system resources issue: we’re running dual X5660s (Hyper-Threading disabled), 72 GB of RAM, SAS drives in RAID 10, and 2 x 10 Gb/s SFP+ in LACP. It can take upwards of 1 minute to load the “extensions” page, and all it has to do is load from the TMP directory. Once the extensions page has loaded, it can take another minute just to get into an extension.

Here are the results from sysstat:

Average: CPU %user %nice %system %iowait %steal %idle
Average: all 1.10 0.00 0.60 0.01 0.00 98.29
Average: 0 0.90 0.00 0.65 0.00 0.00 98.44
Average: 1 2.10 0.00 0.91 0.00 0.00 96.99
Average: 2 1.07 0.00 0.58 0.06 0.00 98.29
Average: 3 1.32 0.00 0.59 0.00 0.00 98.09
Average: 4 0.75 0.00 0.38 0.00 0.00 98.86
Average: 5 1.32 0.00 0.71 0.00 0.00 97.98
Average: 6 0.83 0.00 0.44 0.00 0.00 98.73
Average: 7 1.25 0.00 0.70 0.00 0.00 98.05
Average: 8 0.94 0.00 0.46 0.00 0.00 98.60
Average: 9 0.68 0.00 0.46 0.00 0.00 98.86
Average: 10 1.43 0.00 1.00 0.00 0.00 97.57
Average: 11 0.64 0.00 0.34 0.00 0.00 99.02

I never suggested you were under-resourced. sar will give you hourly snapshots of resource usage over the last week, and it will show you which resource is being used more and more after a restart. I’m guessing memory, but it might be disk I/O.
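If you want a quick memory snapshot alongside sar, something like the sketch below could be run hourly. It only parses the text of /proc/meminfo; the function names are mine, and it assumes your kernel exposes the MemAvailable field (check your own /proc/meminfo first).

```python
def parse_meminfo(text):
    """Turn the text of /proc/meminfo into a dict of integer values (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Summarise memory pressure; assumes MemAvailable is present (an assumption)."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return {
        "used_pct": round(100.0 * (total - available) / total, 1),
        "swap_used_kb": fields["SwapTotal"] - fields["SwapFree"],
    }
```

Feed it the contents of /proc/meminfo once an hour and append the result to a log. If used_pct climbs steadily from Monday to Friday and drops after the restart, memory is the resource to chase.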

We tried running the server virtualized and had the original issues. We then moved it to a physical box and the issues persisted; we are still experiencing them.

Is that sysstat output shown from when the system is slow?

Basically, you need to investigate further what is using the resources, i.e. asterisk, mysql, etc.
The sysstat package can help. I tend to use ‘atop’ (which can store the data over time and shows CPU, processes, network and disk I/O in a top-style format) and nagios/check_mk monitoring of CPU and disk I/O.
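As a rough stand-in until atop is installed, a sketch like this lists processes by resident memory, read from the VmRSS line of /proc/<pid>/status on Linux. The function name and the proc_root parameter are my own, added only so it can be pointed at a test directory.

```python
import os


def rss_by_process(proc_root="/proc"):
    """Return [(rss_kb, pid, name)] sorted by resident memory, largest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                info = dict(line.split(":", 1) for line in fh if ":" in line)
        except OSError:
            continue  # the process exited while we were looking
        if "VmRSS" not in info or "Name" not in info:
            continue  # kernel threads have no VmRSS line
        rss_kb = int(info["VmRSS"].split()[0])
        rows.append((rss_kb, int(entry), info["Name"].strip()))
    rows.sort(reverse=True)
    return rows
```

Printing the first ten rows once on Monday and again when the GUI is slow should show whether asterisk, mysql, httpd or something else has grown.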

Also, what is being used in FreePBX? For example, I believe I’ve heard that Zulu can cause performance issues.

I’ll try what you suggested and I’ll get back to you next week with some results from those snapshots!

Status update: it now takes 15 s to load into an extension. Here are the server stats to compare with last time:
Average: CPU %user %nice %system %iowait %steal %idle
Average: all 1.10 0.00 0.75 0.01 0.00 98.14
Average: 0 1.41 0.00 1.14 0.00 0.00 97.45
Average: 1 1.12 0.00 0.53 0.00 0.00 98.35
Average: 2 1.35 0.00 1.05 0.06 0.00 97.53
Average: 3 1.51 0.00 0.84 0.00 0.00 97.65
Average: 4 0.75 0.00 0.53 0.00 0.00 98.71
Average: 5 1.26 0.00 0.73 0.00 0.00 98.00
Average: 6 0.67 0.00 0.40 0.01 0.00 98.92
Average: 7 1.07 0.00 0.67 0.00 0.00 98.27
Average: 8 0.81 0.00 0.49 0.00 0.00 98.70
Average: 9 0.66 0.00 0.35 0.00 0.00 98.99
Average: 10 1.50 0.00 1.54 0.00 0.00 96.96
Average: 11 1.12 0.00 0.75 0.00 0.00 98.13

There is no significant CPU usage (%idle ~= 100) and no significant use of fast I/O (%iowait ~= 0, which I think means disk). That suggests, to me, that it is suffering from network problems or, maybe, from an external database server.
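To put numbers on that, here is a small sketch that diffs the two “Average: all” rows posted earlier in this thread. The helper names are mine; the input lines are copied from the two sar outputs above.

```python
COLUMNS = ["%user", "%nice", "%system", "%iowait", "%steal", "%idle"]


def parse_average_all(line):
    """Parse one 'Average: all ...' row of sar CPU output into a dict."""
    parts = line.split()
    if parts[:2] != ["Average:", "all"]:
        raise ValueError("not an 'Average: all' row: %r" % line)
    return dict(zip(COLUMNS, map(float, parts[2:])))


def delta(before, after):
    """Per-column change between two parsed rows, rounded to two decimals."""
    return {col: round(after[col] - before[col], 2) for col in COLUMNS}


first = parse_average_all("Average: all 1.10 0.00 0.60 0.01 0.00 98.29")
second = parse_average_all("Average: all 1.10 0.00 0.75 0.01 0.00 98.14")
for col, change in delta(first, second).items():
    print("%-8s %+.2f" % (col, change))
```

The only movement is 0.15 points from %idle to %system, which is too small to explain minute-long page loads.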

I’m not sure if %iowait includes page I/O, and you didn’t provide disk statistics. Assuming you are not in the same room and so cannot actually hear whether the disk is being thrashed, vmstat should be useful.

As previously suggested, sar (part of sysstat) provides much more specific I/O stats (and other things).

Is memory or disk space an issue? We had a similar problem caused by a slow memory leak. After a period of time, the memory would be completely consumed, everything would slow down, and audio would drop. Are you a 24/7 operation? If not, maybe you can set up a cron job to run fwconsole restart once a night during off hours, as a mitigation until you get the issue solved.

Delays from low memory will show as disk thrashing, which ought to be obvious from sar or vmstat, or from simply listening to the disk for a large number of head arm movements.
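If you can’t listen to the disk, the swap counters in /proc/vmstat give a similar signal. A minimal sketch, assuming a Linux box where /proc/vmstat carries the pswpin and pswpout counters (the function names are mine):

```python
def parse_vmstat(text):
    """Turn the text of /proc/vmstat into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(earlier, later):
    """Pages swapped in and out between two parsed /proc/vmstat snapshots."""
    return {key: later[key] - earlier[key] for key in ("pswpin", "pswpout")}
```

Take one snapshot, load the slow extensions page, then take another. Both deltas staying at zero would argue against memory pressure as the cause of the delay.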

For what it is worth, I have noticed that our PBX starts to slow when there are more and more recorded calls piling up. After cleaning them up, the PBX pep returns. Just a thought.

Please expand: where are they “piling up”? What is your platform? How full is your file system?


Perhaps poor choice of words on my part.
When the count and size of manually recorded calls in the /var/spool/asterisk/monitor/… folders grew, the PBX tended to get sluggish. Since we started purging older recordings, reducing the size and count, the issue appears to have been alleviated.
Could be coincidental to something else.
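For anyone who wants to check whether they are in the same situation, this sketch counts the files under the recordings directory and adds up their size. The function name is mine, and the default path is just the top of the monitor tree mentioned above; adjust it to your own layout.

```python
import os


def directory_usage(root="/var/spool/asterisk/monitor"):
    """Return (file_count, total_bytes) for all files found under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file removed or unreadable while walking
            count += 1
    return count, total
```

Logging this figure next to the page load time for a few weeks would show whether the two really move together or whether it was a coincidence.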

Did you ever try my suggested sar analysis? As the filesystem grows, does anything else suffer? If so, at what time of day? For example:

sar
sar -1
sar -2

compare and contrast

Command  Date                  CPU  %user  %nice  %system  %iowait  %steal  %idle
sar -1   11/18/2021  Average:  all   0.62   0.00     0.37     0.14    0.00  98.87
sar -2   11/17/2021  Average:  all   0.62   0.00     0.42     0.13    0.00  98.83
sar -3   11/16/2021  Average:  all   0.54   0.00     0.43     0.13    0.00  98.90
sar -4   11/15/2021  Average:  all   0.50   0.00     0.41     0.13    0.00  98.96
sar -5   11/14/2021  Average:  all   0.96   0.00     0.30     0.12    0.00  98.62
sar -6   11/13/2021  Average:  all   0.41   0.00     0.31     0.13    0.00  99.15