I’ve been running FreePBX on an EC2 instance since early April of this year and it has not been a smooth transition.
Ever since switching over, calls have been cutting out randomly on all the extensions. It happens anywhere from 2 to 5 times a day.
In light of current events we don’t even have many people on the phone at once. Only about 10 max (very small company). Before this was deployed my test calls didn’t seem to have any issues. However, please note I didn’t test the calls for very long, a few minutes at a time at different times of the day.
Complete list of symptoms:
- Calls cutting out (anywhere from less than half a second to 5 seconds)
- It can happen on all the calls that are live, or just on a couple of them
- The cuts happen at random times. Some days it’s heavier in the morning, other days it’s in the afternoon.
- The length of the call doesn’t matter. The cuts can start only a few minutes in, or a half hour or even an hour in if a call goes that long.
- The extensions affected are all of them, and it seems to surge with all active calls before dying down and performing as expected for another couple of hours.
- Some surges cause very little interference (less than half a second), others cause a very noticeable disturbance (agents/clients both saying “You’re cutting out”, up to calls flat out dropping)
- A surge can be a couple of cuts before things are fine, or 5 or 6 cuts before finally straightening out
- Happens for both IB and OB calls.
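To make the surge pattern above easier to line up against logs, here is a rough Python sketch of how timestamped cut-out reports could be grouped into surges. The function name `group_into_surges`, the 10-minute gap, and the sample times are all made up for illustration. None of this comes from FreePBX itself.

```python
from datetime import datetime, timedelta


def group_into_surges(cut_times, max_gap=timedelta(minutes=10)):
    """Group cut-out timestamps into surges.

    Two cuts belong to the same surge when they are no more than
    `max_gap` apart. The 10-minute default is an arbitrary assumption.
    """
    surges = []
    for t in sorted(cut_times):
        if surges and t - surges[-1][-1] <= max_gap:
            surges[-1].append(t)   # close enough to the previous cut: same surge
        else:
            surges.append([t])     # otherwise start a new surge
    return surges


# Hypothetical cut-out times noted by agents (not taken from real logs).
cuts = [
    datetime(2020, 5, 4, 9, 2),
    datetime(2020, 5, 4, 9, 5),
    datetime(2020, 5, 4, 9, 11),
    datetime(2020, 5, 4, 14, 30),
    datetime(2020, 5, 4, 14, 31),
]

surges = group_into_surges(cuts)
for surge in surges:
    start = surge[0].strftime("%H:%M")
    end = surge[-1].strftime("%H:%M")
    print(f"{start} to {end}: {len(surge)} cuts")
```

The start and end of each surge can then be compared against router, SD-WAN, and FreePBX logs for the same window.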
Setup of our network:
- The FreePBX server is on an EC2 instance, connected to an S3 bucket for easy access/storage to our recordings.
- FreePBX version is 18.104.22.168.
- All extensions are using PJSIP
- We have an SD-WAN set up in front of our internal network. There are no firewall rules set up to block VOIP calls between our network and the EC2 instance.
- We have two routers going to our locations, each with QoS set up. Please note, these are home/small business routers and not commercial routers, and our ‘locations’ are in the same place, only different suites in the building.
Here’s what has been tried so far:
- Set up QoS on the routers and the SD-WAN we have running to prioritize VOIP calls.
- Put data on one channel and voice on the other for the SD-WAN so VOIP calls have a dedicated channel.
- Each of our ISPs provides 500MB, and the monitoring I have set up shows we don’t use more than 20MB at a time (we only have a few calls and people checking different map apps on stations for our clients)
- Double checked all connections (I know someone would’ve asked that at one point)
- Double-checked that the audio codecs on our VOIP server and softphone are the same (we currently use MicroSIP as the softphone).
- The onboard FreePBX firewall is set up to allow our ISPs
- Whenever a surge happens I bring up one of the recordings to see if the recording is breaking up as well. The recording has no such issue. The recording catches everything.
- I’ve checked the EC2 instance via PuTTY to see if it’s running high for some reason, and the server is running at 10% or lower with plenty of RAM to spare during the surges, so I figure it’s not the server or installation itself
- I haven’t seen anything stand out in the FreePBX logs during one of these surges (although, I’ll be the first to admit that I’m no expert at these FreePBX logs). I don’t see any uptick or different errors in the logs when a surge happens versus when there are no issues at all.
- The changes I’ve been making have brought small improvements to the VOIP calls. Prioritizing VOIP traffic throughout the network, matching the audio codecs, and switching over to less bandwidth-heavy codecs have made the surges less disruptive than before but haven’t completely solved the problem. This tells me I might be on the right track.
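As a sanity check on the bandwidth side, here is a back-of-the-envelope Python sketch of what about 10 simultaneous calls should need. It assumes G.711 at 64 kbit/s with 20 ms packets and 40 bytes of IP/UDP/RTP headers per packet. That codec and packetization are assumptions for illustration only (lighter codecs would only lower the figure), and layer-2 overhead is ignored.

```python
def call_bandwidth_kbps(codec_kbps=64, ptime_ms=20, header_bytes=40):
    """Approximate one-way IP bandwidth of a single RTP stream, in kbit/s.

    Defaults are assumptions: G.711 (64 kbit/s), 20 ms per packet, and
    40 header bytes per packet (20 IP + 8 UDP + 12 RTP).
    """
    packets_per_second = 1000 / ptime_ms
    # kbit/s * ms = bits per packet; divide by 8 to get bytes per packet.
    payload_bytes = codec_kbps * ptime_ms / 8
    bits_per_second = (payload_bytes + header_bytes) * 8 * packets_per_second
    return bits_per_second / 1000


per_call = call_bandwidth_kbps()
ten_calls_mbps = 10 * per_call / 1000
print(f"{per_call:.0f} kbit/s per call, {ten_calls_mbps:.1f} Mbit/s for 10 calls")
# prints "80 kbit/s per call, 0.8 Mbit/s for 10 calls"
```

Even under these worst-case assumptions, ten calls come to under 1 Mbit/s in each direction, far below what either ISP link provides, so raw bandwidth alone doesn’t look like the bottleneck.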
The last idea I have is to change our network layout so that our routers no longer issue DHCP and only the SD-WAN does that job. In turn, ALL of our internet traffic would go right to the SD-WAN without being filtered by any router/switch, with our routers acting only as WiFi hubs in a sense. However, the SD-WAN administrator told me it may only solve the issue for a couple of days. I’m still planning to make the change.
At this point, I’m at a complete loss and am looking for ANY ideas of what might be happening to the FreePBX installation.