We had a perfectly working FreePBX server a few weeks ago, except it was starting to get overloaded.
We purchased a new, larger server, backed everything up, and then restored to the new phone server.
We found out very quickly that a “full backup” apparently doesn’t mean a “full backup”.
A lot of settings weren’t restored properly. One example is incoming routes. We had a handful set to ‘detect faxes’. These were magically flipped into ‘legacy mode’, so faxes were routed incorrectly. We had to manually change ~200 phone lines back to ‘yes’ instead of ‘legacy’. The problem described below may be related to the restore, but I can’t seem to find the cause.
When people call in, it goes to an IVR that asks them to press 1 if they are a new customer or 2 if they are an existing customer. The options are simply for tracking purposes on our end.
Pressing 1 sends them to a queue, and for the purposes of this issue I’m ignoring this option.
Pressing 2 sends them to a ring group.
The ring group consists of a handful of Digium D60 phones.
Dialing that ring group from my internal phone always works. It always rings the phones, and someone always answers.
Calling in from a number of different sources (cell phone, home phone, Google Voice, etc…) will sometimes cause the phones to ring, and other times it immediately jumps to the ring group failover destination which is a time condition. If we are ‘in hours’ it goes to the queue I mentioned earlier. If we are ‘outside of hours’ it goes to voicemail.
It seems to be hit-or-miss, but when the phones in the ring group don’t ring, I will sometimes get dumped into the queue, and other times I will get dumped to voicemail.
Looking at the logs, I see:
== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4002-00005f23'
== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4003-00005f24'
== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4010-00005f25'
== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4020-00005f26'
If I immediately dial those extensions or the ring group on my desk phone, the call connects and the phones ring.
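In case it helps anyone compare good and bad calls, here is a rough Python sketch for tallying which ring group members show up in log lines like the ones above. The function names are my own, and the regex assumes the line format is exactly what I pasted. Keep in mind that Asterisk can also log this message on an ordinary hangup, so the counts are only a starting point, not proof of a failure.

```python
import re
from collections import Counter

# Matches lines such as:
# == Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4002-00005f23'
SPAWN_RE = re.compile(
    r"Spawn extension \((?P<context>[^,]+), (?P<exten>[^,]+), (?P<priority>\d+)\) "
    r"exited non-zero on '(?P<channel>[^']+)'"
)


def failed_channels(log_text):
    """Return one dict per matching log entry (context, exten, priority, channel)."""
    return [m.groupdict() for m in SPAWN_RE.finditer(log_text)]


def endpoint_of(channel):
    """Turn a channel name like 'PJSIP/4002-00005f23' into the endpoint '4002'."""
    tech_and_peer = channel.rsplit("-", 1)[0]  # drop the unique channel suffix
    return tech_and_peer.split("/", 1)[-1]     # drop the 'PJSIP/' technology prefix


def count_by_endpoint(log_text):
    """Count matching log entries per endpoint."""
    return Counter(endpoint_of(f["channel"]) for f in failed_channels(log_text))


sample = (
    "== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4002-00005f23' "
    "== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4003-00005f24' "
    "== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4010-00005f25' "
    "== Spawn extension (from-internal, 4001, 1) exited non-zero on 'PJSIP/4020-00005f26'"
)

if __name__ == "__main__":
    for endpoint, hits in sorted(count_by_endpoint(sample).items()):
        print(endpoint, hits)
```

Running this against a log excerpt from a call that rang properly and one that skipped the ring group would show whether the same extensions appear in both.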
Each office has a firewall that is NOT running SIP ALG. UDP timeouts are set to 300 seconds, and nothing firewall-related has changed in months. It’s the same config that was working before the move to the new server. Each office connects out over the internet to our phone server. There is no VPN or ‘internal network’ involved in the phones communicating. The only phones that are on the same network as the phone server are the ones in the call center. The call center users are the ones in the queue I referred to previously.
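One thing I want to rule out, given that the restore dropped other settings, is whether the phones and server still send traffic often enough to keep the firewall's UDP mapping open. A minimal sketch of that check follows. The 300 second timeout is from our firewalls, but the interval values are hypothetical placeholders, not what our server is actually set to, and the safety factor of 2 is just my own rule of thumb.

```python
def mapping_survives(nat_udp_timeout_s, keepalive_interval_s, safety_factor=2.0):
    """Return True if keepalive traffic is frequent enough to hold a NAT mapping open.

    safety_factor leaves room for a lost packet: with a factor of 2, one missed
    keepalive still lands the next one inside the timeout window.
    """
    return keepalive_interval_s * safety_factor <= nat_udp_timeout_s


FIREWALL_UDP_TIMEOUT_S = 300  # from the firewall config described above

# Hypothetical intervals to compare; substitute the real qualify or
# re-registration interval from the restored server.
for interval_s in (60, 120, 3600):
    ok = mapping_survives(FIREWALL_UDP_TIMEOUT_S, interval_s)
    print(f"interval {interval_s:>4}s: {'ok' if ok else 'mapping may expire'}")
```

If the real interval turns out to be longer than the firewall timeout, that might fit the pattern of calls from the server failing while the phones can still dial out.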
Rebooting the phones does appear to clear up the issue for a few days, then it comes back. But that still doesn’t explain why the queue will frequently send callers directly to voicemail when there are agents signed in and ready to take calls, while calls from other desk phones are able to get through to the queue.