FreePBX randomly stops processing calls

I have a serious issue that is leading management to question the purchase, last August, of the Sangoma FreePBX Phone System 60 appliance that I convinced them to buy to replace Asterisk and FreePBX running on a 5-year-old Ubuntu box that also doubled as a file server. The Sangoma PBX is up to date on FreePBX 14.0.11, with no available FreePBX updates showing.

Basically, over the past two months, the PBX has become extremely unreliable. It will just stop handling incoming or outgoing calls from the SIP trunk (AT&T), yet the FreePBX web interface will make it appear that everything is great - all extensions are registered, the trunk is up, usually one call is shown as active, even though no calls are active.

Digging into the underlying asterisk “full” log file, almost invariably, I see that the last thing logged is a call out to dialparties.agi, to get the extensions in a ring group, to dispatch an incoming call, like this:

[2019-05-28 07:28:29] VERBOSE[4433][C-0000001e] pbx.c: Executing [s@macro-dial:6] AGI("SIP/ATT-0000004d", "dialparties.agi") in new stack
[2019-05-28 08:15:48] Asterisk 13.26.0 built by mockbuild @ jenkins7 on a x86_64 running Linux on 2019-04-11 19:57:02 UTC

That second message was logged after a restart of Asterisk, when we noticed the phones had not rung for 45 minutes! Nothing else is logged until I either reboot the Sangoma box or restart Asterisk from the command line. Calls from extension to extension continue to work, no errors are logged in ANY system log files, and dmesg shows no hardware issues. Asterisk is just off in the weeds.

When a call is processed normally, dialparties.agi logs a bunch of information, the first of which comes from one of the first lines I see in the script itself. In the hung case I should at least be seeing the “Launched AGI Script” log message, but never do. A normal call looks like this:

[2019-05-28 08:22:18] VERBOSE[17015][C-00000000] pbx.c: Executing [s@macro-dial:6] AGI("SIP/ATT-00000000", "dialparties.agi") in new stack
[2019-05-28 08:22:18] VERBOSE[17015][C-00000000] res_agi.c: Launched AGI Script /var/lib/asterisk/agi-bin/dialparties.agi
[2019-05-28 08:22:18] VERBOSE[17015][C-00000000] res_agi.c: dialparties.agi: Starting New Dialparties.agi
[2019-05-28 08:22:18] VERBOSE[17015][C-00000000] res_agi.c: dialparties.agi: Caller ID name is 'WIRELESS CALLER' number is 'XXXXXXXXXX'

I blanked out the caller id in that last line above.

Anyway, I am at my wits' end. All of this SEEMS to have started 6 to 8 weeks ago, after I let FreePBX install a number of updates. Before that, the box was stable for months (August until around late March). Right now, I have to spend my days SSH’ed into the PBX, running tail -f /var/log/asterisk/full, and if I see that the log file has stopped on the call out to dialparties.agi, I restart Asterisk from the command line. If I don’t do it fast enough, one of our support staff will go and power-cycle the unit.
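The manual check I'm doing with tail could be sketched in Python roughly as follows. This is only an illustration of the symptom, not a fix: the function name looks_stalled and the two-minute threshold are my own assumptions, and the only thing it relies on is the timestamp format of the log lines quoted above.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout taken from the "full" log lines quoted in this post.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(.*)$")


def looks_stalled(lines, now, max_silence=timedelta(minutes=2)):
    """Return True when the log appears stuck on the dialparties.agi call.

    Hypothetical helper: the last timestamped line must be the
    'Executing ... dialparties.agi' entry, and it must be older than
    max_silence (an assumed threshold, tune to taste).
    """
    last_ts = None
    last_msg = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            last_ts = datetime.strptime(match.group(1), TS_FORMAT)
            last_msg = match.group(2)
    if last_ts is None:
        # No timestamped lines at all, so nothing can be concluded.
        return False
    stuck_on_agi = "Executing" in last_msg and "dialparties.agi" in last_msg
    return stuck_on_agi and (now - last_ts) > max_silence
```

Fed the last few lines of /var/log/asterisk/full once a minute with now set to datetime.now(), this would flag the same condition I am currently watching for by hand. On a healthy call the "Launched AGI Script" line follows within the same second, so the threshold mostly guards against catching the log mid-write. What to do when it returns True (alert someone, or restart Asterisk) is a separate decision.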

We run tech support 24/7 for our customers, and this is a business critical issue. Any suggestions are appreciated.

Asterisk version? This was posted 6-8 weeks ago: Removal of Asterisk 13.26.0-1 and Asterisk 15.7.2-1

Lorne, yes I had just found that since posting my question - I am in fact on Asterisk 13.26.0 according to log files, so that must be the update that installed to cause the instability. I will attempt to downgrade asterisk per those instructions after normal business hours today, and just babysit the PBX in the meantime, since there are active calls. We run a 24/7 call center, but things slow down greatly after 5pm, and I usually do system maintenance in the evening hours.
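For anyone else checking whether they are on one of the withdrawn builds, the version can be read straight from the startup banner in the full log. A minimal sketch, assuming the banner always has the "Asterisk X.Y.Z built by" shape shown in my log excerpt; the function names are made up for illustration, and the banner does not show the package revision (the "-1" suffix), so this only matches the upstream version number.

```python
import re

# Upstream versions named in the "Removal of Asterisk 13.26.0-1 and
# Asterisk 15.7.2-1" notice; the "-1" package revision is not in the banner.
WITHDRAWN = {"13.26.0", "15.7.2"}

# Assumed banner shape, based on the single startup line quoted in this thread.
BANNER_RE = re.compile(r"Asterisk (\d+\.\d+\.\d+) built by")


def banner_version(line):
    """Return the version string from a startup banner line, or None."""
    match = BANNER_RE.search(line)
    return match.group(1) if match else None


def is_withdrawn(line):
    """True if the banner line reports one of the withdrawn versions."""
    return banner_version(line) in WITHDRAWN
```

Running is_withdrawn over the lines of the full log and stopping at the most recent banner would tell you which build Asterisk last started with.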
