FreePBX Distro - System has become unstable

PBX Firmware: 6.12.65-32
For several weeks, my PBX has been behaving strangely and unreliably. I have not been able to pin down the symptoms or the source of the problems. The log below captures two calls that ended up being forwarded to our backup cell phone number by our VoIP vendor. There is no entry for them in CDR (I would love to see calls with errors included in CDR).

The VoIP trunk is registered.

[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:1] Set("SIP/FlowRoute_LV-0000007a", "__DIRECTION=INBOUND") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:2] Gosub("SIP/FlowRoute_LV-0000007a", "app-blacklist-check,s,1()") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [s@app-blacklist-check:1] GotoIf("SIP/FlowRoute_LV-0000007a", "0?blacklisted") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [s@app-blacklist-check:2] Set("SIP/FlowRoute_LV-0000007a", "CALLED_BLACKLIST=1") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [s@app-blacklist-check:3] Return("SIP/FlowRoute_LV-0000007a", "") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:3] Set("SIP/FlowRoute_LV-0000007a", "__FROM_DID=1203227xxxx") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:4] Set("SIP/FlowRoute_LV-0000007a", "CDR(did)=1203227xxxx") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:5] ExecIf("SIP/FlowRoute_LV-0000007a", "1 ?Set(CALLERID(name)=+12038499285)") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:6] Set("SIP/FlowRoute_LV-0000007a", "__MOHCLASS=") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:7] Set("SIP/FlowRoute_LV-0000007a", "__REVERSAL_REJECT=FALSE") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:8] GotoIf("SIP/FlowRoute_LV-0000007a", "1?post-reverse-charge") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Goto (from-trunk,1203227xxxx,10)
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:10] NoOp("SIP/FlowRoute_LV-0000007a", "") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:11] Set("SIP/FlowRoute_LV-0000007a", "__CALLINGNAMEPRES_SV=allowed_not_screened") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:12] Set("SIP/FlowRoute_LV-0000007a", "__CALLINGNUMPRES_SV=allowed_not_screened") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:13] Set("SIP/FlowRoute_LV-0000007a", "CALLERID(name-pres)=allowed_not_screened") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:14] Set("SIP/FlowRoute_LV-0000007a", "CALLERID(num-pres)=allowed_not_screened") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [1203227xxxx@from-trunk:15] Gosub("SIP/FlowRoute_LV-0000007a", "cidlookup,cidlookup_11,1()") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [cidlookup_11@cidlookup:1] Set("SIP/FlowRoute_LV-0000007a", "CURLOPT(httptimeout)=7") in new stack
[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [cidlookup_11@cidlookup:2] Set("SIP/FlowRoute_LV-0000007a", "CALLERID(name)=BED BATH BEYOND") in new stack
[2016-08-16 14:11:50] VERBOSE[12928][C-00000021] netsock2.c: == Using SIP RTP CoS mark 5
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:1] Set("SIP/FlowRoute_LV-0000007b", "__DIRECTION=INBOUND") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:2] Gosub("SIP/FlowRoute_LV-0000007b", "app-blacklist-check,s,1()") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [s@app-blacklist-check:1] GotoIf("SIP/FlowRoute_LV-0000007b", "0?blacklisted") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [s@app-blacklist-check:2] Set("SIP/FlowRoute_LV-0000007b", "CALLED_BLACKLIST=1") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [s@app-blacklist-check:3] Return("SIP/FlowRoute_LV-0000007b", "") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:3] Set("SIP/FlowRoute_LV-0000007b", "__FROM_DID=1203227xxxx") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:4] Set("SIP/FlowRoute_LV-0000007b", "CDR(did)=1203227xxxx") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:5] ExecIf("SIP/FlowRoute_LV-0000007b", "1 ?Set(CALLERID(name)=+12038499285)") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:6] Set("SIP/FlowRoute_LV-0000007b", "__MOHCLASS=") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:7] Set("SIP/FlowRoute_LV-0000007b", "__REVERSAL_REJECT=FALSE") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:8] GotoIf("SIP/FlowRoute_LV-0000007b", "1?post-reverse-charge") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Goto (from-trunk,1203227xxxx,10)
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:10] NoOp("SIP/FlowRoute_LV-0000007b", "") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:11] Set("SIP/FlowRoute_LV-0000007b", "__CALLINGNAMEPRES_SV=allowed_not_screened") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:12] Set("SIP/FlowRoute_LV-0000007b", "__CALLINGNUMPRES_SV=allowed_not_screened") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:13] Set("SIP/FlowRoute_LV-0000007b", "CALLERID(name-pres)=allowed_not_screened") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:14] Set("SIP/FlowRoute_LV-0000007b", "CALLERID(num-pres)=allowed_not_screened") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [1203227xxxx@from-trunk:15] Gosub("SIP/FlowRoute_LV-0000007b", "cidlookup,cidlookup_11,1()") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [cidlookup_11@cidlookup:1] Set("SIP/FlowRoute_LV-0000007b", "CURLOPT(httptimeout)=7") in new stack
[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [cidlookup_11@cidlookup:2] Set("SIP/FlowRoute_LV-0000007b", "CALLERID(name)=BED BATH BEYOND") in new stack

PS: If I go to Dahdi Config and reload Asterisk and Dahdi, the problem goes away, but so far it has kept coming back.

So it’s a VoIP trunk? But then you talk about Dahdi. I am lost. Is this DAHDI or VoIP?

The symptom was with our VoIP trunk; the cause, who knows. Dahdi Config is a handy way to restart Asterisk. I think I have had problems with Dahdi as well but, as I say, my system has been so shaky and has had so many flavors of problems that I have not been able to find any firm ground.

I captured the logs from this latest outage, and they are as good a place to start as any. This particular symptom has recurred several times.

I have been tempted to reload the distro and restore but I am not sure how smoothly that would go. Is that advised? How comprehensive is the full restore+audio?

If that’s what you pasted above, that’s not sufficient. There is no error there at all. Please provide the complete call log. If that IS the last thing you see, then there’s a problem with the CID lookup. Disable it and then see if the problem goes away.

If I am correct, the log stops in the same place on both calls, so it is unlikely the log was simply truncated. Since the call failed, I suspect a bigger problem.
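One way to confirm where each call stops is to pull out the last dialplan step logged per call ID. The following is a minimal sketch, assuming the verbose line format shown in the log above; the function name `last_steps` and the embedded sample lines are illustrative, not part of FreePBX or Asterisk.

```python
import re

# Matches Asterisk verbose dialplan lines in the format pasted above, e.g.
# [2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [exten@context:prio] App(...)
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+VERBOSE\[\d+\]\[(?P<call>C-[0-9a-fA-F]+)\]\s+"
    r"pbx\.c:\s+-- Executing \[(?P<step>[^\]]+)\]\s+(?P<app>\w+)\("
)


def last_steps(lines):
    """Return {call_id: (timestamp, dialplan_step, application)} holding the
    final 'Executing' line seen for each call ID."""
    last = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            # Later lines overwrite earlier ones, so the final value is the last step.
            last[m.group("call")] = (m.group("ts"), m.group("step"), m.group("app"))
    return last


# Sample lines copied from the log above (hypothetical input for illustration).
SAMPLE = [
    '[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [cidlookup_11@cidlookup:1] Set("SIP/FlowRoute_LV-0000007a", "CURLOPT(httptimeout)=7") in new stack',
    '[2016-08-16 14:10:52] VERBOSE[8541][C-00000020] pbx.c: -- Executing [cidlookup_11@cidlookup:2] Set("SIP/FlowRoute_LV-0000007a", "CALLERID(name)=BED BATH BEYOND") in new stack',
    '[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [cidlookup_11@cidlookup:1] Set("SIP/FlowRoute_LV-0000007b", "CURLOPT(httptimeout)=7") in new stack',
    '[2016-08-16 14:11:50] VERBOSE[8599][C-00000021] pbx.c: -- Executing [cidlookup_11@cidlookup:2] Set("SIP/FlowRoute_LV-0000007b", "CALLERID(name)=BED BATH BEYOND") in new stack',
]

for call_id, (ts, step, app) in last_steps(SAMPLE).items():
    print(call_id, ts, step, app)
```

If every failed call ends on the same step, that step (here the CID lookup subroutine) is the first thing to rule out. To run this against a real log, pass the lines of the log file to `last_steps` instead of `SAMPLE`.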

After restarting using Dahdi Config, the system is acting normally. I will not be able to try that until it fails again. Thanks

I feel like you’re missing something here.

Have you disabled CID lookup, or not? If you haven’t, then I don’t really know how I can explain this any other way.

Shall I wait till the problem returns, since I have restarted Asterisk?

No. Can you explain what is confusing you about the following statement?

I’m wondering if I’m being too technical or something?

If you want to know, I initially misunderstood CID. So I assume you are saying I should disable the module, as opposed to turning off the CID setting (set to Superfecta, I think)? And you want me to turn it off now while it is working? When do I turn it on again?

By the way I just discovered the “Config Edit” feature, something I had asked for a couple of years ago. It is killer, just what I wished for.

Thanks.

I can’t remember who asked for it, but it wasn’t you… I just checked. @tm1000 can’t remember who it was, but it was over a year ago.

http://issues.freepbx.org/browse/FREEPBX-12905?jql=reporter%20%3D%20JessicaRabbit

Well, it was some time ago. I’m not trying to take credit, just happy it is here. There was quite a discussion about whether it was a good idea or a vulnerability, and I suggested that the appropriate files could be read-only but that it would be very educational to include them, and so on. Nice to have that ability.

My PBX is still not working. It functions following a reboot, but after some time (hours?) it fails. The PBX GUI and all functions are stable and the condition does not deteriorate further, so it does not appear likely to be hardware. It shuts down normally.

My VoIP vendor says that the PBX is not responding to calls, so they then get routed to my backup contact numbers. When I look at the CDR detail and the logs, the calls are coming in and, to my untrained eyes, do not look too much out of the ordinary.

It is the kind of thing that makes one think of a memory leak. The problems began with an up-to-date version 13. I have gone back to 6.12.65-32 and updated that. My next idea is to reload 6.12.65-32 and not update it, as that copy once was working.

I am at my wits’ end. Any ideas?

My system is working at the moment… who knows if I located the problem!

I did two things which may or may not be related. First, under General SIP I clicked on Detect External IP, which changed the setting to the current IP. Then I cleared the STUN Server (stun.services.mozilla.com) setting. I have no idea when the STUN server setting originated.

My PBX problem seemed like it could be a network issue, so I hope these changes are related. I have had issues with DDNS functionality in the past.

This will have been your issue. You’d have had to enter that manually, probably after you installed WebRTC and saw the old warning to add a STUN server.

Ok, but my system was working and then just stopped working. What changed?

If the STUN or network settings were the cause of the problem, the failure mode was not good. There was no sign of the source of the failure, and I merely stumbled on it when my vendor said that my registered trunk was not responding when calls came in. I am still not sure what the cause of the failure is, or whether it is fixed. The PBX symptoms were all over the place. Ideally there would be some notice when the network settings are not coherent.

Probably the STUN server was faulty.
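For anyone who wants to check a STUN server independently of the PBX, here is a rough sketch of a probe using only the Python standard library. It builds a STUN Binding Request as described in RFC 5389 and decodes the XOR-MAPPED-ADDRESS attribute from the reply. The function names and the 3-second timeout are my own choices for illustration, and this is not how Asterisk itself queries STUN; servers that only speak the older RFC 3489 format (plain MAPPED-ADDRESS) would show up as "no answer" here.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(txn_id=None):
    """Build a 20-byte STUN Binding Request with no attributes."""
    txn_id = txn_id or os.urandom(12)
    # Header: message type 0x0001 (Binding Request), attribute length 0, magic cookie.
    return struct.pack("!HHI", 0x0001, 0, MAGIC_COOKIE) + txn_id


def parse_xor_mapped_address(resp, txn_id):
    """Return (ip, port) from a Binding Success Response, or None if the
    response is malformed or carries no IPv4 XOR-MAPPED-ADDRESS."""
    if len(resp) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", resp[:8])
    if msg_type != 0x0101 or cookie != MAGIC_COOKIE or resp[8:20] != txn_id:
        return None
    pos, end = 20, min(len(resp), 20 + msg_len)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", resp[pos:pos + 4])
        value = resp[pos + 4:pos + 4 + attr_len]
        if attr_type == 0x0020 and len(value) >= 8 and value[1] == 0x01:  # IPv4
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def probe(host, port=3478, timeout=3.0):
    """Send one Binding Request over UDP; return (ip, port) or None on failure."""
    txn_id = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(build_binding_request(txn_id), (host, port))
            resp, _ = s.recvfrom(2048)
        except OSError:  # covers timeouts and DNS resolution failures
            return None
    return parse_xor_mapped_address(resp, txn_id)
```

Calling `probe("stun.services.mozilla.com")` from the PBX host would return the external address the server sees, or `None` if the server does not answer in time. Running something like this on a schedule is one way to get the kind of early notice asked for above.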

Sorry can’t do this.