Lost SIP Trunking Registration?

Typically, on the small handful of occasions when we lose connectivity to our SIP trunking provider, I see events like this logged:

[2019-10-12 21:22:26] NOTICE[28800] chan_sip.c: Peer '5264232744GW2' is now UNREACHABLE!  Last qualify: 39
[2019-10-12 21:22:30] NOTICE[28800] chan_sip.c: Peer '5294877463GW2' is now UNREACHABLE!  Last qualify: 40
[2019-10-12 21:22:32] NOTICE[28800] chan_sip.c: Peer '5294877463GW1' is now UNREACHABLE!  Last qualify: 26
[2019-10-12 21:22:36] NOTICE[28800] chan_sip.c: Peer '5264232744GW2' is now Reachable. (154ms / 2000ms)
[2019-10-12 21:22:40] NOTICE[28800] chan_sip.c: Peer '5294877463GW2' is now Reachable. (39ms / 2000ms)
[2019-10-12 21:22:52] NOTICE[28800] chan_sip.c: Peer '5264232744GW1' is now UNREACHABLE!  Last qualify: 26
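For anyone wanting to pull these qualify events out of a much larger log, here is a minimal Python sketch. It assumes the log lines look exactly like the ones pasted above; the function name `qualify_events` and the regex are my own illustration, not anything shipped with Asterisk or FreePBX.

```python
import re
from collections import defaultdict

# Matches chan_sip qualify notices in the format shown in the excerpt above.
QUALIFY_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] NOTICE\[\d+\] chan_sip\.c: "
    r"Peer '(?P<peer>[^']+)' is now (?P<state>UNREACHABLE|Reachable)"
)

def qualify_events(lines):
    """Group (timestamp, state) pairs by peer name, in log order."""
    events = defaultdict(list)
    for line in lines:
        m = QUALIFY_RE.match(line)
        if m:
            events[m["peer"]].append((m["ts"], m["state"].upper()))
    return dict(events)

# Three lines copied from the excerpt above.
sample = [
    "[2019-10-12 21:22:26] NOTICE[28800] chan_sip.c: Peer '5264232744GW2' is now UNREACHABLE!  Last qualify: 39",
    "[2019-10-12 21:22:36] NOTICE[28800] chan_sip.c: Peer '5264232744GW2' is now Reachable. (154ms / 2000ms)",
    "[2019-10-12 21:22:52] NOTICE[28800] chan_sip.c: Peer '5264232744GW1' is now UNREACHABLE!  Last qualify: 26",
]

for peer, history in qualify_events(sample).items():
    print(peer, history)
```

Run against the full log, this makes it easier to see which peers flapped and which ones stayed down.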

Today we had a hiccup. My SIP trunking provider sent me an automated SMS message informing me that SIP trunking registration had been lost. When I immediately hopped into their web portal, it likewise showed the trunk as unregistered. Then after a couple of seconds we registered again.

When I looked at the Asterisk logs, I didn’t see any of the typical lost registration events like above. All I saw around the time of the blip was the following:

[2020-03-05 13:06:56] VERBOSE[3605][C-00003612] pbx.c: Executing [16143364545@from-trunk:19] Set("SIP/5264232744GW1-00006b69", "FAXOPT(faxdetect)=yes") in new stack
[2020-03-05 13:06:56] VERBOSE[3605][C-00003612] pbx.c: Executing [16143364545@from-trunk:20] Answer("SIP/5264232744GW1-00006b69", "") in new stack
[2020-03-05 13:06:57] VERBOSE[3605][C-00003612] pbx.c: Executing [16143364545@from-trunk:21] Wait("SIP/5264232744GW1-00006b69", "4") in new stack
[2020-03-05 13:07:01] WARNING[3605][C-00003612] channel.c: Exceptionally long queue length queuing to SIP/5264232744GW1-00006b69
[2020-03-05 13:07:01] WARNING[3605][C-00003612] channel.c: Exceptionally long queue length queuing to SIP/5264232744GW1-00006b69
[2020-03-05 13:07:01] WARNING[3605][C-00003612] channel.c: Exceptionally long queue length queuing to SIP/5264232744GW1-00006b69
[2020-03-05 13:07:01] WARNING[3605][C-00003612] channel.c: Exceptionally long queue length queuing to SIP/5264232744GW1-00006b69

That one warning repeated dozens of times (same timestamp) until the next line in the workflow finally kicked in, specifically the jump to the daynight call flow control.

[2020-03-05 13:07:01] VERBOSE[3605][C-00003612] pbx.c: Executing [16143364545@from-trunk:22] Goto("SIP/5264232744GW1-00006b69", "app-daynight,0,1") in new stack
[2020-03-05 13:07:01] VERBOSE[3605][C-00003612] pbx_builtins.c: Goto (app-daynight,0,1)
[2020-03-05 13:07:01] VERBOSE[3605][C-00003612] pbx.c: Executing [0@app-daynight:1] GotoIf("SIP/5264232744GW1-00006b69", "0?timeconditions,2,1:timeconditions,1,1") in new stack
[2020-03-05 13:07:01] VERBOSE[3605][C-00003612] pbx_builtins.c: Goto (timeconditions,1,1)
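Since the same warning was logged dozens of times within one second, it can help to collapse the repeats into counts before reading the log. A rough Python sketch follows, assuming the line format shown in the excerpts above; `count_warnings` and the regex are hypothetical helpers of mine, not part of Asterisk.

```python
import re
from collections import Counter

# [timestamp] LEVEL[thread][optional call id] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]+)\[[^\]]*\](?:\[[^\]]*\])? (?P<msg>.*)$"
)

def count_warnings(lines):
    """Count identical WARNING messages per timestamp."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m["level"] == "WARNING":
            counts[(m["ts"], m["msg"])] += 1
    return counts

# Lines copied from the excerpt above; the warning appears four times there.
verbose = ('[2020-03-05 13:06:57] VERBOSE[3605][C-00003612] pbx.c: Executing '
           '[16143364545@from-trunk:21] Wait("SIP/5264232744GW1-00006b69", "4") in new stack')
warning = ("[2020-03-05 13:07:01] WARNING[3605][C-00003612] channel.c: "
           "Exceptionally long queue length queuing to SIP/5264232744GW1-00006b69")
sample_log = [verbose] + [warning] * 4

for (ts, msg), n in count_warnings(sample_log).items():
    print(f"{ts}  x{n}  {msg}")
```

A large count against a single timestamp, right after a `Wait()`, points at frames piling up on that one channel rather than at a general load problem.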

Looking at the utilization around this time, it’s not like FreePBX was being slammed with concurrent calls, heavy CPU load, etc. First time I recall seeing this. Any suggestions?

I think it’s a sleepy thread issue under the hood, with threads just not waking up when they should.

But the issue is complicated by separate “peers” that share the same remote IP.

Maybe limit the qualifies to only one of these peers?

This way you aren’t re-pinging the same IP at the same time for N peers…

There are a total of two SIP trunks, both associated with one SIP provider. Each trunk registers against a gateway 1 DNS name and gateway 2 DNS name. So there is some duplication. Even if I eliminated the gateway 2 DNS name, there still would be a dupe in that both SIP trunks are registering against the same gateway 1 DNS name?

I guess my real question is whether I should change anything in my SIP trunk configurations. This is what SIP.US automatically provisioned in FreePBX through their integrated module. I would think it beneficial to have some form of redundancy, with each trunk first looking for GW1, then looking for GW2, right?

Drop some of the qualifies and other “extra” SIP signaling packets, e.g. MWI polling, so you don’t work the thread so hard?

Move to static IPs so you don’t need to worry about traffic state (as much) in your firewall?

Switch to PJSIP? :wink:

Also, how many phones?

@gregarican - the thing about Chan-SIP is that it can only point at one IP ADDRESS. That means that, even if the name resolves to a dozen addresses, it’s only ever going to use one. PJ-SIP doesn’t have this restriction. Similarly, PJ-SIP can point to two (or considerably more) different IP addresses and use as many of them as you’d care to set up. That’s one of the (many) reasons that PJ-SIP is preferred over Chan-SIP.

On a personal config setup note - I never use FQDNs. I always use IP addresses, if only to avoid the problems that a flaky DNS Server can cause.

Given the above, I’d say “Yes, switch your config to PJ-SIP and point the system at all of the IP addresses associated with your provider’s inbound addresses.”
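To find out which addresses a provider hostname actually resolves to before hard-coding them, something like the following Python sketch could be used. The hostname `gw1.example.com` is a placeholder and port 5060 is an assumption; substitute whatever your provider publishes. It only covers A/AAAA lookups through the system resolver, not SRV records.

```python
import socket

def resolve_all(hostname, port=5060, getaddrinfo=socket.getaddrinfo):
    """Return every unique IP address the name resolves to, in resolver order.

    The getaddrinfo parameter exists only so the lookup can be swapped out
    for testing; by default the system resolver is used.
    """
    infos = getaddrinfo(hostname, port, proto=socket.IPPROTO_UDP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses

# Example (placeholder hostname, needs working DNS):
# print(resolve_all("gw1.example.com"))
```

If the list has more than one entry, those are the addresses a single chan_sip peer would never fail over to on its own.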

Thanks for all of the clarification! I will look to implement these changes soon. In a little over 2 years’ worth of rolling out FreePBX to our site locations, so far this was the only hiccup I’ve seen of this nature. So all in all I’m more than pleased and think that this recommended configuration change can only further solidify things. Appreciate the help!

The FreePBX has an internal, private LAN IP. I don’t expose it as a public service via NAT for inbound access on my firewall.

My firewall is a Cisco ASA. The router is a Cisco 1921, managed by AT&T. This is the only instance I have seen of an issue like this in 2+ years, so I’m not terribly concerned. But I would like to clean things up and be running optimally. Everyone’s suggestions are certainly helpful in that regard!
