Internet goes down temporarily and phones lose registration. Phones never come back up


(Kafluke) #1

I’m having an issue where internet will go down at one of my sites. It doesn’t stay down but when it comes back up all the phones on site lose registration with endpoint (FreePBX is in a datacenter, not onsite).

What setting in FreePBX 14 will make the phones register again when the internet comes back up, without me having to reboot them all? I usually just bounce the PoE switch onsite and all the phones come back up. I’d like a solution that doesn’t require me to do this.


(Dave Burgess) #2

I assume you’re using PJ-SIP for your phones.

This sounds familiar:


(Kafluke) #3

FYI, I’m unable to duplicate this in my lab. I have a phone in my lab registered to the same PBX that I’m talking about (the one in the datacenter, not onsite). I took the internet down for 10 minutes (the same amount of time that one of my sites went offline over the weekend), then brought it back up, and the phone reconnected automatically.


#4

You could lower the registration expiration, but as for a parameter that will FORCE the registration automatically after an internet interruption, I don’t think there is something like that.


(Kafluke) #5

Where can I find that setting to lower the phone registration expiration? The site was only down for 10 minutes over the weekend.


#6

That parameter is on the phone. You would need to access the phone’s settings to change it. It is generally called registration expiration or something like that, maybe register expiry.
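
To see why a shorter expiration helps, here is a rough Python sketch of the arithmetic. It assumes the phone refreshes its registration at a fixed fraction of the expiry, which is an assumption for illustration only: the function name and the 0.5 default are hypothetical, and real phones differ, so check the vendor documentation.

```python
def worst_case_reregister_delay(expiry_s: int, refresh_fraction: float = 0.5) -> float:
    """Longest wait, in seconds, before the phone's next scheduled REGISTER.

    Assumption: the phone refreshes at expiry_s * refresh_fraction after its
    last successful registration. The fraction is vendor specific.
    """
    if expiry_s <= 0:
        raise ValueError("expiry_s must be positive")
    if not 0 < refresh_fraction <= 1:
        raise ValueError("refresh_fraction must be in (0, 1]")
    return expiry_s * refresh_fraction


if __name__ == "__main__":
    # Compare a few expiry values a phone might be configured with.
    for expiry in (3600, 600, 120):
        delay = worst_case_reregister_delay(expiry)
        print(f"expiry {expiry:>4} s -> next scheduled REGISTER within {delay:.0f} s")
```

Under that assumption, a one-hour expiry can leave a phone silent for up to half an hour after a short outage, while a two-minute expiry brings the next attempt within about a minute. The trade-off is more REGISTER traffic hitting the PBX.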


(Kafluke) #7

nope. Chan_sip


#8

Yes there is. Access the phone configuration, either through the phone’s web page or the phone display if it has one.


(Peter ) #9

This happened to me a bunch of times too, not from the internet going down, but from my VOIP provider losing registration when they were doing server maintenance. If I was on site I could do a “core reload” through the CLI and it would come back up. But I’m not always on site and needed a simple fix that the staff could follow. This is what I did:

I want to be able to do a “core reload” by dialling extension 301.

I created a Custom Destination with target from-internal-custom,301,1
I created a Misc Application with Custom Destination, and selected the new Custom Destination.

Then in extensions_custom.conf I added:
[from-internal-custom]

; Dialling 301 runs a core reload on the PBX
exten => 301,1,NoOp()
same => n,System(asterisk -x "core reload")
same => n,Hangup()

(In fact, on my production system I had to use 'core reload' rather than "core reload". I have no idea why or what difference that makes, I just know double quotes didn’t work.)

And this morning it had a real life run (I still don’t know why it went down this time) and the staff fixed the problem by dialling 301. They were nearly as happy as I was.


(Kafluke) #10

Thanks, I’ll give this a try if my next idea doesn’t work. I don’t really want to have to do anything, though. Dialing an extension to do a core reload is just as easy as me bouncing the site switch. Before we moved all phone traffic to the VPN tunnels we never had this problem: once the internet was restored, the phones would automatically come back online. That doesn’t happen since moving to the VPN tunnels, and I can’t duplicate the issue in my lab, so I’m thinking it has something to do with that. When we were running all the phone traffic over the general internet we switched provisioning to HTTP instead of TFTP. Now that we’ve moved back to internal tunnel traffic I’m going to try changing back to TFTP and see if that makes a difference.


#11

You never mentioned a VPN in the OP. What are you using the VPN for? If the VPN is used to connect the phones to the FreePBX server over the internet and the internet connection goes down, then the VPN session is terminated and you lose registration.


(Dave Burgess) #12

This is what should happen. Since this isn’t the experience of the OP, there’s something missing out of the description.

The phones should fail and retry once they’ve lost connection with the server. From my experience, there are only a few things that will make that not work:

  1. The phones lock themselves out of the network by failing to connect to the server and the firewall gets involved.
  2. The VPN isn’t set up to actually collapse when the Internet connection goes away.
  3. The Internet connection isn’t what’s failing and we are looking at the wrong symptom.

The solution to 1 is to review all of the firewall logs (including the Integrated Firewall logs in FreePBX). If that’s what’s killing your connections, then changing the firewall to allow those guys to fail more would be the right fix.

The solution to 2 is more complicated. If the VPN stays up but the connection fails, the phones should recognize they have lost the ball on the connection and try to restart. These are phone settings that need to be reviewed to determine how often they try before quitting, etc. Some devices will retry a set number of times and then give up. Normally, resetting the phone re-establishes this connection. If the phone is not recognizing that it is no longer connected, you need to deep-dive into some SIP debugs to figure out why the phone isn’t recognizing the connection is gone, and worse yet, why restarting the PBX makes the phone wake up. Note that this could also be a problem with the phone simply not maintaining the registration in a way the PBX recognizes, which is yet another set of phone (and extension) settings you need to make sure are correct.
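
As a rough illustration of the "retry a set number of times and then give up" case, here is a small Python sketch. The function name, the fixed retry interval, and the retry cap are all hypothetical; real phones use vendor-specific timers, often with back-off.

```python
from typing import Optional


def first_successful_retry(outage_s: float,
                           retry_interval_s: float,
                           max_retries: Optional[int]) -> Optional[float]:
    """Seconds after losing the server at which a retry first succeeds.

    Returns None if the phone runs out of retries before the link is back.
    Assumption: retries fire at a fixed interval; max_retries=None means
    the phone never stops retrying.
    """
    if retry_interval_s <= 0:
        raise ValueError("retry_interval_s must be positive")
    attempt = 1
    while max_retries is None or attempt <= max_retries:
        t = attempt * retry_interval_s
        if t >= outage_s:  # the link is back by the time this attempt fires
            return t
        attempt += 1
    return None  # the phone gave up while the link was still down


if __name__ == "__main__":
    # A 10-minute outage, as described earlier in the thread.
    print(first_successful_retry(600, 30, 10))    # gives up after 300 s
    print(first_successful_retry(600, 30, None))  # keeps trying until the link returns
```

With a cap of ten retries at 30 seconds each, the phone stops trying five minutes into a ten-minute outage and stays down until it is rebooted. A phone with no cap recovers on its own.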

The third possibility is that we are actually looking at something like the TLS issue that we’ve been talking about, where the encryption session restarts and the phone isn’t recognizing it. That is far more an infrastructure issue than a FreePBX one, though, so the recommendations we might offer are just suggestions based on guess-work from not fully understanding your setup.


(Kafluke) #13

I resolved the issue by changing over to TFTP provisioning instead of HTTP. Thanks for everyone’s help!


#14

Really strange that provisioning over one protocol or the other would cause issues with VPN reconnection, but I can’t say it is impossible.


(Kafluke) #15

So as it turns out, this was not fixed by changing the provisioning protocol. Since we only started noticing this behavior after we moved all traffic to the IPsec tunnel, we researched it. We learned that when the tunnel was down (the internet was lost momentarily, so the tunnel dropped), our firewalls kept the SIP sessions active over a different route. When the tunnel came back up, all the SIP traffic was still trying to register over the incorrect route. Here’s the fix:

On the firewall itself, create a new static route with a higher weight than the tunnel route that dumps all SIP traffic (on the FortiGate, the interface that does this is called “black hole”). When the tunnel goes down, all SIP sessions are terminated. When the tunnel comes back up, its route has a lower weight than the black hole route, so all SIP sessions are directed over the tunnel and the phones automatically come online.
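
For anyone trying to picture why this works, here is a toy Python model of route selection. It is not FortiGate configuration. The addresses, distances, and names are hypothetical, and it assumes the usual rule that the most specific active route wins, with the lower weight (distance) breaking ties.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional

PBX_IP = "10.20.0.10"       # hypothetical PBX address in the datacenter
PBX_NET = "10.20.0.0/24"    # hypothetical subnet reached through the tunnel


@dataclass(frozen=True)
class Route:
    dest: str       # destination prefix
    distance: int   # lower value is preferred
    action: str     # "tunnel", "wan" or "blackhole"
    up: bool = True


def select_route(routes: List[Route], dst_ip: str) -> Optional[Route]:
    """Pick the active route: longest prefix first, then lowest distance."""
    addr = ip_address(dst_ip)
    candidates = [r for r in routes if r.up and addr in ip_network(r.dest)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda r: (-ip_network(r.dest).prefixlen, r.distance))


def build_table(tunnel_up: bool, with_blackhole: bool) -> List[Route]:
    """Routing table for the site firewall in this toy model."""
    table = [
        Route(PBX_NET, 10, "tunnel", up=tunnel_up),
        Route("0.0.0.0/0", 10, "wan"),  # default route to the internet
    ]
    if with_blackhole:
        # Same destination as the tunnel, but a worse (higher) distance.
        table.append(Route(PBX_NET, 254, "blackhole"))
    return table


if __name__ == "__main__":
    for tunnel_up in (True, False):
        for with_blackhole in (False, True):
            chosen = select_route(build_table(tunnel_up, with_blackhole), PBX_IP)
            print(f"tunnel_up={tunnel_up!s:5} blackhole={with_blackhole!s:5}"
                  f" -> {chosen.action}")
```

In this model, without the black hole route, traffic for the PBX falls through to the default route while the tunnel is down, which matches the wrong-route behavior described above. With it, the traffic is dropped until the tunnel route returns and takes over again.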

Who’d have thunk?

