Phone IP Addresses Stop Working After Power Failure

I have 5 extensions in an office, the only 5 operating on this phone server. The server sits a number of hops away in a colocation center, but through tunnels, the phones have direct, non-NATted, unfirewalled access to the phone server.

The phones each received their IP via DHCP in the office, along with the basic stuff needed for provisioning. All boilerplate, nothing fancy.

The system worked fine for a few months. Then we started having trouble with the local power company. What happens is that an intermediate router gets rebooted, but not the phones or the phone server. Here’s the weird part…

Once this happens, the phones lose audio. They seem to provision fine upon reboot, and have a dial tone, but when I dial on them, nothing can be heard, not even ringing.

Now here’s the really bad part. Nothing restores service on the affected IP address… Not rebooting the phones, not rebooting the FreePBX server, not rebooting the routers.

If I move the phone to a different IP address, it works again.

If I take any phone and move it to the failed IP address, it has the same problem, regardless of what phone/extension it is.

This is becoming a problem because I only have about 100 IP addresses left in the subnet to rotate the phones onto. Hopefully the power problems will be fixed before that happens, but there’s no guarantee.

NAT Mode is set to never - no RFC3581.

FreePBX 2.11.0.36
Asterisk 11.5.1

Thoughts???

If anyone wants to see relevant configuration, let me know what you think is relevant and I’ll post it. Honestly I’m not even sure where to begin.


  1. does restarting your local dhcp server fix the problem?
  2. does restarting your local router fix the problem?
  3. if you wait long enough, does the problem go away?
  • Restarting the local DHCP Server does not fix the problem.
  • Restarting the local Router does not fix the problem.
  • So far as I know, the problem is permanent. I can go there later this week and remap one of the old IP addresses back to the phones and confirm.

Are your phones being banned/blocked on the FreePBX box? If you look at the phones that are not working, do they still show as registered? Does the phone system still see them as registered? From what you’re describing it sounds like fail2ban or some other process is deciding to permanently block the phone’s IP (maybe it’s sending some mangled packets or something odd when the interruptions happen, who knows). If you restart fail2ban (or iptables) on the FreePBX system my guess is things will be fine for all IP addresses…
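One way to check for a ban without restarting anything is to dump the firewall rules with `iptables -S` and look for the phone's IP in them. Below is a minimal sketch in Python, assuming the rules are available as text. The function name `rules_matching_ip` and the `SAMPLE` rules (including the `fail2ban-SIP` chain name) are made up for illustration; the real chain names on your box may differ.

```python
import ipaddress

# Hypothetical `iptables -S` output, for illustration only.
SAMPLE = """\
-P INPUT ACCEPT
-N fail2ban-SIP
-A INPUT -j fail2ban-SIP
-A fail2ban-SIP -s 192.168.5.101/32 -j REJECT --reject-with icmp-port-unreachable
-A fail2ban-SIP -j RETURN
"""


def rules_matching_ip(rules_text, ip):
    """Return the rule lines whose -s/-d address or network contains `ip`."""
    target = ipaddress.ip_address(ip)
    hits = []
    for line in rules_text.splitlines():
        tokens = line.split()
        # Walk adjacent token pairs and look for "-s <net>" or "-d <net>".
        for flag, value in zip(tokens, tokens[1:]):
            if flag not in ("-s", "-d"):
                continue
            try:
                network = ipaddress.ip_network(value, strict=False)
            except ValueError:
                continue  # value is not an IP address or network
            if target in network:
                hits.append(line)
                break
    return hits


if __name__ == "__main__":
    for rule in rules_matching_ip(SAMPLE, "192.168.5.101"):
        print(rule)
```

On the server itself you would feed it the live rules, for example `subprocess.check_output(["iptables", "-S"], text=True)` run as root. An empty result for a failing phone's IP would suggest the local firewall is not what is blocking it. Note the sketch does not treat negated matches (`! -s ...`) specially.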

Rebooting the phone server (which effectively restarts fail2ban and iptables) does not resolve the problem.

The phones register fine, and can call each other without a problem. They just lose the ability to dial out successfully. They connect but there is no audio either way.

It has happened again. Here is some additional information.

  • Phones are all SNOM 821 handsets.

  • Asterisk version: 11.5.1

  • FreePBX framework version: 2.11.0.37

  • The FreePBX Server can ping the phones without a problem.

  • Rebooting the FreePBX server does NOT fix the problem.

  • Rebooting the phones does NOT fix the problem.

  • When phones are having problems, they can’t make or receive any calls.

  • When phones are having problems they fail to provision.

  • When phones are having problems, they show in Asterisk as:

  • Rebooting all intermediate equipment causes the problem, but does not fix it.

  • I can’t find anything wrong with the data path. Ping works fine, there are no MTU problems, no packet loss, low latency.

  • When non-working phone is UP:

localhost*CLI> sip show peers
Name/username Host Dyn Forcerport ACL Port Status Description
777/777 192.168.5.101 D A 3072 OK (57 ms)

  • When non-working phone is REBOOTED:

localhost*CLI> sip show peers
Name/username Host Dyn Forcerport ACL Port Status Description
777/777 192.168.5.101 D A 3072 UNREACHABLE

  • When non-working phone comes back UP:

localhost*CLI> sip show peers
Name/username Host Dyn Forcerport ACL Port Status Description
777/777 192.168.5.101 D A 3072 OK (62 ms)

  • When non-working phone’s IP is changed and it is rebooted (And it now provisions and works correctly):

localhost*CLI> sip show peers
Name/username Host Dyn Forcerport ACL Port Status Description
777/777 192.168.5.102 D A 3072 OK (62 ms)

  • Returning the phone to its original address causes the failure to reoccur. After a period of time (days?) I can then reuse the address. The problem is lasting, but not permanent!

  • Here’s perhaps the most interesting part. If I then return the phone’s IP to the original and reboot it, the phone fails to provision. Asterisk then shows this:

localhost*CLI> sip show peers
Name/username Host Dyn Forcerport ACL Port Status Description
777/777 192.168.5.102 D A 3072 UNREACHABLE

The phone is still pingable from Asterisk! But the phone obviously is having problems making the connection. I tried resetting the phone to defaults, etc. No dice.
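Since ping only exercises ICMP while SIP signaling here runs over UDP, a ping that works does not prove SIP packets get through. Below is a minimal sketch in Python of a probe that sends a SIP OPTIONS request over UDP and waits for any reply. The function names `build_options` and `probe` and the `probe` user in the From header are made up for illustration. It only tests SIP signaling, not RTP audio or provisioning traffic.

```python
import socket
import uuid


def build_options(target_ip, target_port, local_ip, local_port):
    """Build a bare-bones SIP OPTIONS request as bytes."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # magic-cookie prefix per RFC 3261
    tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex
    lines = [
        f"OPTIONS sip:{target_ip}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_ip}>;tag={tag}",
        f"To: <sip:{target_ip}:{target_port}>",
        f"Call-ID: {call_id}@{local_ip}",
        "CSeq: 1 OPTIONS",
        f"Contact: <sip:probe@{local_ip}:{local_port}>",
        "Content-Length: 0",
        "",
        "",  # the two empty entries yield the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(target_ip, target_port, timeout=3.0):
    """Send one OPTIONS request; return the reply's status line, or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on UDP sends nothing; it just fixes the peer and lets us
        # read back the local address the kernel chose for this route.
        sock.connect((target_ip, target_port))
        local_ip, local_port = sock.getsockname()
        sock.send(build_options(target_ip, target_port, local_ip, local_port))
        try:
            reply = sock.recv(4096)
        except OSError:  # timeout, or ICMP port unreachable came back
            return None
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")


if __name__ == "__main__":
    # 192.168.5.101:3072 is the phone's address and port from the output above.
    print(probe("192.168.5.101", 3072))
```

Running it from the server toward the phone, and from a machine in the office toward the server's SIP port (5060 is assumed here as the Asterisk default, it is not stated above), once from a working IP and once from a failed one, might show in which direction UDP stops getting through. Any SIP reply, even an error status, means the packet made the round trip.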