Some peers become unreachable for 10-15 seconds, 2-3 times per hour

Hello, sorry for my English; it is not my native language.
Here is my problem:
I have FreePBX 14.0.3.6 with Asterisk 14.7.5, running on a virtual machine hosted on Proxmox Virtual Environment. On Proxmox VE I have also configured Open vSwitch to pass tagged traffic to the virtual machines. There are 40 Yealink T19 E2 IP phones. The server and the phones are on the same local network, and there is no NAT between them.
Sometimes peers become Unreachable, and after 10-15 seconds they are reachable again. Sometimes they become Lagged instead.
I can see this in the Asterisk CLI:

[2018-06-04 05:24:46] NOTICE[3032]: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Lagged. (2008ms / 2000ms)
[2018-06-04 05:24:56] NOTICE[3032]: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Reachable. (12ms / 2000ms)
[2018-06-04 05:34:00] NOTICE[3032]: chan_sip.c:30172 sip_poke_noanswer: Peer '2000' is now UNREACHABLE! Last qualify: 8
[2018-06-04 05:34:10] NOTICE[3032]: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Reachable. (8ms / 2000ms)
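
To see how long each episode lasts and which peers are affected, the notices above can be paired up mechanically. The following is only a minimal sketch, assuming the log format is exactly as quoted in this post; the function name `qualify_episodes` is made up for illustration and is not part of Asterisk or FreePBX.

```python
import re
from datetime import datetime

# Matches chan_sip qualify notices in the format quoted above.
# Accepts straight or typographic quotes around the peer name.
NOTICE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\].*?"
    r"Peer [‘'’](?P<peer>[^‘'’]+)[‘'’] is now "
    r"(?P<state>Lagged|Reachable|UNREACHABLE)"
)

def qualify_episodes(lines):
    """Return (peer, bad_state, start, seconds) per Lagged/UNREACHABLE episode."""
    open_since = {}  # peer -> (state, timestamp) at which the episode started
    episodes = []
    for line in lines:
        m = NOTICE.search(line)
        if not m:
            continue  # not a qualify notice
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        peer, state = m["peer"], m["state"]
        if state == "Reachable":
            if peer in open_since:
                bad_state, start = open_since.pop(peer)
                seconds = (ts - start).total_seconds()
                episodes.append((peer, bad_state, start, seconds))
        else:
            # Keep the first bad state if several arrive before recovery
            open_since.setdefault(peer, (state, ts))
    return episodes
```

Fed the four lines above, it reports two episodes for peer 2000, one Lagged and one UNREACHABLE, each lasting 10 seconds.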

When a peer is Unreachable I can ping it normally, without any loss.
Any advice, please?
Thank you.

Is it a 10-15 second pattern? Is some task running on Linux at the same time? (Check the logs.)

I’m running on Proxmox VE as well. I sometimes see “Peer xxx is now {UNREACHABLE,Reachable}” for Wi-Fi extensions, but never “Peer xxx is now Lagged.”

Are you giving your FreePBX VM enough resources? Is the CPU powerful enough? Is there enough RAM?

Does your PBX have a static IP?

That’s a little vague. Please confirm that once-per-second pings to the peer, running on the FreePBX VM, started at least 30 seconds before the peer became unreachable and continued until after unreachable is reported, show no packet loss.
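
One way to check such a ping run for loss is to look for gaps in the `icmp_seq` numbers. This is a rough sketch that assumes the usual Linux `ping` reply format (`64 bytes from ...: icmp_seq=N ...`); the helper name `missing_sequences` is hypothetical.

```python
import re

SEQ = re.compile(r"icmp_seq=(\d+)")

def missing_sequences(ping_output_lines):
    """Return the icmp_seq numbers that never got an echo reply."""
    seen = set()
    for line in ping_output_lines:
        m = SEQ.search(line)
        # Count only real echo replies; error lines can also carry icmp_seq
        if m and "bytes from" in line:
            seen.add(int(m.group(1)))
    if not seen:
        return []
    # Any number missing between the first and last reply is a lost ping
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]
```

An empty list for a run that spans the moment Asterisk reports UNREACHABLE would support the claim that ICMP keeps working while the OPTIONS request goes unanswered.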

If so, run tcpdump on the VM and confirm that there is indeed no reply to the OPTIONS request sent.

If so, then check at the phone, e.g. using port mirroring/monitoring on the switch to which it is connected.

Unreachable means only one thing: the PBX is not getting a 200 OK back from the endpoint when it sends a qualify OPTIONS packet. Lagged is the same, except that the 200 OK is very slow coming back. If it’s happening to all peers at once, I would suspect network issues.
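
As a rough illustration of that distinction (not Asterisk's actual code), the three states can be modelled as a function of the OPTIONS round-trip time. The 2000 ms threshold is taken from the "/ 2000ms" shown in the log lines earlier in the thread; the function name `qualify_state` is made up.

```python
from typing import Optional

def qualify_state(rtt_ms: Optional[float], max_ms: float = 2000.0) -> str:
    """Classify one qualify attempt; rtt_ms is None when no 200 OK came back."""
    if rtt_ms is None:
        return "UNREACHABLE"  # no reply to the OPTIONS request at all
    if rtt_ms > max_ms:
        return "Lagged"       # a reply arrived, but slower than the threshold
    return "Reachable"
```

With the values from the logs, `qualify_state(2008)` gives "Lagged" and `qualify_state(12)` gives "Reachable".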

Yes. The FreePBX VM has 2 network interfaces: the first for the internet (trunks to the provider) and the second for the local network. Both interfaces have static IPs.

No, it happens to only some peers at a time, not to all peers at once. But over a period of 4-5 hours, every peer becomes unreachable at least once.

I mirrored the Proxmox VE interface on my switch and captured the traffic with Wireshark. I saw that FreePBX sends the qualify request to the peer:

26 0.704386 10.3.24.130 10.3.24.147 SIP 585 Request: OPTIONS sip:[email protected]:5060 |
27 0.704894 10.3.24.147 10.3.24.130 ICMP 590 Destination unreachable (Port unreachable)

Later I see:

51 1.704268 10.3.24.130 10.3.24.147 SIP 585 Request: OPTIONS sip:[email protected]:5060 |
52 1.712696 10.3.24.147 10.3.24.130 SIP 366 Status: 200 OK |
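
For a longer capture, the same comparison can be done for every OPTIONS request at once. The sketch below assumes the rows are exported in the column order shown above (number, time, source, destination, protocol, length, info); `summarize_options` is a hypothetical helper, not a Wireshark feature.

```python
def summarize_options(rows):
    """For each OPTIONS request, report the first packet seen in the reverse direction.

    Returns a list of (request_time, reply_protocol, delay_ms) tuples;
    reply_protocol and delay_ms are None when no reply follows in the capture.
    """
    packets = []
    for row in rows:
        # Split into 7 fields; the last one keeps the free-form info text
        _no, time, src, dst, proto, _length, info = row.split(None, 6)
        packets.append((float(time), src, dst, proto, info))

    results = []
    for i, (t, src, dst, proto, info) in enumerate(packets):
        if proto != "SIP" or "OPTIONS" not in info:
            continue
        reply = next(
            ((t2, p2) for t2, s2, d2, p2, _i2 in packets[i + 1:]
             if s2 == dst and d2 == src),
            None,
        )
        if reply is None:
            results.append((t, None, None))
        else:
            results.append((t, reply[1], round((reply[0] - t) * 1000, 3)))
    return results
```

On the four rows quoted above, the first request is answered by ICMP after about 0.5 ms and the second by SIP after about 8.4 ms. It only pairs packets by direction, so in a busy capture the match is approximate.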

Yes, my FreePBX VM has at least 2 CPU cores @ 2.8 GHz and 4 GB of RAM. But I suspect my HDD may lack performance. Can that affect VM networking?

An active rejection (especially Port unreachable) is most surprising. If this actually came from the phone (look at the source MAC address to check), that would imply that its SIP stack is crashing and restarting, or possibly that other traffic exhausted its ‘listen’ count. Is firmware on these devices up-to-date? Are there any other makes/models of extensions on this system that become unreachable (or not)? The phone itself has a packet capture capability that may prove useful.

If the ICMP came from somewhere else, that should be a clue as to where the trouble lies, e.g. IP address conflict, addresses being reassigned, etc.

OK. I re-read your post, just to find that we have the same exact issue.

We have FreePBX 14 with Asterisk 15 and two NICs. At first we had this issue, and we solved it by following the information in the post.

We have Sangoma and Akuvox phones. The Sangoma phones are working just fine, but the Akuvox phones become unreachable for 10 seconds.

I hope someone can point us here in the right direction…

Thanks for the answer. Am I right that you have the same issue?

Today I changed the settings on the NIC that is used for internet access: I removed the default gateway from its config. Peers now become Unreachable much less often, subjectively about 60% less.
Now I’m waiting for the end of the workday to update the firmware on my phones and reconfigure the NIC for the local network. I will set the default gateway there.

Thanks for the answer. I ran Wireshark to capture traffic from the phone. Before that, I updated the phone firmware to the current version. So far I haven’t seen any Unreachable or Lagged state on this phone. I will keep capturing until tomorrow.

I assume so…

I’m not saying that this will help.

I just hope someone with more knowledge and experience will respond here…

Looks like the issue is solved! I reconfigured my NICs: I removed the default gateway from the NIC for the internet connection and added static routes instead, and I set the default gateway on the NIC for the local network. The phones have not become unreachable for about 8 hours now.
Thanks to everyone who spent their time answering in this topic. How can I mark this thread “Solved”?
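
To picture the routing layout described in that fix, here is a small longest-prefix-match sketch. Everything concrete in it is an assumption for illustration: the interface names `eth0`/`eth1`, the /24 mask on the phone network, and the provider network `203.0.113.0/24` (a documentation range) do not come from the thread. It shows which interface each destination would use under such a table; it does not explain why the change stopped the qualify failures.

```python
import ipaddress

def pick_route(routes, destination):
    """Longest-prefix match: return the (network, interface) entry for destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, dev) for net, dev in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical routing table after the change described above
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth1"),       # default gateway on the LAN NIC
    (ipaddress.ip_network("10.3.24.0/24"), "eth1"),    # phone network (mask assumed)
    (ipaddress.ip_network("203.0.113.0/24"), "eth0"),  # static route to the provider (made up)
]
```

With this table, `pick_route(routes, "10.3.24.147")` selects the `eth1` entry, and only traffic to the statically routed provider network leaves through `eth0`.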
