Hello, sorry for my English; it is not my native language.
Here is my problem:
I have FreePBX version 184.108.40.206 and Asterisk version 14.7.5 running on a virtual machine hosted on Proxmox Virtual Environment. On Proxmox VE I have also configured Open vSwitch to pass tagged traffic to the virtual machines. There are 40 Yealink T19 E2 IP phones. The server and the phones are on the same local network; there is no NAT between the server and the phones.
Sometimes peers become Unreachable and, after 10-15 seconds, they are Reachable again. Sometimes they become Lagged.
I can see this in the Asterisk CLI:
[2018-06-04 05:24:46] NOTICE: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Lagged. (2008ms / 2000ms)
[2018-06-04 05:24:56] NOTICE: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Reachable. (12ms / 2000ms)
[2018-06-04 05:34:00] NOTICE: chan_sip.c:30172 sip_poke_noanswer: Peer '2000' is now UNREACHABLE! Last qualify: 8
[2018-06-04 05:34:10] NOTICE: chan_sip.c:24654 handle_response_peerpoke: Peer '2000' is now Reachable. (8ms / 2000ms)
While a peer is Unreachable, I can ping it normally, without any packet loss.
Any advice, please?
That’s a little vague. Please confirm that once-per-second pings to the peer, running on the FreePBX VM, started at least 30 seconds before the peer became unreachable and continued until after unreachable is reported, show no packet loss.
If so, run tcpdump on the VM and confirm that there is indeed no reply to the OPTIONS request sent.
If so, then check at the phone, e.g. using port mirroring/monitoring on the switch to which it is connected.
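The once-per-second ping check from the first step could be scripted roughly like this, so that every lost probe gets a timestamp you can line up with the Asterisk log. This is only a sketch: it assumes a Linux VM with the iputils `ping` (`-c` for count, `-W` for the reply timeout in seconds), and the function names and the phone address in the usage note are made up for illustration.

```python
import subprocess
import time
from datetime import datetime
from typing import Sequence


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the system ping; True if a reply arrived."""
    # Assumes Linux iputils ping: -c 1 = one probe, -W = reply timeout (s).
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def loss_percent(results: Sequence[bool]) -> float:
    """Percentage of probes that got no reply."""
    if not results:
        return 0.0
    return 100.0 * list(results).count(False) / len(results)


def monitor(host: str, interval_s: float = 1.0) -> None:
    """Ping about once per interval until Ctrl-C, logging each lost probe."""
    results = []
    try:
        while True:
            started = time.monotonic()
            ok = ping_once(host)
            results.append(ok)
            if not ok:
                print(f"{datetime.now():%Y-%m-%d %H:%M:%S} no reply from {host}")
            # Sleep only for what is left of the interval, so a timed-out
            # probe does not stretch the spacing between probes.
            time.sleep(max(0.0, interval_s - (time.monotonic() - started)))
    except KeyboardInterrupt:
        print(f"{len(results)} probes, {loss_percent(results):.1f}% loss")
```

Run it on the FreePBX VM with something like `monitor("192.0.2.50")` (a placeholder address), leave it running until the peer is reported UNREACHABLE, then stop it with Ctrl-C and compare the timestamps.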
Unreachable means only one thing: the PBX is not getting a 200 OK back from the endpoint when it sends a qualify OPTIONS packet. Lagged is the same, except that the 200 OK does come back but takes longer than the qualify limit (2000 ms in your log). If it's happening to all peers at once, I would suspect network issues.
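To see whether several peers drop at the same moment, one option is to pull the peer status lines out of the Asterisk log and group the Lagged/UNREACHABLE events that fall close together in time. The sketch below assumes log lines shaped like the ones posted above; the function names and the 5-second window are arbitrary choices for illustration, not anything Asterisk defines.

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "[2018-06-04 05:34:00] NOTICE: ... Peer '2000' is now UNREACHABLE!"
# Accepts straight or typographic quotes around the peer name.
PEER_STATE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\].*"
    r"Peer ['‘’](?P<peer>[^'‘’]+)['‘’] is now (?P<state>\w+)"
)


def parse_events(lines):
    """Return (timestamp, peer, state) tuples; state is lower-cased."""
    events = []
    for line in lines:
        m = PEER_STATE.search(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            events.append((ts, m["peer"], m["state"].lower()))
    return events


def outage_clusters(events, window_s=5):
    """Group non-reachable events whose timestamps are within window_s of
    the previous one; each cluster records the set of peers involved."""
    bad = sorted(e for e in events if e[2] != "reachable")
    clusters = []
    for ts, peer, _state in bad:
        if clusters and ts - clusters[-1]["end"] <= timedelta(seconds=window_s):
            clusters[-1]["end"] = ts
            clusters[-1]["peers"].add(peer)
        else:
            clusters.append({"start": ts, "end": ts, "peers": {peer}})
    return clusters
```

Feed it the lines of the Asterisk log file and look at the `peers` set of each cluster: clusters that contain many peers point towards the network or the server, while clusters with a single peer point towards that phone.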
An active rejection (especially Port unreachable) is most surprising. If this actually came from the phone (look at the source MAC address to check), that would imply that its SIP stack is crashing and restarting, or possibly that other traffic exhausted its ‘listen’ count. Is firmware on these devices up-to-date? Are there any other makes/models of extensions on this system that become unreachable (or not)? The phone itself has a packet capture capability that may prove useful.
If the ICMP came from somewhere else, that should be a clue as to where the trouble lies, e.g. IP address conflict, addresses being reassigned, etc.
Thanks for the answer. Am I right that you have the same issue?
Today I changed the settings on the NIC that I use for Internet access: I removed the default gateway from its configuration. Peers now become Unreachable much less often, subjectively about 60% less.
Now I'm waiting for the end of the workday to update the firmware on my phones and reconfigure the NIC for the local network. I will set the default gateway there.
Thanks for the answer. I updated the phone firmware to the latest version and then ran Wireshark to capture traffic from the phone, but I haven't seen this phone go Unreachable or Lagged yet. I will keep capturing until tomorrow.
Looks like the issue is solved! I reconfigured my NICs: I deleted the default gateway from the NIC for the Internet connection and added static routes instead, and I set the default gateway on the NIC for the local network. The phones have not become unreachable for at least 8 hours.
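For anyone who lands here with a similar two-NIC setup: a quick way to check whether the server has ended up with more than one default gateway is to list the default routes. The sketch below assumes a Linux host with the iproute2 `ip` command; the function names are invented for this example, and it only reports what it finds without changing any routes.

```python
import subprocess


def default_routes(ip_route_output: str):
    """Return (gateway, device) for every default route found in the
    text output of `ip route show`."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        # The gateway follows "via" and the interface follows "dev".
        gateway = fields[fields.index("via") + 1] if "via" in fields else None
        device = fields[fields.index("dev") + 1] if "dev" in fields else None
        routes.append((gateway, device))
    return routes


def current_default_routes():
    """Run `ip -4 route show` on this host and parse its default routes."""
    out = subprocess.run(
        ["ip", "-4", "route", "show"],
        capture_output=True, text=True, check=True,
    ).stdout
    return default_routes(out)
```

If `current_default_routes()` returns more than one entry, the server has a default gateway on more than one interface, which is the situation I had before the change.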
Thanks to everyone who spent their time answering in this topic. How can I mark this thread as “Solved”?