Hi all,
I'm looking for a little help understanding something that has recently started happening in my production environment. Sorry for the long post, I just want to give as much detail as possible.
First of all, I have a FreePBX KVM guest inside a large HA Proxmox cluster. The same cluster also hosts a pfSense KVM guest that I use for networking and as the gateway for FreePBX and all the other VMs and workstations.
Upstream of the cluster there is a fast VDSL line (300/30 Mbit/s) connected through a FRITZ!Box router. This is the provider's router and I cannot replace it without losing internet connectivity, so FreePBX is behind double NAT.
- FreePBX and pfSense are both on the latest version.
- I have 4 pjsip trunks to the Italian provider EHIWEB.
- CHAN_SIP is disabled, so only the CHAN_PJSIP driver is available.
- My DNS servers are 127.0.0.1, 188.8.131.52, 184.108.40.206
- I have around 100 pjsip extensions that are all local or through an IPSEC VPN.
- I have a public static IP.
- Since all extensions are local, I have not forwarded any ports to FreePBX.
Everything worked perfectly. I had no issues for 2 years (except some problems related to the VoIP provider), under heavy production load (30 to 35 simultaneous external calls plus all the local traffic).
But in the last month something has been happening that I do not understand: at random, I can no longer ping my VoIP provider's host voip.vivavox.it, so the trunks go offline, and to fix this I have to reboot. When this happens I can still ping any other IP, and from the local LAN I can ping voip.vivavox.it with no problems, so it is related only to FreePBX. Even if it is a little annoying, rebooting FreePBX once in a while is not so bad, but this Friday things went really wrong:
The node hosting FreePBX and pfSense shut down due to a hardware failure. My Proxmox cluster did its job and migrated the FreePBX and pfSense KVM guests from the faulty node to an online node, but obviously this was not a live migration, so both VMs received a hard reset before coming back online.
This is where the problems started again: I was able to ping everything except voip.vivavox.it, and obviously all my trunks were unregistered with the error “no response from voip.vivavox.it”. So I cleanly rebooted pfSense and FreePBX. Ping worked again and the trunks came back online, but I had no two-way audio. I rebooted again, disabled the firewall and disabled fail2ban, but with no success, so production went down. I tried registering to my trunks with a softphone over the same pfSense connection and it worked fine.
Nothing fixed it until, after about 1 hour, I rebooted FreePBX once more and then everything worked fine. I really cannot understand why this happens, in particular why at a certain point I can no longer ping voip.vivavox.it. I called the VoIP provider (just in case) and they told me there was no particular issue at that time.
I think that for some reason my FreePBX locks up the connection to this particular DNS name.
If it helps, I noticed lately that when I run netcat against this domain from my FreePBX I get this:
[email protected]:~# netcat -u -vv voip.vivavox.it 5060
DNS fwd/rev mismatch: voip.vivavox.it != voip.eutelia.it
voip.vivavox.it [220.127.116.11] 5060 (sip) open
or with ping
[[email protected] ~]# ping voip.vivavox.it
PING voip.vivavox.it (18.104.22.168) 56(84) bytes of data.
64 bytes from voip.eutelia.it (22.214.171.124): icmp_seq=1 ttl=55 time=22.3 ms
64 bytes from voip.eutelia.it (126.96.36.199): icmp_seq=2 ttl=55 time=21.1 ms
64 bytes from voip.eutelia.it (188.8.131.52): icmp_seq=3 ttl=55 time=21.2 ms
64 bytes from voip.eutelia.it (184.108.40.206): icmp_seq=4 ttl=55 time=21.1 ms
--- voip.vivavox.it ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3280ms
rtt min/avg/max/mdev = 21.153/21.459/22.319/0.496 ms
I’M PRETTY SURE THAT THIS MISMATCH WASN’T HAPPENING ONE MONTH AGO
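For anyone who wants to reproduce the forward/reverse comparison outside of netcat, here is a minimal sketch in Python. It assumes the warning simply means that the PTR name of the resolved IP differs from the name I asked for. The function name `fwd_rev_matches` and its `resolve`/`reverse` parameters are hypothetical, added only so the check can be run without a live DNS; by default it uses the standard `socket` lookups.

```python
import socket


def normalize(name):
    """Lower-case a DNS name and drop a trailing dot for comparison."""
    return name.rstrip(".").lower()


def fwd_rev_matches(hostname,
                    resolve=socket.gethostbyname,
                    reverse=socket.gethostbyaddr):
    """Return (matches, ip, ptr_name) for a forward then reverse lookup.

    resolve and reverse are injectable (an assumption for testing);
    the defaults query whatever resolver the system is configured with.
    """
    ip = resolve(hostname)            # forward lookup: name -> IPv4 address
    try:
        ptr_name = reverse(ip)[0]     # reverse lookup: address -> primary name
    except (socket.herror, socket.gaierror):
        return False, ip, None        # no PTR record or lookup failure
    return normalize(ptr_name) == normalize(hostname), ip, ptr_name
```

Running `print(fwd_rev_matches("voip.vivavox.it"))` on the FreePBX machine and on a LAN workstation would show whether both see the same IP and the same reverse name. This only tells me whether the names differ; it does not by itself explain the lost pings.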
Any suggestion on what I can try if this happens again?