I have tried to narrow this down, and so far everything points to FreePBX. First, here’s the network setup: ONT (bridge mode) → Unifi Cloud Gateway Ultra (no SIP ALG) → Netgear JG5516PE managed PoE switch (no VLANs). FreePBX/Asterisk is the latest release, installed on a small Dell micro PC, and the Fanvil X5U desk phones are all connected to the same switch. The provider is GoTrunk.
The issue: during calls, audio drops out one way (from our end to the external party). Everything will be fine for 30 seconds to 1 minute, and suddenly the customer on the phone goes “Hello? HELLO!? Oh, I think we got disconnected. Hello???” We hear them saying it, and we keep repeating “Yeah! Yeah! I’m here! Can you hear me??” But they can’t hear us. After about 5 seconds, we come back. Rinse and repeat every 30 seconds to a minute. We cannot for the life of us figure out what the heck is going on here.
We had our SIP provider check on their end, and they don’t see anything out of the ordinary. This is only happening with external calls, not internal ones. Internal calls work perfectly fine.
Please provide the Asterisk full log with RTP debugging enabled and verbosity of at least three, covering a dropout and several frames either side of it.
If you can easily cause the problem at will, just capture traffic on the PBX with tcpdump, move the file to your workstation and examine it with Wireshark. At the time of the dropout, see whether RTP packets are being received from the extension and being sent back out to the trunk. Wireshark’s ability to play the RTP (if not encrypted) may be useful.
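One way to do the “is RTP arriving and going back out” check without reading every packet is to count RTP packets per second in each direction. This is only a rough sketch: it assumes tshark (Wireshark’s command-line tool) is installed on the workstation, `dropout.pcap` is a placeholder file name, and `rtp_per_second` is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: count RTP packets per whole second for each
# source -> destination pair.
# Input on stdin: tab-separated "relative_time  src_ip  dst_ip", for example from
#   tshark -r dropout.pcap -Y rtp -T fields -e frame.time_relative -e ip.src -e ip.dst
# (tshark only labels packets as RTP if it saw the SIP setup or RTP heuristics are on.)
rtp_per_second() {
  tab=$(printf '\t')
  awk -F'\t' '
    { n[int($1) "\t" $2 "\t" $3]++ }            # key: second, source, destination
    END { for (k in n) print k "\t" n[k] }
  ' | sort -t "$tab" -k1,1n -k2,2 -k3,3         # order by second, then by addresses
}
```

A stream with 20 ms packetization should show about 50 packets for every second in each direction. Seconds with no packets simply do not appear in the output, so a dropout shows up as missing or very low rows for the PBX-to-trunk pair while the extension-to-PBX pair keeps counting.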
If it occurs only occasionally:
As root, run a command like this to capture all SIP and RTP: tcpdump -s 0 -C 100 -W 100 -w rbuf -Z root &
This writes a ring buffer of 100 files of 100MB each (rbuf00, rbuf01, …, rbuf99). (Be sure you have 10 GB available disk space.)
When a dropout is reported, before the buffer gets overwritten, find the file with the bad call (look at times last modified), copy it to your PC and analyze with Wireshark. (Also save the previous one, in case the call overlapped two files.)
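Picking the right file by last-modified time can be scripted. The sketch below assumes GNU find (for `-newermt` and `-printf`), which is what a typical Linux PBX ships with; `capture_covering` is a hypothetical helper name, and the directory and timestamp in the usage line are placeholders.

```shell
# Hypothetical helper: print the ring-buffer file that was still being written
# at a given moment, i.e. the oldest rbuf* file last modified after that moment.
capture_covering() {
  dir=$1     # directory holding rbuf00 ... rbuf99
  when=$2    # local time of the dropout, e.g. "2024-05-01 14:32:10"
  find "$dir" -maxdepth 1 -type f -name 'rbuf*' -newermt "$when" \
       -printf '%T@ %f\n' |      # "mtime_in_seconds file_name"
    sort -n | head -n 1 | cut -d' ' -f2
}

# Usage (placeholder values):
#   capture_covering /root "2024-05-01 14:32:10"
```

The file numbered just before the one printed is the “previous one” worth saving as well, in case the call spans two files.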
If good RTP is being sent to the trunk, you can capture on the WAN side of the Ubiquiti to confirm that it is passing traffic correctly.
Also useful to know:
Are both inbound and outbound calls affected? If there are multiple calls in progress at the time when a dropout occurs, are they all affected at the same time?
This is the longest yet most educational response I have ever seen!
I will run a packet capture and see what happens (the system has a 1 TB NVMe SSD, so more than enough space).
Lastly, yes: both inbound and outbound calls are affected. As long as it’s an external call, it’s affected 100% of the time, on each and every call. On multiple calls, yes. The dropout will happen, one way, at the same time on multiple calls, and our outgoing audio picks back up at the same time on all of them. I wonder if that helps narrow down what could be wrong.
Strange. It seems unlikely that FreePBX would be failing to relay RTP; in that case inbound audio and/or internal calls would also be affected.
Also seems unlikely that your internet connection is intermittently blocking all outbound traffic; surely you would notice 5-second delays fetching a web page, etc.
If the GoTrunk media servers don’t block ping, try running a continuous ping to one of them (get the address from pjsip logger or sngrep) and see whether packets are lost when the trouble occurs. If they do block pings, try pinging one of their SIP servers instead (amn.st.ssl7.net does respond to ping from here).
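To avoid watching the ping scroll by, the output can be logged with timestamps and scanned for holes in the sequence numbers afterwards. This is a minimal sketch, assuming the Linux iputils `ping` (whose `-D` option prefixes each reply with a Unix timestamp); `ping_gaps` is a hypothetical helper name and `MEDIA_SERVER_IP` is a placeholder.

```shell
# Hypothetical helper: report holes in icmp_seq from `ping -D` output on stdin.
ping_gaps() {
  awk '{
    for (i = 1; i <= NF; i++) if ($i ~ /^icmp_seq=/) {
      seq = substr($i, 10) + 0            # number after "icmp_seq="
      if (seen && seq > prev + 1)
        print "lost " (seq - prev - 1) " replies after seq " prev ", resumed at " $1
      prev = seq; seen = 1
    }
  }'
}

# Usage (placeholder address):
#   ping -D MEDIA_SERVER_IP | tee ping.log     # leave running during a test call
#   ping_gaps < ping.log
```

If the reported gaps line up with the moments the far end stops hearing you, the loss is happening somewhere between the PBX and the provider rather than inside Asterisk.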
Do you have any work-from-home or other external extensions? If so, are calls between one of them and an internal extension affected?
I am currently pinging their media servers, but wow, these ping times seem pretty crazy to me. Two of the servers are at 100+ ms? Their other two seem good at around 30 ms. I’ll test on a softphone tomorrow morning when I’m back in the office and give an update.
I was about to create a topic about a similar problem. For the past two weeks, some outgoing external calls have had the problem that the other side can’t hear you at all. It seems to happen only with some numbers. I can’t reproduce it with my Xfinity/Verizon phone, but when testing with a T-Mobile cell phone, the call sometimes works and sometimes doesn’t; you can try every 5 minutes and get a different result.
After running tcpdump and reviewing the .pcap in Wireshark, I see zero issues. So what do you advise from here if I see nothing going on? Would this mean it’s an issue with the SIP provider at this point, or can it still be something on my end?
Here is the .pcap file if you would like to look yourself. I filtered on RTP and didn’t see anything unusual, but if you see something I don’t, I would appreciate the help. This has really been killing our business here. Share cyNzUwO - Pingvin Share
Device 192.168.1.27 seems to be sending an extremely broken RTP stream. It has very few frames, and the frames seem to indicate that there are none missing and that the frames should be contiguous in time, but there are large gaps, in real time, between many of the frames.
What is 192.168.1.27? Is it an actual phone or ATA, or is it some sort of router? Maybe it is a router that has a SIP ALG that doesn’t know how to handle suppressed silence and is overwriting the original timestamps?
Note that this device has a different media address from the signalling address.
I assume that Asterisk is 192.168.1.113.
SSRC: 0x8670a79a
Max Delta: 29999.597000 ms @ 22320
Max Jitter: 6532.708976 ms
Mean Jitter: 1287.968412 ms
Max Skew: -114734.224000 ms
RTP Packets: 14
Expected: 14
Lost: 0 (0.00 %)
Seq Errs: 0
Start at: 34.675415 s @ 7410
Duration: 114.99 s
Clock Drift: -114798 ms
Freq Drift: 0 Hz (-99.83 %)
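To put those Wireshark figures in perspective, here is a back-of-the-envelope check of how much audio 14 contiguous packets can carry compared with the time the stream lasted. It assumes each packet holds 20 ms of audio, which is a common default but is not confirmed anywhere in this thread; `rtp_shortfall_ms` is a hypothetical helper name.

```shell
# Hypothetical helper: how far the media timeline falls behind wall-clock time.
# Assumes every packet carries ptime_ms of audio (default 20 ms, an assumption).
rtp_shortfall_ms() {
  packets=$1; duration_ms=$2; ptime_ms=${3:-20}
  media_ms=$(( (packets - 1) * ptime_ms ))   # time spanned by contiguous timestamps
  echo $(( duration_ms - media_ms ))
}

rtp_shortfall_ms 14 114990   # packets and duration from the summary; prints 114730
```

Under that assumption the stream carries only about 260 ms of audio over roughly 115 seconds, a shortfall of about 114.7 seconds, which is in the same range as the Max Skew and Clock Drift values reported.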
I also notice that there are comfort noise frames flying around. Whilst I don’t think they are the problem, Asterisk doesn’t handle them well, if at all.
The device sitting at 1.27 is my personal desk phone. But what’s weird is that this does not happen at the phone. Yes, Asterisk is at 1.113, but that’s where the issues happen. If you call in, even the IVR drops out. It’s not just the calls, it’s the entire system.
As for comfort noise, I have no idea what that is or how to remove it.
To add, this test was done by calling in from my cell phone, picking up the desk phone, and putting the call on hold. On my iPhone you can hear the hold music drop out. So I don’t know if that’s why you see that issue on the desk phone sitting at 1.27; it could be.
We ran a stopwatch every time we heard a dropout, and it happens every 54 seconds, on the dot, for 5-10 seconds. I’m really at a loss here; this is where my expertise runs out.
Design flaw (doesn’t understand how to use timestamps properly).
Broken real time clock.
Extreme Ethernet collision rate. This would also cause long signalling delays, which I haven’t looked for, and would likely be due to a misterminated cable. I suspect that, if this were the problem, frames would continue to trickle out after the end of the call, though.
The latter two would probably break media towards the device, although in different ways.
I’m gonna be honest, I think this is some re-invite issue. I have been able to reproduce the exact time the audio drops, each and every single time, at exactly 54 seconds, and I am at an absolute loss here.
From now on, I think questions like this, rather than just being answered with “it’s almost always a NAT issue”, should also be followed by “what else is on your network?”
I can’t believe this but I searched and searched and searched and nothing came up. I took a phone home with me and I have the same network gear we have in our IT rack, set it all up the same way it’s at work, plugged in a phone, took the PBX home…problem disappeared on my home network. So my thoughts: ISP? Or something on the network.
TL;DR: it was a printer server on our network causing the drop outs at exactly 54 second intervals for 5-7 seconds at a time, every time on the dot.
I came across a Reddit post from someone saying their NAS caused a similar issue. I thought that was nuts, but we run a Synology NAS in our rack, so I disconnected it just for giggles. Still happened. Then I thought, maybe start removing other things like our NVR, printer servers, etc. I made a call and BAM, not a single drop, and we did a 15 minute call.
I started plugging things back into the switch one by one, and after each device was plugged in, we would do another 15 minute call of someone reading a news article. The second the printer server was plugged in? The issue came back. We unplugged the printer server, problem was gone. Plugged it back in again, problem came back.
Those printer servers somehow, someway, were interfering not just with the PC we put FreePBX on but, as we learned, THE ENTIRE NETWORK. Someone who uses WiFi at the office said their WiFi would always fail to load a webpage for a moment and they’d have to refresh. I said “try browsing around for the next 30 mins and see if it’s still happening”. Problem completely disappeared. I couldn’t believe it. I’m gonna examine these things and see what’s up with them. Anyway, the problem is gone and it wasn’t even FreePBX related, but thank you to anyone who attempted to help!
Common problem with print servers. Just to give you another story: I have a site with a Ricoh Aficio 301 MP printer with an internal print server. They are remodeling, so they had to move the printer to another office. The printer gets plugged in and will not sync up Ethernet. I send a tech out there and tell him to plug his laptop directly into the printer, with no hub, and see if he can connect to the printer. He does and can. But the 3 hubs he brought out all fail. I have him pull the printer back to IT. In the department I can get the printer to connect to the network with only 1 out of FIVE different models of 4-port hubs. I even send a test print to it, no problem, so I send it back out into the field with the 1 model of working hub, and it won’t work in the field. I ended up going out there, and the exact same setup I had working in the lab (printer, hub, laptop) fails to work in the field. And we have the identical model of printer in the lab that does not have these peculiarities.
Print servers are evil, whether inside the printer or external. I would bet money that if you take your print server home to your home network, it will work. I have plenty more stories of idiotic craziness with print servers. For example, the printer that only works when the print server is IPv6 only; IPv4 fails. What the programmers they find to write print server firmware know about network coding would fit in a thimble.
You might also look to see if there’s a firmware update for your print server.