FreePBX - High Network Jitter

Hi Everyone!

I’ve been battling an issue on a production machine for two weeks now that just will not let up. The core of the issue is that any network traffic to my FreePBX host intermittently has a much higher response time than expected, jumping from a typical 3-12 ms, depending on where I’m pinging from, to 200-500 ms. I’ve noted this behavior in both SIP packets and ICMP ping requests to the FreePBX.

This behavior is not exhibited when pinging from the FreePBX to another host, nor are there any other network issues between hosts on the same network, even on the same physical machine. For comparison, the highest average ping from my typical test machine to another host on the same physical server and network was 3 ms, with a jitter of 0.4 ms, while pings to the FreePBX averaged 11 ms, with a jitter of 53 ms!
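For anyone wanting to reproduce this kind of comparison, here is a minimal Python sketch of how an average and a jitter figure can be pulled out of saved `ping` output. It assumes Linux-style `time=... ms` lines, and it assumes jitter means the average absolute change between consecutive round-trip times, which may not be how the numbers above were calculated. The sample output and the 192.0.2.10 address are made up for illustration.

```python
import re
from statistics import mean

# Matches the "time=3.10 ms" part of a Linux ping reply line.
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Return every round-trip time (in ms) found in the ping output."""
    return [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]


def summarize(rtts):
    """Return (average RTT, jitter) for a list of RTTs in ms.

    Jitter here is the mean absolute difference between consecutive
    samples. That is an assumption; other tools report a standard
    deviation or a smoothed estimate instead.
    """
    if not rtts:
        raise ValueError("no RTT samples found")
    average = mean(rtts)
    steps = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    jitter = mean(steps) if steps else 0.0
    return average, jitter


# Hypothetical capture, not taken from the system in this thread.
SAMPLE = """\
64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=3.10 ms
64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=3.40 ms
64 bytes from 192.0.2.10: icmp_seq=3 ttl=64 time=212 ms
64 bytes from 192.0.2.10: icmp_seq=4 ttl=64 time=3.20 ms
"""

if __name__ == "__main__":
    average, jitter = summarize(parse_rtts(SAMPLE))
    print(f"average {average:.1f} ms, jitter {jitter:.1f} ms")
```

A single 200 ms spike barely moves the average but dominates the jitter figure, which is why the two numbers above can look so mismatched.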

While doing research, I ran across the dahdi_test command, which showed that my accuracy sporadically dips very low, to 70%. Unless I am mistaken, though, I am not using any DAHDI devices or interfaces, only PJSIP, so I wrote this off as a symptom rather than a cause.

Here are the machine details:

  • KVM Virtual machine, 6 cores, 8 GB RAM
  • Started on FreePBX 10.13.66, upgraded to SNG7
  • Linux 3.10.0-693.5.2.el7.x86_64

Things I’ve done to try and remedy the issue, to no effect:

  • Increased server resources
  • Changed the network interface type, utilizing a new driver
  • Updated from FreePBX 10.13.66 to SNG7

What has me particularly stumped is why the ping command shows the same jitter, but the issue is only affecting this one machine. Anytime I’ve seen issues like this, they have been network dependent, and some fix had to be applied to the network configuration to resolve them, but this issue seems to be within the server itself.
I am at my wits’ end!
If there is any information that I can provide, or tests I can run that would be of service, I’d be happy to do so!

Help me FreePBX Community, you are my only hope!

OK, and what are the actual issues you are having outside of the increased ping times? Are you having call quality issues? Are things dropping? Locking up? Playback of sound files choppy? Devices not authing or getting messages?

What actual issues are you having on the system? Having high ping times within your host network but not from outside of it isn’t a FreePBX problem; it’s a networking problem within your host setup.

Oh, right! I skipped right past that part! This jitter results in poor call quality, specifically random delays in call audio in both directions, up to a second or two of completely dropped audio, but I’ve never noted any dropped calls. We’ve had device issues where phones lose connection to the host, but I have no indication that this is due to networking issues, rather than simple device quirks.

I’ve noted this behavior on every type of call, internal, outgoing, and incoming, though I use internal calls to test the system after each attempted fix.
Thank you!

Have you done more than just look at ping and jitter tests? What is happening on the server when you are seeing high pings and jitter? What is the load? What processes are running? How many calls are active? An htop report when this is happening would be helpful.

Thank you, this really helped me dig in more to this issue! While my first cursory look at htop while ping was bad led me to think the issue was load related (hence why I increased server resources), I spent some time gathering data to back up this claim, and found that the two are not related in any meaningful way. Here is a screenshot of a time when the system had a high ping,

which I’ve directly correlated to call audio packets dropping, and here is a screenshot of a high load time,

which I have not found any significant correlation to audio drop.

In most cases, the time between a high load event, typically from fwconsole syncing data, and a high ping result was anywhere from 5 to 15 seconds, leading me to conclude that the two incidents were coincidental, as they both occur so frequently.

OK, so in your first screenshot you’re seeing high latency and jitter and while I see the load being at 1.26 your CPU is barely being touched. That means your system is queuing up processes to be used because the CPU is doing something that won’t let anything else thread out or use resources.

In your following screenshot, you’re showing normal latency and your CPU usage is peaked out on each core and you have low load averages. That’s because the CPU is processing everything like it should and isn’t queuing any processes to be handled.

The only thing I can say to try and test is to stop streaming your MoH and see if the issues go away.
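For anyone wanting to put numbers on the load-versus-CPU reading above, here is a rough Python sketch that compares two snapshots of the aggregate `cpu` line from `/proc/stat` and parses `/proc/loadavg`. The field order follows the Linux proc(5) layout. The function names are my own, and this is only one way to look at it, not a diagnosis of this system. On a virtual machine, the `steal` share is worth watching, since it is time the hypervisor gave to something else.

```python
import time

# First eight counters on the "cpu" line of /proc/stat, in order.
# The trailing guest fields are already counted in user/nice, so skip them.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of time spent in each state between two snapshots."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * ticks / total for name, ticks in deltas.items()}


def parse_loadavg(text):
    """Parse /proc/loadavg, e.g. '1.26 0.80 0.50 2/345 12345'."""
    load1, load5, load15, tasks, _last_pid = text.split()
    running, total = tasks.split("/")
    return {
        "load1": float(load1),
        "load5": float(load5),
        "load15": float(load15),
        "running": int(running),
        "total": int(total),
    }


def sample(interval=1.0):
    """Read the live files on a Linux host (not called automatically)."""
    with open("/proc/stat") as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_cpu_line(f.readline())
    with open("/proc/loadavg") as f:
        load = parse_loadavg(f.read())
    return load, cpu_breakdown(before, after)
```

Calling `sample()` during a latency spike and again during a quiet period gives two breakdowns to compare. A high load with mostly idle CPU and a noticeable `iowait` or `steal` share would point away from Asterisk itself.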

You’ve got it!

Thank you for the suggestion, I tried disabling all of my streaming MoH applications, switching them to play files, or nothing at all, and while it reduced my load a somewhat startling amount, I am still getting the same high latency spikes and call audio drops. :frowning:

Thank you for the help, and I’ll gladly take any other suggestions!

Can you add virtio NICs to your KVM to see if that helps?

This was solved. The exact reason was not given, but it was a network problem.

@BlazeStudios, thank you for your help, but I jumped the gun yesterday in a fog of working on this all day. After multiple tests I confirmed the initial behavior: the host in question still has much higher jitter than other hosts on the same physical machine. Darn!

@rfreeman1478, I absolutely can, if I change the existing NIC type, which I will do as soon as we pass our highest call volume time, in about an hour or so.

In other interesting developments, a clone of this machine behaves exactly the same way, while a fresh FreePBX installation does not. I’m afraid a fresh install may be my “nuke it from orbit” option if all else fails.

Alright, I switched the clone’s NIC to virtio, and alas, still got the dreaded jitter. Thank you for the suggestion, though!

Wow, that’s an interesting find. I’m out of ideas as to what could be causing it…

Are we all understanding what jitter actually is here?

Basically it is the non-sequential delivery of packets.

This is always a network problem, especially on your LAN, where there is no reason that should happen. You need to look at the QoS/ToS traffic shaping through your network, and at what else your server is doing to pre-empt network traffic (sysstat is a good tool).

If you have a flaky WAN connection, then a jitter buffer can reconstruct continuous audio, but at the cost of added delay (within the constraints of acceptable psychoacoustics: any more than roughly 100 ms and your clients will be pissed!). Generally, a long-tailed jitter buffer is excellent for packet radio, but is unlikely to help provide an acceptable VoIP experience on today’s internet.


I mean, at this point, I don’t think I understand anything anymore! :sweat_smile: But I’m running on a definition similar to this one from Wikipedia:

Packet delay variation (PDV) is the difference in end-to-end one-way delay between selected packets in a flow with any lost packets being ignored.
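To make that definition concrete, here is a small Python sketch. It assumes the "selected packets" are simply consecutive received packets, which is only one possible reading, and it marks a lost packet as `None`. The delays in the usage note are invented, not measured on my system.

```python
def delay_variation(delays_ms):
    """Differences in one-way delay between consecutive received packets.

    delays_ms holds one entry per packet sent, in send order: a delay in
    milliseconds, or None if the packet was lost. Lost packets are
    ignored, as in the definition quoted above.
    """
    received = [d for d in delays_ms if d is not None]
    return [later - earlier for earlier, later in zip(received, received[1:])]


def delay_spread(delays_ms):
    """Gap between the slowest and fastest received packet, in ms."""
    received = [d for d in delays_ms if d is not None]
    if not received:
        raise ValueError("no packets were received")
    return max(received) - min(received)
```

For example, `delay_variation([5.0, None, 7.5, 6.0])` skips the lost second packet and compares 5.0 with 7.5, then 7.5 with 6.0. Nothing in this calculation depends on packets arriving out of order.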

I also assumed it was a network problem at first, but if that is true, then why wouldn’t I see any similar behavior on other VMs in the same network, on the same hardware, at the same time? And even more so, why would cloning the machine duplicate the problem? And to dog-pile from there, why would building a new machine with the exact same network settings and connection resolve the issue?

I don’t pretend to have any answers here, and if anyone does, please let me know, because I’ve only got one more Sangoma portal reset, and would rather not rebuild FreePBX again!

All of that as a long way of saying, the issue is resolved-ish now, as I rebuilt the machine, restored from a backup, and reactivated on the same deployment ID. Hooray!
Thank you all so much for your help in drilling down into this tough issue!

Your understanding is correct, but a jitter buffer is not going to help.

Virtualization is a layer for which you need to get support from the source regarding networking and CPU overload. Neither Asterisk nor FreePBX is practically involved.
