We have had an ongoing issue with calls dropping from conference rooms and are now looking for some input. We have an older server that is still on FreePBX 12 / Asterisk 11 and recently built a new server with FreePBX 14 / Asterisk 13. The issue is present on both of them regardless of whether we use MeetMe or ConfBridge (we are currently using ConfBridge). We think we have narrowed the issue down to the fact that we are forcing recording on all of our conference rooms, but can’t be sure yet. I have read about setups where people run a far greater number of conference rooms on much weaker hardware, but I do not recall any details regarding recording. We only have 108 conference rooms at the moment. I don’t think we need that many because at peak hours we have 30-40 in use, so I was also thinking of cutting that number down.
Here are the tests we did:
Test # 1
Inbound call from my cell phone transferred into conference room and joined by an internal extension. The inbound call will drop between 13 and 17 mins.
Test # 2
Same scenario as above. After 30 minutes, the call was still connected. I voluntarily disconnected the call.
Reports from our internal users are inconsistent. Some will disconnect within a few minutes and some will go for an hour or two without issue. We are still doing some testing, but I was hoping someone could take a look at a snippet from our log file. This is after the call disconnected from the conference room and we start hearing MoH on the internal extension.
And after more testing, we are seeing calls drop around 15 minutes with the recording disabled. So this may not have anything to do with it at all. But if anyone has any input regarding this at all, it would be much appreciated.
Fifteen minutes is one of those “indicator” periods that leads you to a hard time-out. Chances are it’s a hard NAT-related timeout. Look around for “900 seconds” on your various time-outs and firewall settings and see if you can spot anything that looks sketchy.
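For the PBX side of that search, something like the following rough Python sketch could help. It only flags lines containing a standalone 900 in files under a directory you point it at; the function names are made up for illustration, and it cannot see router or firewall settings, which still have to be checked by hand.

```python
import re
from pathlib import Path

# A standalone 900 (not part of 1900, 9000, etc.). Purely a heuristic.
NINE_HUNDRED = re.compile(r"(?<!\d)900(?!\d)")


def find_timeout_candidates(text):
    """Return (line_number, stripped_line) pairs that contain a standalone 900."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if NINE_HUNDRED.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_directory(root):
    """Scan every readable file under root; return {path: [(line_number, line), ...]}."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, skip it
        hits = find_timeout_candidates(text)
        if hits:
            results[str(path)] = hits
    return results
```

Calling scan_directory("/etc/asterisk") and printing the result gives a list of lines to review. Most hits will be unrelated (port numbers, extensions), so treat it as a starting point only.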
I appreciate the input. I have considered the routers and I am still checking back through settings. Part of the issue is that I have inherited a network design that is not ideal…
The drops don’t happen at an exact time: around 15 minutes in today’s tests, generally 14-17 minutes in prior tests, and anywhere from 5 minutes to an hour in reports from users. As far as users being muted, I have asked that question numerous times and very few of the instances have been muted on our end. In fact, most reports are mid-sentence. I cannot speak to the customer’s end though.
And with the router configuration being an issue, I have suspected this and the general layout of our network, but wouldn’t the call drops also be more common with direct calls? We handle a fair number of both and the vast majority of the drops occur only when conference rooms are involved.
“Nope, guess again” isn’t going to get you a lot more help.
If you want us to invest our time (we’re all users and volunteers here, just like you), you’re going to have to help us help you. Tell us more - what else have you looked at? What have you ruled out? How is your network set up? You hint that other people have had calls drop in other places than on conferences? Are you dropping calls everywhere at the 15-minute point, or just conference calls? What conference phones are you using?
Within my last post there were legitimate responses to questions asked by other users, as well as additional questions. Do you truly interpret my reply as “Nope, guess again”? This is not a game and I didn’t come here for fun. Thankfully I have gained a couple of leads that I am looking into, and if I continue having issues, I will post again with as much information as I can manage. At this point I am still trying to determine what is relevant and what is not.
I do truly “appreciate the input.” That statement that apparently offended Dave was not intended to be dismissive at all. If it was read that way, I do apologize.
It should not be hard to see (from Asterisk’s point of view) what is going wrong, though it may not be easy to fix. As root, start a tcpdump capture of everything on the server, using a command like:
tcpdump -s 0 -C 100 -W 100 -w rbuf -Z root &
This writes a circular buffer of 100 files (rbuf00, rbuf01, … rbuf99), each 100 MB long. Be sure that you have 10 GB of extra disk space available.
If there are ~100 active call legs at ~80 kbps in each direction (hence the * 2 per leg), the buffer fills at about 2 MB/s, so 10 GB gives you ~1.4 hours of historical data. When trouble is reported, identify the time the call ended, find the file (based on modification time) containing the problem, copy it to your PC and analyze it with Wireshark.
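If you want to redo that estimate for your own call volume, or pick out the right file without eyeballing timestamps, here is a minimal Python sketch. It assumes the tcpdump -C unit is 1,000,000 bytes and that the files share the rbuf prefix from the command above; both function names are hypothetical.

```python
from pathlib import Path


def retention_hours(files, file_mb, call_legs, kbps_per_direction, directions=2):
    """Estimate how many hours of traffic the ring buffer holds.

    Assumes one "MB" is 1,000,000 bytes and that RTP dominates the traffic.
    """
    buffer_bits = files * file_mb * 1_000_000 * 8
    bits_per_second = call_legs * kbps_per_direction * 1000 * directions
    return buffer_bits / bits_per_second / 3600


def capture_file_for(directory, event_time, prefix="rbuf"):
    """Pick the ring-buffer file most likely to contain event_time.

    event_time is a Unix timestamp. A file's modification time is the time
    of its last write, so the file covering the event is the one with the
    earliest mtime at or after it. Returns None if every file is older.
    """
    candidates = []
    for path in Path(directory).glob(prefix + "*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if mtime >= event_time:
                candidates.append((mtime, path))
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


# 100 files of 100 MB, 100 call legs at 80 kbps in each direction
print(round(retention_hours(100, 100, 100, 80), 2))  # prints 1.39
```

If capture_file_for returns None, the event is either still in memory buffers or the capture was not running at that time.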
Probably, the trouble was not detected by Asterisk until after the agent noticed it. Likely possibilities include:
RTP from the speaking party stopped flowing or became silent.
RTP to the speaking party stopped some time ago and the trunking provider or carrier dropped the call (with luck, the BYE request will have a cause code or other useful info in some header).
RTP to the agent was interrupted by a network issue from PBX to phone.
IMO an Asterisk malfunction is less likely, e.g. RTP from the speaking party is coming in fine but Asterisk is sending nothing, or only silence, to the agent.
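To tell the "RTP stopped flowing" cases apart, it can help to look for holes in each stream's packet arrival times. A minimal Python sketch, assuming you have already exported the arrival times (in seconds) for one RTP stream from Wireshark into a list; the function names and the 1-second threshold are arbitrary choices for illustration:

```python
def find_gaps(arrival_times, max_gap=1.0):
    """Return (start, end, duration) for each pause longer than max_gap seconds."""
    times = sorted(arrival_times)
    gaps = []
    for previous, current in zip(times, times[1:]):
        if current - previous > max_gap:
            gaps.append((previous, current, current - previous))
    return gaps


def stream_stopped_before(arrival_times, call_end, max_gap=1.0):
    """True if the last packet arrived more than max_gap seconds before call_end."""
    if not arrival_times:
        return True  # no packets at all counts as stopped
    return call_end - max(arrival_times) > max_gap


# Packets at 0.0, 0.5 and 1.0 s, then nothing until 6.0 s
print(find_gaps([0.0, 0.5, 1.0, 6.0, 6.5]))  # prints [(1.0, 6.0, 5.0)]
```

Running this once per direction of each leg shows which side went quiet first. It says nothing about packets that keep arriving but carry silence, so that case still needs listening to the audio in Wireshark.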
Once you see what is wrong at Asterisk, you’ll know where to look next, e.g. at a capture taken on the WAN side of your firewall.