Music on Hold Sometimes Plays on Answer Instead of Normal Behavior

Hello,

I have a small FreePBX installation with 6 phones. I am baffled by a seemingly random issue where sometimes when someone answers a phone to take an incoming call from an external caller, the person answering the phone hears the hold music instead of the caller. It never happens when calling between extensions. It is fairly uncommon (maybe 1 or 2 per day out of 20 or so).

The phones are Ubiquiti UVP phones, and our trunk provider uses a session border controller inserted directly into the LAN. Five of the phones are in a single ring group that rings all the phones at once.

I have a log of one such call taken from the “full” log in FreePBX but since I am a new user in the forum I cannot post a link to it, plus it is ~400 lines, so it would be annoying to everyone to dump it here. The only things that stand out to me in the log are these entries right next to each other:

bridge_channel.c: Channel SIP/5-00000120 joined 'simple_bridge' basic-bridge
bridge_channel.c: Channel SIP/TrunkName-0000011c joined 'simple_bridge' basic-bridge
res_musiconhold.c: Started music on hold, class 'default', on channel 'SIP/5-00000120'

I have been trying to figure out why this is happening for quite some time and would appreciate any insight somebody might have. I’d be happy to provide more information if needed.

Some things I have tried include disabling hold music (users just hear silence when answered instead of music) and switching to Asterisk 15 instead of 13 (I had read somewhere that 13 might have been the issue).

Thanks everyone,

Adrian

Paste your full log at https://pastebin.freepbx.org/ .
Click the Create Shorturl box and create the paste.
The link will look like
https://pastebin.freepbx.org/view/1234abcd
Include the 8 hex characters after /view/ in your next post.

Paste is at the aforementioned URL with suffix of 80d44077

A slight attempt was made to remove the actual phone numbers and names in the log by replacing them with the string “Redacted”.

I will add that at the time this log was created, all 6 phones (extensions 1-6) were in the same ring group. Also, extension 3’s phone was not registered with the PBX. Neither of those details should be causing the issue though, as it occurs when all phones are online, and seemingly regardless of ring group settings.

Thanks

As I don’t see anything wrong at the dialplan level, I’m guessing that answering the trunk is failing, either not getting an ACK to the 200 OK or not receiving RTP from the provider; Asterisk is sending music while waiting for the RTP to appear. This is consistent with it not happening on internal calls.

To troubleshoot intermittent problems like this, I start tcpdump running continuously in the background, using a command (run as root) like:
tcpdump -s 0 -C 100 -W 100 -w rbuf -Z root &
This captures all packets into 100 files of 100 MB each, named rbuf00, rbuf01, …, overwriting the oldest one. After an error has occurred, you locate the file covering the relevant time period, download it to your PC and analyze it with Wireshark. You may need to download two of the files, if the call in question overlaps a 100 MB boundary.

Please describe any network complexity between the SBC and the PBX (hardware firewall, VLAN, VPN, PBX on virtual machine, multiple NICs, etc.)

I may have missed something obvious and hope that other members reading this will take a look at the log and maybe find a simple problem.

I have started the packet capture and will update this thread when I catch the issue again.

Some more details about the SBC:
The SBC has an IP address in the same subnet as the PBX so no hardware firewall is in play. The SBC is located at our ISP (the SIP provider) and connectivity goes through a dedicated cable modem that acts solely as a pass-through device to the SBC. Traffic runs un-tagged and no VLANs are used. No VPN is present. The PBX is a HyperV VM, so it runs through a vSwitch, but it is the only VM on the hypervisor. The vSwitch physical NIC is plugged into the same switch as the SBC.

I spoke with the SIP provider, and their sentiment was “the common denominator in those dead air calls is the PBX is giving a teardown notification that the line is unavailable”

Hope that provides some additional insight.

I have the packet capture. I can see ACKs to the OKs and the RTP streams between the PBX, the SBC, and the phone look normal to me. I can play them back in Wireshark to listen to the hold music.

This is the flow from the SBC to PBX. PBX is .100 and SBC is .101 SBCtoPBX

Phone to PBX:
PBXtoPhone

Phone to PBX looks fine.

On the trunk side, upon answer SBC sent an immediate re-invite; I don’t know why. Presumably, it specified media port 59774 because Asterisk started sending RTP there. But Asterisk expected RTP to come from that port and SBC either continued sending from 59772 or stopped sending (graph is ambiguous). You can inspect the SDP in the re-invite and the following RTP packets for details.

It would also be useful to compare with a call that doesn’t fail.

See whether turning off Strict RTP in Asterisk SIP Settings works around the problem. If not, perhaps the provider has a setting in the SBC that would avoid the re-invite, or maybe something on the Asterisk side e.g. timer-related would do that. What should the caller hear while waiting for an agent to answer? If just ringing is acceptable, turning off the early media playback may help.

Sure enough, a normal call does not contain the re-invite.

NormalCall

The re-invite did specify 59774, but it looks like RTP did start coming from that port on the SBC

Turning off Strict RTP does not seem to have worked around the issue. I have yet to reach back out to the provider about the SBC, but intend to. Just wanted to update this thread so it doesn’t die.

Stewart1 you mentioned turning off early media playback as an option - where could I find that setting?

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.