Confusing RTP Issue With Endpoints Over VPN

Greetings, and thank you ahead of time for any insight into a very confusing ongoing problem.

This is a FreePBX 14 setup that I have inherited. All endpoints are old Cisco SPA and Mitel phones using Chan_SIP. The endpoints on the local network work just fine; the problem is with the endpoints at remote locations that connect via site-to-site VPNs.

  • We have replaced the routers at all locations and upgraded the ISP Internet connections.
  • The extensions were set to NAT=No. I had to change them back to the default setting of “Yes” in order for the endpoints to even register.
  • The PBX has the network subnets of the Voice VLANs of all locations set as trusted networks in the firewall.
  • However, at random times, the traffic logs of the remote routers show endpoints trying to establish RTP audio traffic to the external IP of the PBX, not the internal IP. The applied Endpoint Manager templates for these endpoints use only the internal IP of the PBX, so I do not know where these endpoints are even obtaining the external IP. Again, this only happens at the remote locations that are connected via site-to-site VPNs. The endpoints do register using the internal IP.

I am working on replacing endpoints so that I can convert the extensions to PJSIP, since that protocol works better across VPNs. However, what other settings on the PBX can I look at to resolve this weird external IP issue? So far I have not found a common variable as to why this problem starts. Sometimes the sites are stable, and sometimes the endpoints refuse to use the internal IP for the RTP traffic during a call.

Thank you again for your help. Please let me know what Asterisk settings you need me to include to help with the troubleshooting. I didn’t want to overload this initial post.

Is this basically your setup?

Phone → (V)LAN 1 → Router A → VPN → Router B → LAN 2 → PBX

Can you ICMP ping cleanly from the Phone to the PBX? And vice versa?

Is there some sort of STUN/ICE/TURN setup on the Phone?

Thank you for your reply.

Yes, that is basically the setup with the VPNs and the routers.
Yes, I am able to ping the phones from the PBX. I see SIP on port 5060 sent through the VPN to the PBX internal IP. However, the RTP traffic tries to use the external IP, which is the one in the “External IP” field in the Asterisk SIP settings for CHAN_SIP. Also, NAT is set to “Yes” for CHAN_SIP.

I did not set up any STUN for the phones, since all VoIP traffic is supposed to go through the VPN.

In Asterisk SIP Settings, make sure that Local Networks includes the LAN subnet of the phones (in addition to what is already there). If you change this, after Submit and Apply Config you must restart (not just reload) Asterisk.
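To illustrate why a missing subnet matters here: when an External IP is configured, chan_sip is generally understood to advertise that external address to any peer whose address is not covered by a Local Networks entry. The Python sketch below is a simplified model of that decision, not Asterisk code. `INTERNAL_IP`, `EXTERNAL_IP`, and the subnets in `LOCAL_NETWORKS` are made-up placeholders (the /24 masks in particular are assumptions).

```python
import ipaddress

# Hypothetical values for illustration only; substitute your own.
INTERNAL_IP = "10.0.0.10"      # assumed PBX LAN address
EXTERNAL_IP = "203.0.113.5"    # assumed "External IP" field (documentation range)
LOCAL_NETWORKS = ["10.0.0.0/24", "172.21.99.0/24"]  # assumed Local Networks entries


def advertised_media_address(peer_ip, local_networks=LOCAL_NETWORKS):
    """Return the address this simplified model would advertise to peer_ip.

    A peer inside any local network gets the internal address;
    every other peer gets the external address.
    """
    peer = ipaddress.ip_address(peer_ip)
    for net in local_networks:
        if peer in ipaddress.ip_network(net):
            return INTERNAL_IP
    return EXTERNAL_IP
```

Under this model, a phone on a Voice VLAN that is missing from the list would be sent the external address even though it registered over the VPN, which matches the symptom described in the first post.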

Stewart,

Yes, the subnets for the local networks are already in Asterisk SIP settings.

At the Asterisk command prompt, type
sip set debug on
make a failing call (this could be as simple as calling *43 echo test) and look at the SIP trace in the Asterisk log. If you have trouble interpreting it, paste the relevant log section at pastebin.com and post the link here.
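When reading the trace, the part that matters for this problem is the SDP body of the INVITE or 200 OK that the PBX sends to the phone: the `c=` line carries the address the phone is told to send RTP to, and the `m=audio` line carries the port. As a rough aid, the Python sketch below pulls those two values out of an SDP body. `SAMPLE_SDP` is a made-up example, not taken from this system, and the parser is deliberately minimal (it keeps the last `c=` line it sees and ignores per-media overrides).

```python
# Made-up SDP body for illustration; addresses are from a documentation range.
SAMPLE_SDP = """v=0
o=root 1234 1234 IN IP4 203.0.113.5
s=Asterisk
c=IN IP4 203.0.113.5
t=0 0
m=audio 14562 RTP/AVP 0 101
"""


def sdp_media_target(sdp_text):
    """Return (connection_address, audio_port) found in an SDP body.

    Either element is None if the corresponding line is absent.
    """
    address = None
    port = None
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("c=IN IP4 "):
            address = line.split()[2]      # "c=IN", "IP4", "<address>"
        elif line.startswith("m=audio "):
            port = int(line.split()[1])    # "m=audio", "<port>", ...
    return address, port
```

If the address returned for a call to a VPN-connected phone is the external IP rather than the internal one, the PBX itself is handing out that address.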

And include RTP debugging to verify that the IPs stay stable:

rtp set debug on

Thank you. I went through the list of local networks in the Asterisk SIP settings to double check them. There was one missing, and I discovered a couple that were added by my predecessors that were no longer needed. I then ran a fwconsole restart.

I will know more tomorrow. If the problem continues, then I will turn on both debug logs as suggested. Will I access those debug logs via the web UI?

In the UI, you can go to Reports → Asterisk Logfiles but it’s awkward to get anything far back from there, so you may wish to access /var/log/asterisk/full directly.

Alright, I thought I had the issue solved, but not quite. These endpoints are older phones that only support Chan_SIP.

Yesterday I observed an incoming phone call where the user answered but the caller could not hear them. The caller called back, the same user answered on the same phone, and two-way audio worked fine.

Last night I edited all the extensions to set NAT to “No”. This worked for the above phone; I tested many calls in both directions and there was two-way audio.

Now I hear today that the audio will cut out during a call. I don’t know how long or how often.

I did notice in the Asterisk SIP settings that a STUN server is set. Normally that is blank. Since all phones are using site-to-site VPNs, I should be able to remove that STUN server, correct?

Alright, I have now enabled the SIP and RTP debug logs as recommended earlier. I was waiting to do that in order to not overburden the PBX. I will post the logs as soon as I can.

Wanted to provide an update to this adventure.

Just to clarify, here is the basic setup when tracing the path:
Phone → VLAN on switch → Router A → VPN → Spectrum Router and modem → Router B → LAN 2 → PBX

Friday night I accessed the web UI of the Spectrum router, which is only there to pass the static public IP of our router. However, DHCP and SIP ALG on the Spectrum router were enabled, so I disabled those.

Today a user gave me a couple of times at which she experienced audio issues during a phone call. I am looking through the CDR reports, Call Event Logging, and Asterisk log files, but am struggling to find the information about these calls to this extension.

I found one of the events by searching for the extension in the Call Event Logging. I see where an outside caller came in, and the user answered. One minute later I see the “CHAN_END” event and then “HANGUP”. The next call event is immediately afterwards with the user calling the external number back. The data between the “ANSWER” and “CHAN_END” is what I am struggling to find.

Any advice? I’ll try the CDR Reports next.

OK, I had to disable debug logging due to space concerns and let the logging processes run during the night. I then downloaded yesterday's full Asterisk log file via SCP to a Linux workstation in order to run the grep commands and find one of the call events.
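For anyone repeating this kind of search, here is a minimal Python sketch of the same idea as the grep approach. It assumes log lines carry a call identifier in the form `[C-00001ebb]`, as in the excerpts later in this thread; the function names and the second sample line are hypothetical, and this is not the exact command that was run.

```python
import re

# Call identifiers look like [C-00001ebb] in the full log excerpts in this thread.
CALL_ID = re.compile(r"\[(C-[0-9a-f]{8})\]")

SAMPLE_LINES = [
    "[2023-12-04 09:11:42] VERBOSE[20682][C-00001ebb] res_rtp_asterisk.c: "
    "Got RTP packet from 172.21.99.46:16416 (type 00, seq 006911, ts 55494524, len 000160)",
    # Hypothetical line from a different call, for illustration only.
    "[2023-12-04 09:11:43] VERBOSE[20999][C-00001ebc] pbx.c: unrelated call",
]


def call_ids_mentioning(lines, needle):
    """Return call identifiers, in first-seen order, of lines containing needle."""
    ids = []
    for ln in lines:
        m = CALL_ID.search(ln)
        if m and needle in ln and m.group(1) not in ids:
            ids.append(m.group(1))
    return ids


def lines_for_call(lines, call_id):
    """Return every line tagged with the given call identifier."""
    tag = f"[{call_id}]"
    return [ln for ln in lines if tag in ln]
```

To use it on a real file, read the lines from a copy of `/var/log/asterisk/full`, call `call_ids_mentioning` with the extension number or the phone's IP address, then pass each identifier to `lines_for_call`.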

This debug log is for the call that I described in my previous post. The next call event right after this one is the user of this extension calling back the phone number right away.

I have attached a tgz file that contains a text file of this grep output.

New Upload:
callresult.tgz (48.8 KB)

‘I have attached a tgz file that contains a text file of this grep output.’

No, you haven’t :)

Yes I did. It’s at the top of my post called “callresult.tgz”

That was weird. It opened a local file, sorry.

That’s curious. I was using an Ubuntu VM in VirtualBox in order to search through the logs and compress the text file. I then copied that file to a VirtualBox shared folder I had set up in order to access it from the host machine to upload it. Apparently it didn’t copy correctly.

I have edited my previous post to upload the .tgz file directly from my VM.

What is the VPN network address? Here we have 172.0/16:

[2023-12-04 09:11:42] VERBOSE[20682][C-00001ebb] res_rtp_asterisk.c: Got RTP packet from 172.21.99.46:16416 (type 00, seq 006911, ts 55494524, len 000160)
[2023-12-04 09:11:42] VERBOSE[20694][C-00001ebb] res_rtp_asterisk.c: Sent RTP packet to 172.16.99.24:62254 (type 00, seq 023048, ts 55494520, len 000160)
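To check at a glance whether the RTP addresses stay the same for the whole call, the `Got RTP packet from` and `Sent RTP packet to` lines can be tallied per address. The Python sketch below does that for lines in the format of the two excerpts above; `rtp_peers` is a hypothetical helper name, and the regex is an assumption based only on those two lines.

```python
import re
from collections import Counter

# Matches the direction, IPv4 address, and port in "rtp set debug on" output lines.
RTP_LINE = re.compile(
    r"(Got RTP packet from|Sent RTP packet to)\s+(\d{1,3}(?:\.\d{1,3}){3}):(\d+)"
)

SAMPLE_RTP = [
    "[2023-12-04 09:11:42] VERBOSE[20682][C-00001ebb] res_rtp_asterisk.c: "
    "Got RTP packet from 172.21.99.46:16416 (type 00, seq 006911, ts 55494524, len 000160)",
    "[2023-12-04 09:11:42] VERBOSE[20694][C-00001ebb] res_rtp_asterisk.c: "
    "Sent RTP packet to 172.16.99.24:62254 (type 00, seq 023048, ts 55494520, len 000160)",
]


def rtp_peers(lines):
    """Count RTP debug lines per (direction, ip, port)."""
    counts = Counter()
    for ln in lines:
        m = RTP_LINE.search(ln)
        if m:
            direction = "from" if m.group(1).startswith("Got") else "to"
            counts[(direction, m.group(2), int(m.group(3)))] += 1
    return counts
```

A call with healthy audio should show a small, fixed set of entries in each direction. An address or port that appears partway through the call, or a direction with no entries at all, is worth a closer look.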

The VPN IPs are not in these logs, since the routers handle that traffic. I have the VLAN subnets in the list of local networks in the Asterisk SIP settings in order to prevent any NAT from happening.

The 172.21.99.46 IP address is the endpoint, and the 172.16.99.24 is the SIP SBC that the trunks use.

I’m a little confused. I thought you said the endpoints were connected over a VPN, in which case the SDP session would also be negotiated over the same network as the SIP INVITE, unless direct media were invoked, in which case that’s a whole different kettle of fish.

The endpoints themselves are not using a VPN. There is a site-to-site VPN between the routers. The router at this location sends the traffic on the Voice VLAN through this VPN tunnel to the main site.