Forced SRTP, Phantom Codecs and SIP INVITE Packet Fragmentation

Hey all,

I apologize for the long backstory to get to the actual post title… We covered a lot of ground in the last 2 days to actually make it here!

We noticed intermittent slow connections when dialing outbound from a remote site (sometimes waiting up to 30 seconds before the call would actually start to dial!). This remote office is not part of our WAN, so these phones (Sangoma P325/330s and Digium D65s) are going into our PBX externally. We have 5060 and the RTP ports whitelisted from this office’s IP in the NSG (the PBX is hosted in Azure), in the PBX firewall, and in intrusion detection. Inbound calls to the remote extensions were just fine.

We assumed something like a firewall issue or NAT traversal was to blame, so we broke out Wireshark for a little PCAP fun. What we discovered was that our SIP INVITEs were getting fragmented due to size (Wireshark flagged them as “Fragmented IP protocol”). We could see the packets fragment as they left the remote network, and then they would only partially show up at the PBX, per a PCAP from the PBX itself. I am not sure if the partner fragment never arrived, or if the PBX was just not reassembling them for some reason. Either way, it was not working! We observed a doubling back-off in the repeated requests; I assume this is a setting somewhere in the PBX. It would retry at about 0.5 sec, 1 sec, 2 sec, 4 sec, 8 sec, 16 sec, etc. As an aside, we did later determine that this fragmentation is also affecting the phones in our WAN, but they would typically only have to send 1-3 INVITEs vs 5+ to connect.
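For reference, the doubling retry pattern described above matches the default INVITE retransmission timers in RFC 3261 for UDP: the sender (here the phone, not the PBX) resends after T1 = 0.5 s, doubles the wait each time, and gives up when Timer B fires at 64 × T1. Below is a minimal Python sketch of that schedule. It assumes the phones use the RFC defaults, which is not confirmed for these models; the function name is made up for illustration.

```python
def invite_retransmit_schedule(t1=0.5, timer_b_multiplier=64):
    """Sketch of RFC 3261 INVITE retransmissions over UDP (default timers assumed).

    Returns (intervals, send_times, timeout), all in seconds.
    """
    timeout = t1 * timer_b_multiplier  # Timer B: the client transaction gives up here
    intervals, send_times = [], []
    interval, elapsed = t1, 0.0
    while elapsed + interval < timeout:
        elapsed += interval
        intervals.append(interval)
        send_times.append(elapsed)
        interval *= 2  # Timer A doubles after every retransmission
    return intervals, send_times, timeout


intervals, send_times, timeout = invite_retransmit_schedule()
print("wait before each retransmission:", intervals)
print("seconds since the first INVITE:", send_times)
print("transaction timeout:", timeout)
```

With the defaults, the last retransmission goes out roughly 31.5 seconds after the first INVITE, which lines up with the "up to 30 seconds" delay before a call finally went through.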

This launched us into a many-hour saga of trying to figure out how to reduce our INVITE request size. One thing we noticed was the presence of many more codecs than there should have been. We tried to trim them down in order to reduce packet size, but to no avail… We reduced the trunk to ulaw only, set Extension settings like “Disallowed Codecs: all”, “Allowed Codecs: ulaw”, and even tried to add disallows into the custom pjsip.conf files. I eventually figured out that the phantom 90 (Zulu) and 98 (Sangoma Phone) extensions’ codecs were somehow getting combined into the INVITE. As an example, these codecs were being offered in the SDP - “ulaw&alaw&vp9&vp8&h264” (a video codec!!!). This comes directly from the Zulu 90 extension, which brings me to my first questions.

1) Is there a way to stop the PBX from joining those phantom Zulu and Sangoma Phone extensions into the main extension request? It was somehow combining codecs from ext 101, 90101 and 98101. I would expect that if ext 101 on a Digium D65 only had ulaw enabled on the extension, then that is the only codec that would be presented in the request.

2) Is there a way to enable the Asterisk feature “compact_headers=yes”? We tried to manually add some variations of this line into pjsip_custom.conf and pjsip_custom_post.conf files, but either we were doing it wrong, or it doesn’t work. This seems like it could be a very quick way to reduce packet size.
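To get a feel for how much compact headers could save, here is a rough Python sketch that rewrites header names to the short forms defined in RFC 3261 (plus `x` for Session-Expires from RFC 4028) and measures the difference. The sample message and all addresses in it are made up for illustration. My understanding is that the Asterisk option by this name lives in the `type=system` section of pjsip.conf and only affects messages Asterisk itself sends, so it may not shrink an INVITE that originates at the phone; treat that as an assumption to verify.

```python
# Compact header forms from RFC 3261 (and RFC 4028 for Session-Expires).
COMPACT_FORMS = {
    "via": "v", "from": "f", "to": "t", "call-id": "i", "contact": "m",
    "content-type": "c", "content-length": "l", "supported": "k",
    "session-expires": "x",
}


def compact_headers(message: str) -> str:
    """Return the SIP message with known header names replaced by compact forms."""
    head, sep, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    out = [lines[0]]  # request line is left untouched
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        short = COMPACT_FORMS.get(name.strip().lower())
        out.append(f"{short}:{value}" if short and colon else line)
    return "\r\n".join(out) + sep + body


# Hypothetical INVITE headers, loosely modeled on a typical phone request.
SAMPLE = "\r\n".join([
    "INVITE sip:200@pbx.example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bKexample",
    "From: <sip:101@pbx.example.com>;tag=abc",
    "To: <sip:200@pbx.example.com>",
    "Call-ID: example-call-id",
    "CSeq: 1 INVITE",
    "Contact: <sip:101@192.0.2.10:5060>",
    "Supported: replaces, 100rel, timer, norefersub",
    "Session-Expires: 1800",
    "Content-Type: application/sdp",
    "Content-Length: 0",
]) + "\r\n\r\n"

compacted = compact_headers(SAMPLE)
print("bytes saved:", len(SAMPLE) - len(compacted))
```

For a header set like this the saving is only a few dozen bytes, so compact headers alone are unlikely to rescue a request that is well over the MTU; the SDP body is usually the bigger target.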

During my research into this INVITE size issue, I saw that TLS would not have the same fragmentation issues as UDP. After many failed attempts to reduce the packet size, I decided to give TLS a whirl. I followed this Sangoma TLS and SRTP guide, and it was fairly straightforward.

While following this guide, I decided not to enable SRTP for now. I figured TLS was a good start, would solve the fragmentation problem, and I could just revisit SRTP later. I was very quickly able to get that same D65 registered with the PBX over TLS on 5061 instead of UDP 5060. However, when I went to make a test call, I was met with a busy signal and this error in the PBX logs - "res_pjsip_session.c: 225: Couldn't negotiate stream 0:audio-0:audio:sendrecv (nothing)". I reviewed the INVITEs again and saw that extension 101 and the target had overlapping codecs, so it did not seem to be a codec problem. After some more research, I figured out that this line at the end of the SDP was to blame - "a=crypto:1 AES_CM_128_HMAC_SHA1_80". I learned somewhere online that this meant it was looking for SRTP audio, which I had not enabled! Which brings us to our final set of questions…
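As a quick way to check an offer for this, here is a small Python sketch that reads an SDP body and reports the media profile and any SDES crypto suites. In SDP, a media line using the `RTP/SAVP` profile means the offerer expects SRTP for that stream, and `a=crypto` carries the SDES keying material (RFC 4568). The sample SDP is a shortened, illustrative version with a placeholder key, and the function name is hypothetical.

```python
def srtp_offer_info(sdp: str) -> dict:
    """Summarize the media profiles and SDES crypto suites found in an SDP body."""
    info = {"profiles": [], "crypto_suites": []}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            parts = line[2:].split()
            if len(parts) >= 3:
                info["profiles"].append((parts[0], parts[2]))
        elif line.startswith("a=crypto:"):
            # a=crypto:<tag> <suite> <key params>
            fields = line[len("a=crypto:"):].split()
            if len(fields) >= 2:
                info["crypto_suites"].append(fields[1])
    # The secure profiles signal that SRTP is expected, not merely offered.
    info["requires_srtp"] = any(
        proto in ("RTP/SAVP", "RTP/SAVPF") for _, proto in info["profiles"]
    )
    return info


# Illustrative SDP fragment; the key is a placeholder, not real keying material.
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "m=audio 4012 RTP/SAVP 0 8",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PLACEHOLDERKEY",
])

print(srtp_offer_info(SAMPLE_SDP))
```

If the phone is the one sending `RTP/SAVP`, then the PBX-side encryption settings would not remove it from the request; one possibility (an assumption, not verified here) is that the phone turns on SRTP by itself when its transport is set to TLS.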

3) Is it possible to have TLS without SRTP in FreePBX? It would seem like you could have TLS without SRTP, but not the other way around. I could not find any configuration setting that would have forced it to require this media encryption…

3a) Follow up question - What settings could have caused that crypto to end up in the request? Things I checked…
“Media Encryption: None” on both the trunk and the extension.
“Allow Non-Encrypted Media (Opportunistic SRTP): True” on the extension (even though it should be ignored since Media Encryption was blank).
“Enable DTLS: No” on extension
“SIP encryption: no” in Device Settings under Advanced Settings

I did notice that “Media Encryption: DTLS-SRTP” and “Enable DTLS: Yes” were set on my 90 Zulu extension, but I repeated this test on a different pjsip extension (102) that did not have Zulu enabled or a 90 extension, and it still had this crypto in the request.

I could never figure out why that crypto was there. However, when I finally enabled “Media Encryption: SRTP via in-SDP” on extension 101, it immediately got dial tone and was able to make calls. I feel like there should be more settings related to this encryption in the General SIP Settings and the chan_pjsip settings of the PBX, but it is basically only mentioned in a label on that page and has no real settings!

On the bright side, this outbound connection delay due to fragmented packets is GONE with TLS :)

Would love to hear y’all’s thoughts on these items… Wondering if I am just missing something ultra basic. Thanks for your patience in reading all of this!

PBX Version:
PBX Distro:12.7.8-2306-1.sng7
Asterisk Version:19.8.0

Just switching to TCP eliminates the fragmentation problem, maybe try that first?

(Securing the audio stream is not dependent on SIP signalling, and that audio is only realistically vulnerable to MITM compromises.)

This is completely false. Packet fragmentation happens because the packet is too large for the MTU being used by one or more of the routers in the path. If there is an MTU problem, moving to TCP isn’t going to eliminate the issue; it is just going to cause the fragments to be rebuilt and add delay to the transmission.

A UDP packet has a fixed 8-byte overhead, while TCP has a minimum 20-byte overhead that can go up to 60 bytes. So I’m not sure suggesting a larger packet is going to solve the packet fragmentation problem, a problem that is created because the packet is too big.
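To put rough numbers on the size question, here is a small Python sketch that counts how many IPv4 packets a single SIP message needs when sent over UDP. It assumes a 20-byte IPv4 header without options and the 8-byte UDP header mentioned above; the real path MTU is not known from this thread, so the 1500 default is only an assumption, and the function name is made up.

```python
import math

IP_HEADER = 20   # IPv4 header without options (assumed)
UDP_HEADER = 8   # fixed UDP header size


def udp_fragments(sip_bytes: int, mtu: int = 1500) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram with sip_bytes of payload."""
    datagram = UDP_HEADER + sip_bytes
    if IP_HEADER + datagram <= mtu:
        return 1  # fits in a single packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


for size in (1300, 1472, 1473, 1700):
    print(size, "byte SIP message ->", udp_fragments(size), "packet(s)")
```

On a 1500-byte MTU the largest SIP message that goes out unfragmented over UDP is 1472 bytes; anything larger is split, and every fragment has to arrive for the INVITE to be reassembled.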

Perhaps we should see a PCAP of all this happening so we can determine what the actual problem is. Sure TCP might look like it fixes the problem but it really doesn’t and can cause other issues if there is a real underlying networking issue.

I’ll have to pull the PCAPs with fragmentation again… Thought I had them saved, but I guess not.

However, I do have the full invite of a TLS request with the rogue codecs and “crypto” handy. When I went to grab this and post it though, I glanced at the codecs again. I should have googled more on the codecs in the request in the first place, as I made some wildly false assumptions before. This is what happens when you are trying to learn SIP and FreePBX on the fly!

New theory that makes a whole lot more sense… I believe these extra codecs are coming from the D65 itself. Sangoma Phone Codecs for reference.

If this is the case, is it possible to stop the phone from advertising these? Seems like that would help reduce the SDP size and possibly lead to stopping fragmentation on a UDP request. I cleaned some stuff out of the request that might have been sensitive, not sure.

INVITE sip:[email protected]:5061;transport=tls SIP/2.0
Via: SIP/2.0/TLS;rport;branch=z9hG4bKPjVyD4phvDCvz6cLbS4ufXuqALaJ5B2VD5;alias
Max-Forwards: 70
From: "101" <sip:[email protected]>;tag=D2eu54UwSprdXvxGxunShoyqIHlpMwci
To: <sip:[email protected]>
Contact: "101" <sip:[email protected]:54061;transport=TLS;ob>
Call-ID: -2ZA2.XIgvGdu6gfwx5YKjunZkhO4fvy
CSeq: 15293 INVITE
Supported: replaces, 100rel, timer, norefersub
Session-Expires: 1800
Min-SE: 90
User-Agent: Digium D65 2_9_25 000FD30BEFAD
Authorization: Digest username="101", realm="asterisk", nonce="", uri="sip:[email protected]:5061;transport=tls", response="", algorithm=MD5, cnonce="", opaque="", qop=auth, nc=00000001
Content-Type: application/sdp
Content-Length: 697

v=0
o=- 412631110 412631110 IN IP4
s=digphn
b=AS:84
t=0 0
a=X-nat:0
m=audio 4012 RTP/SAVP 0 8 9 111 107 118 58 58 96
c=IN IP4
b=TIAS:64000
a=rtcp:4013 IN IP4
a=sendrecv
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:9 G722/8000
a=rtpmap:111 G726-32/8000
a=rtpmap:107 opus/48000/2
a=fmtp:107 maxptime=20;maxplaybackrate=16000;maxaveragebitrate=20000;sprop-maxcapturerate=16000;usedtx=0
a=rtpmap:118 L16/8000
a=rtpmap:58 L16/16000
a=rtpmap:58 L16-256/16000
a=rtpmap:96 telephone-event/8000
a=fmtp:96 0-16
a=ssrc:1611397428 cname:691bea5415afac78
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:BUdz7VsKLc6UhMQ0Oz+1SGauSdf3eHHYvakiw0Fx
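To see which codecs account for the bulk of an SDP like this, here is a rough Python sketch that adds up the bytes of the `a=rtpmap` and `a=fmtp` lines per payload type. The sample lines are copied from the trace above; the function name is made up, and the count ignores the payload numbers on the `m=` line, so treat it as an estimate.

```python
def codec_cost(sdp: str) -> dict:
    """Bytes of rtpmap/fmtp lines per payload type, counting a CRLF per line."""
    cost = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith(("a=rtpmap:", "a=fmtp:")):
            payload_type = line.split(":", 1)[1].split()[0]
            cost[payload_type] = cost.get(payload_type, 0) + len(line) + 2
    return cost


# Codec lines taken from the INVITE above.
SDP_CODEC_LINES = "\r\n".join([
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:9 G722/8000",
    "a=rtpmap:111 G726-32/8000",
    "a=rtpmap:107 opus/48000/2",
    "a=fmtp:107 maxptime=20;maxplaybackrate=16000;maxaveragebitrate=20000;"
    "sprop-maxcapturerate=16000;usedtx=0",
    "a=rtpmap:118 L16/8000",
    "a=rtpmap:58 L16/16000",
    "a=rtpmap:58 L16-256/16000",
    "a=rtpmap:96 telephone-event/8000",
    "a=fmtp:96 0-16",
])

# Print the most expensive payload types first.
for payload_type, size in sorted(codec_cost(SDP_CODEC_LINES).items(), key=lambda kv: -kv[1]):
    print(f"payload {payload_type}: {size} bytes")
```

In this offer the Opus entry with its long `fmtp` line is the single largest contributor, so if the phone's own codec list can be trimmed, that would be the first one to drop when only ulaw is needed.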