Continuing from Hooking for fun and income
This post is a very basic introduction to SIP with specific focus on SIP for FreePBX and Asterisk. I know SIP reasonably well, but do not consider myself an expert. I deal with SIP signalling issues almost every single work day, and what I know of SIP serves me well when I stay in my lane (the FreePBX ecosystem). SIP is not simple, but it’s not rocket science either. As is the case with most things, debugging SIP problems gets easier with experience. If you spend your days looking at normal SIP dialogs, the occasional abnormal ones jump out at you. Unfortunately for the target reader of this post, you won’t have much experience debugging SIP nor enough of an understanding of the basics to even ask for help. This post should assist with the latter. Intermediate level readers and above are encouraged to metaphorically point and laugh at my crude explanations and dumb mistakes.
As stated in part 1 linked above, the end goal of the lecture series is to provide the casual user with enough tools to customize SIP headers beyond what is possible in the GUI, but that is still beyond the scope of today’s post. So without further preamble, I present …
What is SIP
The SIP protocol is standardized in RFC 3261 with numerous updates, extensions and clarifications. Grossly simplifying, SIP defines a signalling protocol that allows media to be exchanged between endpoints. SIP as it’s used in Asterisk/FreePBX for audio calls involves only a fraction of overall SIP capabilities. Go ahead and take a glance at the linked RFC, but there is no need for the beginner to read it; I haven’t read it in full myself, nor would doing so help me much in my day to day work. SIP is probably the most common call protocol, but note that the FreePBX/Asterisk ecosystem also supports others, notably IAX2. They are beyond the scope of this post.
Recognizing a SIP issue
It’s not always easy to tell a signalling issue from a dialplan or config issue with the PBX and/or phone. Calls with one-way or no audio are usually the result of a misconfiguration, and can be debugged by looking at signalling. Calls that drop mysteriously at the X minute mark or after X seconds are a signalling issue. Failed inbound or outbound calls and registration failures can often be debugged by looking at signalling, even if the cause is a misconfig of the trunk or phone. Intermittent failure of calls to a phone, or a phone that goes unreachable, are probably signalling problems. Things like garbled audio, dropped packets and audio jitter are not signalling issues. Calls that are misdirected, say to a queue instead of a ring group, are not signalling issues.
The Nuts and Bolts
In its basic form, SIP is done in plain text and is easily human readable without decoding. When a SIP peer (or endpoint) initiates communication with another peer, it will send a SIP packet. The other peer responds with another packet. Packets are sent back and forth in accordance with the RFC standard(s). Individual packets are grouped into transactions, and all the transactions for a single SIP session are called a SIP Dialog. This exchange of packets, transactions and dialogs is what is referred to as SIP Signalling. When a dialog involves an audio call, the SIP signalling sets up the session so that the media (i.e. the call audio) can be exchanged. SIP signalling and media are separate, involving different port ranges and potentially different hosts.
Anatomy of a Dialog
A SIP packet has a number of headers, which will vary by method, but you’ll always have a Call-ID and a CSeq.
All of the packets in a single dialog will have a matching Call-ID. The CSeq increments by 1 for each new transaction. For example, a phone sends an INVITE to a PBX with a CSeq of 29 and a Call-ID of X. The PBX responds to the phone with 401 Unauthorized, and that SIP packet will have the same Call-ID and the same CSeq. The phone sends an ACK with the same CSeq and the same Call-ID. This ends the first transaction. A new transaction starts with the phone sending a new INVITE, this time with authorization details. The CSeq will now be 30, and the Call-ID remains the same as before. The PBX responds with 200 OK and a CSeq of 30 matching the second INVITE, and the phone sends an ACK, again with a CSeq of 30 and matching Call-ID. At the end of the call, the phone sends a BYE, this time with a CSeq of 31, and the PBX responds with 200 OK and CSeq 31. A classic failure of SIP signalling is seeing the same packet being sent over and over with the same CSeq number. This tells you that the sender is not getting the reply it is waiting for, usually because either the packet or the response to it is being lost somewhere along the path.
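The "same packet, same CSeq, over and over" pattern can also be spotted mechanically. What follows is only a rough sketch, not a polished tool. The function name sip_repeats is made up for this example, and it assumes the SIP messages are already available as plain text with one header per line (for instance copied out of a console session). It counts how often each combination of Call-ID, CSeq number and first line appears, and prints only the ones seen more than once.

```shell
# Read SIP messages as plain text on stdin and report any message that
# appears more than once with the same Call-ID, CSeq number and first line.
# sip_repeats is a hypothetical helper name used only for this sketch.
sip_repeats() {
    awk '
        # Count the message collected so far, then reset for the next one.
        function tally() {
            if (first != "") count[callid " CSeq " cseq " | " first]++
            first = callid = cseq = ""
        }
        { sub(/\r$/, "") }                    # tolerate CRLF line endings
        # A status line starts with SIP/2.0, a request line ends with it.
        /^SIP\/2\.0 [0-9]/ || / SIP\/2\.0$/ { tally(); first = $0; next }
        tolower($1) == "call-id:" { callid = $2 }
        tolower($1) == "cseq:"    { cseq = $2 }
        END {
            tally()
            for (k in count) if (count[k] > 1) print count[k] "x " k
        }
    '
}
```

Feed it a text file of SIP messages, for example sip_repeats < messages.txt (the filename is a placeholder). A healthy dialog produces no output at all, while a retransmitted INVITE or 200 OK shows up with its repeat count in front.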
Seeing the Magic
There are a few ways to examine SIP dialogs so you can see what is going on behind the scenes. You can use tcpdump to dump all the raw interface data to a packet capture file, download the pcap, then open it in Wireshark. Wireshark is probably the best tool for this, but it comes with a steep learning curve. I use Wireshark, but I would not say I’m comfortable with it, nor would I acquit myself well were I to attempt a Wireshark tutorial. The tcpdump/Wireshark method is cumbersome in that you can’t see the SIP data live, but you do end up with a pcap file that can be saved and shared.
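For anyone who has not used tcpdump before, a capture along these lines is a reasonable starting point. Treat it as a sketch under assumptions: the wrapper name capture_sip is invented for this example, and port 5060 is only the conventional SIP port, so substitute whatever port your PBX actually listens on.

```shell
# Capture SIP signalling to a pcap file until interrupted with Ctrl-C.
# capture_sip is a hypothetical wrapper name; run it as root.
# Usage: capture_sip /tmp/sip-debug.pcap
capture_sip() {
    # -i any : listen on all interfaces
    # -n     : do not resolve addresses to names
    # -s 0   : capture whole packets rather than truncating them
    # -w     : write the raw packets to the named file
    # "port 5060" is an assumption; change it to match your PBX.
    tcpdump -i any -n -s 0 -w "$1" port 5060
}
```

The port filter keeps the file small by leaving out the media. To look at audio as well, the filter would need to be widened to include the RTP port range, at the cost of a much larger capture.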
Alternatively, you can open the Asterisk console and enable SIP debug using the various commands for each channel driver. When SIP debug is enabled, the SIP packets show up interleaved with all the other Asterisk console output. This makes it easy to share via pastebin or text attachment, but I find it excruciatingly difficult to get a clear picture of a SIP dialog when the individual packets are spread over hundreds of lines in a console session or large log file. I’m sure there are users comfortable with this method, so if you have tips to share on how to filter specific SIP dialogs from the full Asterisk log, I’d love to see them.
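One possible way to pull a single dialog out of a log is to filter on its Call-ID. The sketch below rests on assumptions about the log layout that you should check against your own file: that each logged SIP message begins on a line containing the marker "<--- ", that the remaining lines of the message have no timestamp prefix, and that ordinary log lines start with "[". The function name sip_dialog_from_log is made up for this example. The comments also list the console commands I am aware of for turning SIP debug on.

```shell
# Enabling SIP debug from the shell:
#   asterisk -rx "pjsip set logger on"    (chan_pjsip)
#   asterisk -rx "sip set debug on"       (chan_sip)
#
# Print every logged SIP message whose Call-ID header contains $1.
# sip_dialog_from_log is a hypothetical helper name.
# Usage: sip_dialog_from_log SOME-CALL-ID < /var/log/asterisk/full
sip_dialog_from_log() {
    awk -v id="$1" '
        # Print the collected message if it matched, then reset.
        function emit() {
            if (keep) printf "%s", block
            block = ""; keep = 0
        }
        # Assumed start-of-message marker written by the SIP logger.
        /<--- / { emit(); inmsg = 1; block = $0 "\n"; next }
        # Any other timestamped log line ends the current message.
        /^\[/   { emit(); inmsg = 0; next }
        inmsg {
            block = block $0 "\n"
            if (tolower($0) ~ /^call-id:/ && index($0, id)) keep = 1
        }
        END { emit() }
    '
}
```

The Call-ID itself can be copied from any one packet of the dialog you care about. Because Asterisk is on both sides of a bridged call, the second leg has its own Call-ID and has to be extracted separately.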
There has to be a better way!
You will periodically see references here in the forum to sngrep. Starting in Distro SNG7, the sngrep utility is installed by default, and if not installed you can install it with yum install sngrep. From the bash prompt you run it by typing sngrep. When run without any arguments, sngrep will show you all SIP packets inbound and outbound on all interfaces in real time. Horizontally across the bottom are individual menus that can be triggered with the F keys. Individual SIP dialogs are displayed vertically on the main screen. Up/Down arrow keys are used to move between dialogs, and enter is used to view the highlighted SIP dialog. Space bar is used to toggle select for one or more dialogs for further action. Escape is used to back up a screen or exit the program.
On a busy system, the main screen quickly becomes cluttered with lots of SIP dialogs, the majority of which you probably don’t care about. The first step for me is usually to press F7 (filter) and, using the arrow keys and space bar, disable display of all packet types that don’t interest me. If you are debugging a registration issue, disable everything except REGISTER. To debug calls, disable everything except INVITE. When you press enter to activate the filter, only dialogs of interest are shown. On very busy systems you may need to press F3 and enter a filter string such as an extension number or IP address to further filter dialogs.
Use the arrow keys to highlight a dialog then press enter. All the individual packets for a single dialog are shown in a ladder diagram. IP addresses for the SIP endpoints are displayed at the top. Individual packets are shown with arrows indicating direction. The arrow keys allow you to browse the individual packets so you can see SIP headers (and body if applicable) on the right. When viewing an INVITE dialog, the RTP (media) details are not displayed by default; press F3 to toggle RTP display.
As Asterisk is a back to back user agent, a typical call scenario involves one endpoint establishing a SIP session to the PBX, the PBX in turn establishing another SIP session (or sessions), and then bridging them. So a simple outbound call from a phone through a SIP Trunk will involve at least 2 dialogs which both display separately on the main sngrep screen. To view multiple related dialogs, use the space bar to select all of them and press enter to view all the individual packets in sequence as a ladder. You need to deselect the dialogs again to view others.
Once you press F5 to clear the main screen or once you exit sngrep, all SIP information is lost. You can save individual dialogs by using the space bar to select whatever you want to save and pressing F2. Enter a filename and the SIP Signalling (not the media) will be saved. You can download the pcap to share or open with Wireshark. Know that sngrep does something odd with the way it saves pcaps: individual packets are grouped by dialog, not written chronologically. They may look odd in Wireshark as a result. You can also open a pcap in sngrep by using the -I command line option:
sngrep -I /path/filename.pcap
Asterisk is ignoring my packets
Just because sngrep shows an inbound SIP packet to the PBX does not mean that Asterisk sees that packet. sngrep shows the raw data on the interface before it goes through the firewall. If the IP address of the originating packet is blocked by iptables, the result in sngrep is an inbound packet with no corresponding response from the PBX. If these packets are legitimate, you need to revisit your Firewall and Intrusion Detection config. You will also see this if the inbound packets arrive on the wrong port.
Limitations of sngrep
You will sometimes see a SIP dialog with a missing packet, and very rarely an entire SIP dialog is missing. For example, you might see a phone send an INVITE to the PBX, and then immediately send an ACK to the PBX with no packet coming from the PBX to the phone. Experience with SIP would tell you that the PBX must have sent a 200 OK in between those two packets, but sngrep did not catch it. This is, unfortunately, somewhat common, so when in doubt do several tests and compare them. While much rarer, I’ve seen the same with tcpdump as well.
sngrep can’t handle WebRTC signalling, so signalling debug for Zulu (or any other WebRTC client) is out. Zulu calls that go out an outbound trunk will show up as a single dialog, for the trunk leg of the call only.
Encrypted TLS signalling is supported by sngrep, but not in the version available in the Distro. Debugging TLS signalling is pretty much limited to Asterisk console and logfiles.
sngrep is pretty good but not perfect and does not replace Wireshark. It has limited utility when it comes to debugging RTP, so issues with audio mean tcpdump and Wireshark. I know of no way to use sngrep to debug media issues involving jitter, packet loss, etc., however this post by @PitzKey shows how to enable RTP so it can be exported.
I realize that any reader who didn’t know what the signalling should look like for a specific event, will still not know. But you have some basic tools now to find out. Run sngrep and look at a REGISTER dialog, an OPTIONS dialog and an INVITE. Watch what happens when a phone boots, when the MWI indicator on a phone changes. With the exception of an INVITE, all the dialogs look pretty much the same and are easy to decode. When you run into problems, you can ask for help here with actual useful information instead of vague questions.
Lorne Gaetz, SSCA®