RTP stream statistics and RTCP monitoring for quality control

Hi all,

Though not strictly FreePBX related, I still consider this topic vital to running a sound PBX solution and therefore worthwhile to share here. This will provide detailed insight in the network quality.

Primer
Tshark is Wireshark’s command line counterpart. It has a good deal of Wireshark’s functions and features, amongst which 2 I’m utilizing here: RTP stream statistics and custom selection of (RTCP) packet fields.

Example results
RTP stream statistics will log something like below;

==============================================================
Mon Mar 28 15:00:01 CEST 2011
========================= RTP Streams ========================
    Src IP addr  Port    Dest IP addr  Port       SSRC          Payload  Pkts         Lost   Max Delta(ms)  Max Jitter(ms) Mean Jitter(ms) Problems?
      10.0.0.94 21560       10.0.0.20 13054 0xFDE2D779 ITU-T G.711 PCMU  5344     0 (0.0%)           24.61            0.74            0.10 
      10.0.0.20 19900 193.169.138.162 10228 0x113E791C ITU-T G.711 PCMU  5343     0 (0.0%)           26.15            1.00            0.13 
193.169.138.162 10228       10.0.0.20 19900 0x77F87D71 ITU-T G.711 PCMU  5344     0 (0.0%)           26.11            1.28            0.29 
      10.0.0.20 13054       10.0.0.94 21560 0x0C92CC76 ITU-T G.711 PCMU  5344     0 (0.0%)           29.08            1.70            0.31 
193.169.139.167 11156       10.0.0.20 12318 0x1C261F02 ITU-T G.711 PCMU  2305     0 (0.0%)           28.40            1.41            0.42 
      10.0.0.20 19612       10.0.0.68 28782 0x5EC4AEE8 ITU-T G.711 PCMU  2301     0 (0.0%)           28.46            1.45            0.44 
      10.0.0.68 28782       10.0.0.20 19612 0x0DD89BE3 ITU-T G.711 PCMU  2299     0 (0.0%)           27.37            2.35            0.10 
      10.0.0.20 12318 193.169.139.167 11156 0x10DB52A4 ITU-T G.711 PCMU  2299     0 (0.0%)           32.70            2.39            0.13 

RTCP packet field selection will log something like below;

2011-03-28 15:55:48.790319    10.0.0.20 -> 193.169.138.162 RTCP      id == 0x28f32e0b  id == 0x3061e0d8  jtr == 19  cumnr == 2  fract == 1
2011-03-28 15:56:08.802233    10.0.0.20 -> 193.169.138.162 RTCP      id == 0x28f32e0b  id == 0x3061e0d8  jtr == 26  cumnr == 3  fract == 1
2011-03-28 15:56:28.790554    10.0.0.20 -> 193.169.138.162 RTCP      id == 0x28f32e0b  id == 0x3061e0d8  jtr == 21  cumnr == 4  fract == 1
2011-03-28 16:30:09.073794 193.169.139.167 -> 10.0.0.20    RTCP      id == 0x649cfc1c  id == 0x1afcadf2  jtr == 50  cumnr == 4  fract == 4
2011-03-28 16:30:14.073483 193.169.139.167 -> 10.0.0.20    RTCP      id == 0x649cfc1c  id == 0x1afcadf2  jtr == 51  cumnr == 6  fract == 2
2011-03-28 16:30:24.074152 193.169.139.167 -> 10.0.0.20    RTCP      id == 0x649cfc1c  id == 0x1afcadf2  jtr == 58  cumnr == 7  fract == 1
2011-03-28 16:35:54.982355    10.0.0.20 -> 193.169.138.166 RTCP      id == 0x1db75684  id == 0x3407dae3  jtr == 19  cumnr == 1  fract == 1
2011-03-28 16:43:44.978312    10.0.0.20 -> 193.169.138.166 RTCP      id == 0x1db75684  id == 0x3407dae3  jtr == 24  cumnr == 2  fract == 1

Where fract is the number of packets lost in this interval (5 second intervals of 250 packets total (50/s)).

Location, location, location
RTP stream statistics are calculated from the RTP media traffic captured by tshark on the pbx itself. So stats on streams sent by the PBX should by quite perfect. But this doesn’t say anything on how the other end will receive the stream. So the stream statistics are primarely usefull for determining the quality of incoming streams.

RTCP packets on the other hand contain statistics themselves, generated by a phone, pbx, trunk provider (if you’re lucky) and sent over to provide quality feedback. Asterisk generates them, Cisco phones generate them, my trunk provider generates them, and so do probably a whole lot of others.

Complications
Because tshark only outputs the RTP stream statistics when ended, I make it restart every 15 mins. This can cause it to be started again in the middle of a conversations, which UDP traffic it might not (immediately) interpret as RTP again. Further, sometimes off the chart figures are seen, but these are easily recognized and can only be ignored.

According to various records on the internet RTCP is far from perfect too: differences in calculation, odd figures, trouble passing NAT (it typically uses the RTP port +1) or even no support at all. Again, anomalies are often recognizable and can only be ignored.

As is true for all statistics; don’t consider the numbers complete or perfect, just give sufficient thought to their origin.

The source
The RTP stream monitor (schedule in cron to run ie. every 15 mins):

# Kill the running process
kill `cat ~/stream.pid`
# Remove any temporary files tshark might have left
rm -f /tmp/ether*
# Clean up files older then 20 days
/usr/bin/find /var/rtpmon -type f -mtime +20 | /usr/bin/xargs /bin/rm -f >/dev/null 2>&1
# Set current day for log file
STREAMDAY=$(date +"%m-%d-%Y")
# Append the date to the log
/bin/date >> /var/rtpmon/streams$STREAMDAY.log
# Start the new monitor process
/usr/sbin/tshark -a duration:897 -qp -z rtp,streams >> /var/rtpmon/streams$STREAMDAY.log &
# Write the started process's PID to file
echo $! > ~/stream.pid

The RTCP monitor (scheduled in cron to restart at midnight) (also note that I’m filtering out RTCP reports that have no packet loss):

# Kill the running process
kill `cat ~/rtcpmon.pid`
# Remove any temporary files tshark might have left
rm -f /tmp/ether*
# Clean up files older then 20 days
/usr/bin/find /var/rtcpmon -type f -mtime +20 | /usr/bin/xargs /bin/rm -f >/dev/null 2>&1
# Set current day for log file
STREAMDAY=$(date +"%m-%d-%Y")
# Append the date to the log
/bin/date >> /var/rtcpmon/rtcpmon$STREAMDAY.log
# Start the new monitor process
tshark -ltad -z proto,colinfo,rtcp.ssrc.fraction,rtcp.ssrc.fraction -z proto,colinfo,rtcp.ssrc.cum_nr,rtcp.ssrc.cum_nr  -z proto,colinfo,rtcp.ssrc.jitter,rtcp.ssrc.jitter -z proto,colinfo,rtcp.ssrc.identifier,rtcp.ssrc.identifier -R "rtcp.ssrc.fraction>0" | sed -u 's/rtcp.ssrc.identifier/id/g' | sed -u 's/rtcp.ssrc.jitter/jtr/g' | sed -u 's/rtcp.ssrc.cum_nr/cumnr/g' | sed -u 's/rtcp.ssrc.fraction/fract/g' | sed -u 's/Sender Report   Source description//g'  >> /var/rtcpmon/rtcpmon$STREAMDAY.log &
# Write the started process's PID to file
echo $! > ~/rtcpmon.pid

I hope anyone finds this usefull.

With best regards,

Bas

Nice thanks.

Just a little sed hint for you. You can just call sed once so all those substitutions can be done like this.

sed -u -e '{ s/rtcp.ssrc.identifier/id/g s/rtcp.ssrc.jitter/jtr/g s/rtcp.ssrc.cum_nr/cumnr/g s/rtcp.ssrc.fraction/fract/g s/Sender Report Source description//g }'

Thanks for the informative post.