PRI/ Alarm Notification

jarvisswope · May 12, 2016, 8:22pm

Been doing some homework on how to get an email sent when a PRI goes down in FreePBX.

I found this, which dicko posted in another thread:
exec $(tail --follow=name /var/log/asterisk/full |while read LINE;do if [[ "$LINE" =~ "D-channel is down!" ]];then echo $LINE|mail -s "Cap'n we have a PRI problem!!" [email protected];fi;done ) &

but I can't get it to run and send an email.

When I run it at the CLI I get no errors and no email. There is a D-channel down error in the logs, so I know it is there.

From this post

munozj · May 12, 2016, 8:41pm

Try this

dicko · May 12, 2016, 8:55pm

You should probably leave out the !'s as they might be “interpreted” by your shell. But IWFM still.

jarvisswope · May 12, 2016, 8:59pm

I need one that monitors just the trunks, not one that fires when a call fails.

jarvisswope · May 12, 2016, 9:01pm

Yep already tried it. lol

this is the output

[root@pbx ~]# exec $(tail --follow=name /var/log/asterisk/full |while read LINE;do if [[ "$LINE" =~ "D-channel is down!" ]];then echo $LINE|mail -s "Cap'n we have a PRI problem!!" [email protected];fi;done ) &
exec $(tail --follow=name /var/log/asterisk/full |while read LINE;do if [[ "$LINE" =~ "D-channel is down" [email protected] ]];then echo $LINE|mail -s "Cap'n we have a PRI problem!!" [email protected];fi;done ) & ]];then echo $LINE|mail -s "Cap'n we have a PRI problem!!" [email protected];fi;done ) &

-bash: syntax error near unexpected token `]]'

jfinstrom · May 12, 2016, 9:09pm

dahdi_scan | grep alarms | awk -F"=" '{print $2}'

no need to hold open a file handle…

dicko · May 12, 2016, 9:11pm

As I said, take out ALL the ! characters, or single quote the strings.
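
A minimal sketch of the single-quoting advice, assuming bash. The names `DOWN_PATTERN` and `is_dchannel_down` are made up for illustration. Keeping the pattern in a single-quoted variable stops an interactive shell from history-expanding the `!`, and quoting the variable on the right of `=~` makes bash compare it as a literal string rather than as a regex.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when a log line reports the D-channel down.
# Single quotes keep an interactive shell from history-expanding the "!".
DOWN_PATTERN='D-channel is down!'

is_dchannel_down() {
    # A quoted right-hand side of =~ is matched as a literal substring.
    [[ "$1" =~ "$DOWN_PATTERN" ]]
}
```

It can then be used as `if is_dchannel_down "$LINE"; then ...; fi` inside the `while read` loop.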

dicko · May 12, 2016, 10:55pm

I would have to respectfully disagree with that, James. The OP wanted an email notification, so the underlying answer is to monitor the log lines generated by sig_pri. My solution monitors the D-channel events because any alarm condition of a “serious” nature on a span would very likely force the D-channel down, and thus send an email alert.

So for an immediate response, it IS necessary to keep that file handle (/var/log/asterisk/full by default) open, as that is the only synchronous way to monitor the PRI span. Further, it IS necessary to keep following that file across rotations, when the file handle changes.

If you spend time to deconstruct my method, I am sure you will agree.

(Although I would agree that an AGI script would also work, I believe it carries a much higher overhead, as it requires a network socket to stay open rather than a basic use of bash functions.)

Of course, you could monitor less explicit events by changing the regex match, just as I said in my original post, to suit other events on other technologies, as was originally requested.

I use the same method to monitor 911 calls as that is always a consideration when it comes to liability.

jfinstrom · May 12, 2016, 11:11pm

So I googled and got something I wrote 6 years ago for this. Zero additional overhead assuming you are already running fail2ban…

/etc/fail2ban/filter.d/pri_state.conf

[Definition]
failregex = VERBOSE.* .*:   == Primary D-Channel on span \d down
	    VERBOSE.* .*:   == Sending Set Asynchronous Balanced Mode Extended
	    VERBOSE.* .*: No D-channels available!  Using Primary channel 16 as D-channel anyway!
	    VERBOSE.* .*: No D-channels available!  Using Primary channel 24 as D-channel anyway!
	    
ignoreregex =

/etc/fail2ban/jail.conf

[pri-state]
enabled  = true
filter   = pri_state
action   = sendmail[name=PRI_STATUS, [email protected]]
logpath  = /var/log/asterisk/full
maxretry = 2

I think @dicko you may be more up to date on fail2ban stuff so you could probably comment on if this is still syntactically correct. Again this was written 6 years ago and I am assuming it worked back then.

dicko · May 12, 2016, 11:19pm

Probably, but again with respect, that is IMHO using a hammer when you need tweezers.

DAHDI is DAHDI, libpri is libpri. So monitoring sig_pri at the first derivative (as far as Asterisk is concerned, that is when the PRI breaks, DAHDI notices it and signals Asterisk) will surely be “more” failsafe.

Your solution would only notify after two loglines complaining that the D-channel is fu%^ed (which probably won’t ever happen unless it’s bouncing)

One line of bash is surely easier to debug than a WSL of file edits.

And a PRI/BRI without a D-channel is as useful as a fish with a bicycle.

jarvisswope · May 13, 2016, 4:02am

#!/bin/sh
ADMIN="emailaddress"
# set alert level red is default
ALERT=123
 dahdi_scan | grep "alarms=RED" | awk -F"=" '{print $2}' | while read output;
do
  #echo $output
  
  status=$(echo $output | awk '{ print $1 }' )
  if [  $ALERT ]; then
    echo "PRI Down  \"D-Channel is $status \" on $(hostname) as on $(date)" | 
     mail -s "Alert: PRI Down $usep" $ADMIN
  fi
done

jarvisswope · May 13, 2016, 4:17am

I also created this one, but I keep getting

syntax error: unexpected end of file

#!/bin/sh


#SCRIPT TO MONITOR PRI CARD FOR STATUS OF RED AND EMAIL


ADMIN="[email protected]"
# set alert level red is default
ALERT=RED
ALERT1=OK

 dahdi_scan | grep "alarms=RED" | awk -F"=" '{print $2}' | while read output;

if

  [#echo $output]

then
  
  status=$(echo $output | awk '{ print $1 }' )
  if [ $ALERT ]; then
    echo "PRI Down  \"D-Channel is $status \" on $(hostname) as on $(date)" | 
     mail -s "Alert: PRI Down $usep" $ADMIN


else

status=$(echo $output | awk '{ print $1 }' )
  if [ $ALERT1 ]; then
    echo "PRI Up  \"D-Channel is $status \" on $(hostname) as on $(date)" | 
     mail -s "Alert: PRI Up $usep" $ADMIN

       
  fi

dicko · May 13, 2016, 4:27am

So many things are wrong with that, starting at line 1: /bin/sh will throw errors, so use /bin/bash to begin with.

But I would have to ask, “how do you expect that to work?”

jarvisswope · May 14, 2016, 3:08pm

My goal was to read the output of dahdi_scan and check each line for RED: if RED exists, email me; if RED does not exist, email me as well, but with a different message.

I am not a scripting guru. I Google and then try things until I get them to work.

dicko · May 14, 2016, 6:39pm

Your proposed solution suffers because you would have to somehow rerun dahdi_scan all the time, and you would get an email every time. This is NOT what you want.
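
For comparison, here is a rough sketch of how a polling script could at least avoid the email-every-run problem by remembering the last state. It assumes dahdi_scan prints an `alarms=RED` line for a span in red alarm, as the grep in the scripts above expects. `STATE_FILE`, `span_state` and `report_change` are hypothetical names, not part of DAHDI.

```shell
#!/bin/bash
# Sketch only: STATE_FILE and both function names are hypothetical.
STATE_FILE="${STATE_FILE:-/tmp/pri_alarm_state}"

# Read dahdi_scan output on stdin; print RED if any span reports
# alarms=RED, otherwise print OK.
span_state() {
    if grep -q 'alarms=RED'; then
        echo RED
    else
        echo OK
    fi
}

# Compare the new state with the one saved by the previous run.
# Prints a message and returns 0 only when the state has changed.
report_change() {
    local new="$1" old=""
    [[ -f "$STATE_FILE" ]] && old=$(<"$STATE_FILE")
    echo "$new" > "$STATE_FILE"
    if [[ "$new" != "$old" ]]; then
        echo "PRI state on $(hostname) changed from ${old:-unknown} to $new at $(date)"
        return 0
    fi
    return 1
}
```

Run from cron, it might be used as `msg=$(report_change "$(dahdi_scan | span_state)") && echo "$msg" | mail -s "PRI state change" "$ADMIN"`, with `ADMIN` set to your own address. It is still polling, so it can only be as prompt as the cron interval.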

I suggest you start by watching /var/log/asterisk/full while you “pull the plug”:

tailf /var/log/asterisk/full |grep -E "chan_dahdi|sig_pri"

You will see what happens. You can filter on these two events as the important ones:

[2016/05/14 11:09:07] WARNING[79349] sig_pri.c: Span 1: D-channel is down!
[2016/05/14 11:11:29] VERBOSE[79349] sig_pri.c: [2016/05/14 11:11:29]   == Primary D-Channel on span 1 up

You will see a date stamp and a span/reason on each line. These are what I assume you want to email, so:

tail -F /var/log/asterisk/full|while read LINE;do if [[ "$LINE" =~ "D-channel is down" || "$LINE" =~ "Primary D-Channel on span" ]];then echo $LINE;fi;done

will uniquely identify those events to email. Just modify the echo $LINE to pipe through your mail program of choice, and add the final & from my original solution to put the script in the background.
Bob’s your uncle in one(ish) line of bash.
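
Spelled out over a few lines, that might look something like the sketch below. It assumes a working `mail` command; `ADMIN`, `notify` and `watch_pri_events` are placeholder names chosen for illustration.

```shell
#!/bin/bash
# Sketch only: ADMIN, notify and watch_pri_events are hypothetical names.
ADMIN="${ADMIN:-root}"

# Send one log line by email; swap in whatever mail program you use.
notify() {
    echo "$1" | mail -s "PRI D-channel event on $(hostname)" "$ADMIN"
}

# Read log lines on stdin and pass the D-channel up/down events to notify.
watch_pri_events() {
    local line
    while IFS= read -r line; do
        if [[ "$line" =~ "D-channel is down" || "$line" =~ "Primary D-Channel on span" ]]; then
            notify "$line"
        fi
    done
}

# To run it in the background against the live log:
#   tail -F /var/log/asterisk/full | watch_pri_events &
```

Keeping the mail call in its own function makes it easy to test the matching with a stub before pointing it at a real mailbox.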

As a side note, matching something like :-

"$LINE" =~ "[911@from-internal:1"

can also give you a heads up for possible “bad things” happening
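
A small sketch of that match as a reusable test, assuming bash. `EMERGENCY_PATTERN` and `is_emergency_call` are illustrative names. Because the pattern is quoted on the right of `=~`, bash compares it as a literal string, so the leading `[` does not need escaping.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when a log line shows the first priority
# of extension 911 in from-internal being executed.
EMERGENCY_PATTERN='[911@from-internal:1'

is_emergency_call() {
    # Quoted, so the "[" is matched literally rather than opening a
    # regex bracket expression.
    [[ "$1" =~ "$EMERGENCY_PATTERN" ]]
}
```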

cynjut · May 14, 2016, 10:27pm

I use the Monitor Trunk Failures feature to communicate trunk outages to my Nagios Server and then onto RT4 server.

This way, I get a ticket when the trunk goes down and (thanks to some cool programs in RT) the tickets resolve themselves automatically when the trunk comes back up. It shows up in NagMonDroid as a service outage as soon as it happens, and if it clears while I’m on my way in to troubleshoot, I can turn around and be more “measured in my response.”

I’m not saying that using a “tail -F” based solution is right or wrong. I just want to say that this is how I handle my production systems.

dicko · May 14, 2016, 10:35pm

Generally, trunk failures take an outbound call to trigger; this solution is at a hardware level. Other methods of detecting 911 calls only trigger on hangup, and many 911 services “hold the line open” on DAHDI, so that don’t work neither.

It is the “edge triggered event” (the “tail -F” based solution) that needs attention; consider using Consul for a better experience than Nagios, perhaps.

It takes about 60 microseconds to process each line. For your email address, use one of these (USA-specific):

AT&T – [email protected]
Verizon – [email protected]
T-Mobile – [email protected]
Sprint PCS - [email protected]
Virgin Mobile – [email protected]
US Cellular – [email protected]
Nextel - [email protected]
Boost - [email protected]
Alltel – [email protected]

I bet you get the 911 call attempt or a problem with your D-channel PDQ, unless of course your network is broken . Add juicessh to your android phone (maybe others) and be even “more measured” . . .

Nothing needs to be added, it’s all already there between the log file and the bash primitives. Why not implement both and watch the timeliness?

That is how I handle my production systems

james · May 16, 2016, 3:27am

If you have Sangoma cards, you can use wanpipemon -i -c Ta to get the debug info at the driver level.

dicko · May 16, 2016, 3:35am

You can indeed, the good thing about wanpipe is that it presents also as regular network interface(s) before looking like dahdi, so things like check-mk also provide “edge triggered” events on failure. (that’s a nagios/OMD thing)