Nagios monitoring FreePBX 13

Hello,

Putting feelers out to see if others are using Nagios to monitor their FPBX installs, and if so which modules?

We’ve been using Nagios with check_sip to monitor older 2.11 installs, however since moving to 13 & PJSIP this is now getting an Unauthorised response. I guess this module doesn’t allow for proper authentication, or because it just sends an OPTIONS packet perhaps chan_sip wouldn’t authenticate that and pjsip does?

Thanks.

I use Nagios for monitoring all of my systems. Check the options on the check_sip program - I think there’s a -U option that allows you to specify a username…

Pretty sure the sys admins are setting the username to a known extension (we put a test extension on each PBX). There is no option for a password though.

Are you using check_sip with pjsip? Afaik all it does is send an OPTIONS and waits for a response, I don’t think it attempts to place a call at all.
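
For reference, an OPTIONS probe of the kind described here fits in a few lines. The following is a minimal Python sketch, not check_sip's actual code: the function names, the "nagios" From user, the 127.0.0.1 placeholder address and the branch/tag values are all assumptions made up for illustration. It builds an unauthenticated OPTIONS request and reads the status code from whatever comes back.

```python
import socket
import uuid


def build_options(target_host, target_port=5060, local_host="127.0.0.1",
                  local_port=5060, user="nagios"):
    """Build a bare SIP OPTIONS request as bytes (no authentication)."""
    call_id = uuid.uuid4().hex
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch=z9hG4bK{call_id[:16]}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_host}>;tag={call_id[16:24]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {call_id}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(reply):
    """Return the numeric status code from a SIP reply, or None."""
    first_line = reply.decode("ascii", "replace").split("\r\n", 1)[0]
    parts = first_line.split(" ", 2)
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].isdigit():
        return int(parts[1])
    return None


def probe(target_host, target_port=5060, timeout=3.0):
    """Send one OPTIONS over UDP; return the status code, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", 0))
        local_port = sock.getsockname()[1]
        request = build_options(target_host, target_port, local_port=local_port)
        sock.sendto(request, (target_host, target_port))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None
    return parse_status(reply)
```

A monitoring check built this way could arguably treat any numeric reply, including a 401, as proof that the SIP stack is answering, and reserve the alarm for a timeout. Whether that is acceptable depends on what you want the check to prove.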

This would be a good ARI project.

You’re right. Could start out with the existing check_sip script and add some functionality for checking some of the other technologies as well. Getting a health check from the server would be a snap.

There are already check_ scripts for connections and stuff like that - starting with those and trying to connect through ARI would be relatively simple.

Some examples

# curl -u 'USERNAME:PASSWORD' -X GET -H "Content-Type: application/json" 127.0.0.1:8088/ari/asterisk/info
{
  "build": {
    "os": "Linux",
    "kernel": "2.6.32-431.el6.x86_64",
    "machine": "i686",
    "options": "COMPILE_DOUBLE, LOADABLE_MODULES, OPTIONAL_API",
    "date": "2016-10-05 00:22:11 UTC",
    "user": "mockbuild"
  },
  "system": {
    "version": "13.11.2",
    "entity_id": "52:54:00:f5:21:37"
  },
  "config": {
    "name": "",
    "default_language": "en",
    "setid": {
      "user": "",
      "group": ""
    }
  },
  "status": {
    "startup_time": "2016-10-10T18:06:54.623-0700",
    "last_reload_time": "2016-10-10T18:18:59.706-0700"
  }
}
# curl -u 'USERNAME:PASSWORD' -X GET -H "Content-Type: application/json" 127.0.0.1:8088/ari/endpoints
[
  {
    "technology": "PJSIP",
    "resource": "1001",
    "state": "offline",
    "channel_ids": []
  },
  {
    "technology": "PJSIP",
    "resource": "Test",
    "state": "offline",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "1002",
    "state": "online",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "1004",
    "state": "online",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "jamws",
    "state": "unknown",
    "channel_ids": []
  },
  {
    "technology": "PJSIP",
    "resource": "anonymous",
    "state": "offline",
    "channel_ids": []
  },
  {
    "technology": "PJSIP",
    "resource": "122222",
    "state": "offline",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "1233333",
    "state": "unknown",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "fpbxtrunk1",
    "state": "online",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "991002",
    "state": "unknown",
    "channel_ids": []
  },
  {
    "technology": "SIP",
    "resource": "fpbxtrunk2",
    "state": "online",
    "channel_ids": []
  }
]
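
To turn the endpoint listing into something Nagios can consume, a check only has to count states and pick an exit code. Here is a rough Python sketch under a few assumptions: the function names, the thresholds and the output wording are invented for illustration, and it counts "offline" endpoints against the warning/critical levels, which may not be the policy you want.

```python
import base64
import json
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2


def fetch_endpoints(base_url, username, password, timeout=5.0):
    """GET /ari/endpoints with HTTP basic auth and return the decoded JSON list."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/ari/endpoints",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(endpoints, technology, warn, crit):
    """Count endpoint states for one technology and choose a Nagios exit code."""
    states = [e["state"] for e in endpoints if e["technology"] == technology]
    online = states.count("online")
    offline = states.count("offline")
    unknown = len(states) - online - offline
    if offline >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif offline >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    text = (f"{label}: {online} online, {offline} offline, {unknown} unknown "
            f"{technology} endpoints|online={online} offline={offline}")
    return code, text


if __name__ == "__main__":
    # USERNAME and PASSWORD are the same placeholders used in the curl examples.
    data = fetch_endpoints("http://127.0.0.1:8088", "USERNAME", "PASSWORD")
    exit_code, message = summarize(data, "PJSIP", warn=2, crit=4)
    print(message)
    raise SystemExit(exit_code)
```

Fed the listing above, the PJSIP count would come out as 0 online and 4 offline.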

I just checked my Nagios server, and I’m using a modified version of check_asterisk_ami to manage my Asterisk monitoring. I added code to do pjsip, sip, sccp, and dahdi “peer” checking, as well as the existing “channels”, etc. I’ll put it up on the Cynjut Sourceforge page in a few minutes, after I install an Asterisk 13 PJ-SIP enabled system so I can test the code out.

UPDATE: Nagios Plugin for Asterisk is now available. It uses “pjsip show endpoints” to get the online/offline equivalent of “sip show peers”.

Hi. Can anyone share a “how to” guide on setting up and using nagios to monitor freepbx servers? I’ve been installing and supporting freepbx servers for a while now … actually since Tony Lewis went to trixbox training in Boston LOL, but I’m a total noobie when it comes to nagios or how I can monitor the health of my freepbx install base remotely.

I don’t have a good “How To” reference, but I do recommend you get one of the Nagios “managers” once you get the software installed. Also, up front, Nagios is a lot of tool to monitor a single service. Installations of under 5 servers are probably better managed using simpler tools.

Also, FreePBX 13 implements Asterisk 11 or Asterisk 13. The AMI interface is available to both, but ARI is only available in Asterisk 13. At some point, I expect the AMI interface to be deprecated, at which time, the shell script that is the “Nagios Asterisk Interface” will no longer work.

There are two versions of Nagios now - the Open Source version and Nagios XI.

My situation is different than a lot of folks - I use NetBSD for my monitoring and management systems (I used to maintain the 386BSD FAQ back in the '90s). I wouldn’t install Nagios on a phone server - it’s kind of a resource pig. Because of that, I install everything using pkgsrc on a separate server. This simplifies my installation jobs, unless it goes horribly wrong (like it has this weekend). More on that later.

I use the Open Source version of Nagios. Nagios will run on any reasonable LAMP server. The display installs under an Apache server as a subdirectory in “/var/www/html”. It has a few prerequisites that need to be installed. If you are using this in a commercial space, I recommend the “XI” version. You’ll get better support than through the forums and they can help you with the early setup stuff.

As far as management goes, Webmin used to have a management module for it, but I haven’t used it in years. I’ve also been playing around with NagiosQL to set up the system. You can also start with something like “nmap2nagios” to get the basic network map. Basically, they manage the config file (OS uses config files, XI uses a database) and help you set up host and service templates, as well as all the rest of the little onsey-twosy settings in the system.

Because of the comprehensive nature of the system, it does take some time to get it set up. It also requires some tuning once you get it installed (to filter out the false positives). Once you have the basic service monitoring set up,

The Nagios AMI interface (that I mentioned in a message above) requires an Asterisk manager login and access to AMI from outside the phone server. You will probably need to set up the “Collector” IP address as a trusted host in the FreePBX firewall with access to port 5038. It can be used for things like channel usage, active calls, on-line/off-line phones and pretty much anything else that AMI exposes. A couple hints: when you set up the manager user, you have to give it write access - I don’t know why. If you try to give it “read only” access, the system won’t connect.
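
If you want to confirm the manager login independently of the plugin, the AMI wire format is simple enough to test by hand. This is a minimal Python sketch, assuming the usual "Key: value" text protocol on port 5038; the function names are hypothetical and it only checks the login, nothing else.

```python
import socket


def build_action(name, **fields):
    """Serialise one AMI action: 'Key: value' lines ended by a blank line."""
    lines = [f"Action: {name}"] + [f"{k}: {v}" for k, v in fields.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")


def parse_message(raw):
    """Turn one AMI message block into a dict of header -> value."""
    message = {}
    for line in raw.decode("utf-8", "replace").splitlines():
        key, sep, value = line.partition(":")
        if sep:
            message[key.strip()] = value.strip()
    return message


def ami_login_ok(host, username, secret, port=5038, timeout=5.0):
    """Log in to AMI and report whether the server answered 'Response: Success'."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        reader.readline()  # skip the one-line greeting banner
        sock.sendall(build_action("Login", Username=username, Secret=secret))
        block = b""
        while True:
            line = reader.readline()
            if line in (b"\r\n", b""):  # blank line ends the message; b"" is EOF
                break
            block += line
        sock.sendall(build_action("Logoff"))
    return parse_message(block).get("Response") == "Success"
```

Calling ami_login_ok with the collector's manager credentials from the Nagios host exercises the firewall rule, the permit list and the manager user in one go.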

The system does notifications through email, pagers, and other “custom” interfaces. If you run a ticketing system (Jira or RT4, for example), Nagios can be set up to interface with most of them through the notification interface. My system automatically creates or updates tickets for services and hosts that go down and closes them automatically if they come back up by themselves. This way, when the tech(s) get done fixing the problem, they can track their time.

There are also apps that you can load onto your phone that can interface with Nagios (NagMin is an example).

Hey thanks so much for the info. Wow yes this sounds like a major project which I’m not sure if I have the time to dedicate.

I guess I was hoping for something a little easier.

Can anyone else share info re how they monitor their install base of freepbx servers? Is there anything that would be easier yet that still monitors the primary functions of asterisk? We are trying to be more proactive with the service / maintenance of the servers we deploy for our clients.

I didn’t mean to hijack this thread, if I should start a new question please mention and I will take care of that.

No, not a problem. I started this thread with the intention of discussion and debate. I think your second-to-last post is still on-topic.

Thanks … well hopefully other folks will share what they are using or how they accomplish this.

Nagios, or as we call it in-house Nagy, is quite the capable animal, and depending on your installation, you have either some sort of semi-automated configuration tool with it, or like me, have had the installation running for 10+ years, and wrote the configuration files by hand. With some Pico / Nano / Emacs action, you too can have a functioning Nagy environment.

You may wish to perform these steps on a test FreePBX installation, especially if you are new to the Linux command line. Then again, if you have a functioning Nagy box, you have likely worn the letters off your keyboard during the setup process.

Each Nagy installation is completely custom. I cannot give a bulleted list of instructions, as each Linux install differs across families: RedHat/CentOS is different than OpenSuSE. But as you already have a working Nagy installation, you should be able to see where I am going with this code.

As mentioned above, you will need to enable an AMI account. This is done in Settings --> Asterisk Manager, and in my case, I created a new Manager. I also modified the permissions to be read only, with the exception of system and command – those two need to have write access. It is also a good idea to change the Deny / Permit fields, and limit the permitted IP numbers to your local subnet where your Nagy box lives. Do not give the bad guys gifts like unsecured ports.

Next, you need to go to your FreePBX box, and open port 5038 in the firewall. I opened mine on both TCP and UDP, although AMI itself only uses TCP.

Next, go to your Nagy box, and open a command line. Telnet into the Asterisk box to make sure that you can make a connection: telnet IP-of-FreePBX 5038. This command will initiate a connection on the port, and test your firewall settings. If you don’t receive an answer, drop the firewall, and see if it connects. Connection? You have a firewall config error. No connection? You have something else wrong and need to fix that first.
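
If telnet is not installed on the monitoring host, the same reachability test can be done from Python. A minimal sketch follows; the function name is made up and it only proves that the TCP port accepts connections, not that the AMI login works.

```python
import socket


def port_open(host, port=5038, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False
```

Run it once with the firewall up and once with it down, exactly as with the telnet test.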

Next, you need to go to your Nagy box, and locate where your check scripts are. Mine are in /usr/local/nagios/libexec and in there, I installed the Nagios Plugin for Asterisk described above. Make sure you change the owner (chown) and permissions (chmod 755) for the file so it will execute. I named my script check_pbx_ami.

Next, you will need the netcat command. Some Linux systems have nc, which is the same thing. Type in which netcat, and see if it is present in your path. If not, try which nc and look there. No executable? Then you need to install it. The script above wants netcat, and as I have nc, I changed the script to reference that instead.
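
The same lookup can be scripted if you deploy the plugin to several collectors. This is a small Python sketch; the function name is hypothetical, and "ncat" (the Nmap flavour) is my addition to the list of candidates, not something the plugin is known to support.

```python
import shutil


def find_netcat(candidates=("netcat", "nc", "ncat")):
    """Return the full path of the first netcat-like binary on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_netcat() or "no netcat found, install it first")
```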

Ok, now you can run the script by hand on the Nagy box, and see if Asterisk talks to you.

./check_pbx_ami -H ip-asterisk-box -q sippeers -u ami-username -p amipasswd!

Here is my box’s answer: OK: 15 online, 3 offline SIP peers|online=15 offline=3
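
The part after the pipe in that answer is performance data, which Nagios can hand to graphing add-ons. As an illustration of the format only, here is a short Python sketch that splits such a line; the function name is invented, and it assumes plain numeric values without unit suffixes, as in the output above.

```python
def parse_plugin_output(line):
    """Split 'TEXT|label=value ...' into the status text and a perfdata dict."""
    text, _, perf = line.partition("|")
    perfdata = {}
    for item in perf.split():
        label, sep, value = item.partition("=")
        if sep:
            # Keep only the value itself; thresholds after ';' are ignored here.
            perfdata[label] = float(value.split(";", 1)[0])
    return text.strip(), perfdata


if __name__ == "__main__":
    sample = "OK: 15 online, 3 offline SIP peers|online=15 offline=3"
    print(parse_plugin_output(sample))
```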

That’s good. If you don’t have an answer, you have a problem. Did the Telnet command above work?

Integration into Nagy
Assuming all is good, you now have connectivity from Nagy to FreePBX, and need to bolt the program into Nagy. Because Nagy service checks have to be unique, I needed to make a unique Nagy command to check each feature of the Asterisk box. This is done in the commands.cfg file in a standard installation.

define command{
    command_name    check_freepbx_sippeers
    # Warn me when 5 extensions are Offline, and Critical when 7 are offline
    command_line    $USER1$/check_pbx_ami -H $HOSTADDRESS$ -q sippeers -u user -p passwd -w 5 -c 7
}

define command{
    command_name    check_freepbx_channels
    # Warn me when more than 5 extensions are active. Might run out of trunks.
    command_line    $USER1$/check_pbx_ami -H $HOSTADDRESS$ -q channels -u user -p passwd! -w 5
}

define command{
    command_name    check_freepbx_calls
    # Warn me when more than 5 calls are active
    command_line    $USER1$/check_pbx_ami -H $HOSTADDRESS$ -q calls -u user -p passwd! -w 5
}

define command{
    command_name    check_freepbx_extension
    # Check a particular extension
    command_line    $USER1$/check_pbx_ami -H $HOSTADDRESS$ -q namepeer -n $ARG1$ -u user -p passwd!
}

define command{
    command_name    check_freepbx_iaxpeers
    command_line    $USER1$/check_pbx_ami -H $HOSTADDRESS$ -q iaxpeers -u user -p passwd! -w 5 -c 3
}

I defined the commands above in the commands.cfg file. The weak point is that the usernames and passwords are hard-coded into that text file. This is a secure box, but I wish there was a better way, and if you have one, tell us how to make it better. :)

Next, I had to make a matching entry into my services.cfg file

define service{
    use                     servers-alert ; Name of service template to use
    host_name               FreePBX
    service_description     Check Calls
    is_volatile             0
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          servers-alert
    notification_options    w,u,c,r
    notification_interval   960
    notification_period     24x7
    check_command           check_freepbx_calls
}

define service{
    use                     servers-alert ; Name of service template to use
    host_name               FreePBX
    service_description     Check Operator Extension
    is_volatile             0
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          servers-alert
    notification_options    w,u,c,r
    notification_interval   960
    notification_period     24x7
    check_command           check_freepbx_extension!103
}

Notice the two service definitions: you will need a definition for each command you want to check. Also note that the second one ends in !103. Did you see the $ARG1$ entry above, in the commands.cfg file? That’s how you pass a variable in Nagy. Here, I want the script to specifically check x103 for a problem.
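
To make the argument passing concrete, here is a toy Python model of the substitution. It is a simplification for illustration, not how Nagios is actually implemented: the function name is hypothetical, and the plugin path and the 192.0.2.10 address are example values.

```python
def expand_command(check_command, command_lines, macros):
    """Expand 'name!arg1!arg2' against the matching command_line template."""
    name, *args = check_command.split("!")
    line = command_lines[name]
    values = dict(macros)
    # Everything after the first '!' becomes $ARG1$, $ARG2$, and so on.
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


if __name__ == "__main__":
    commands = {
        "check_freepbx_extension":
            "$USER1$/check_pbx_ami -H $HOSTADDRESS$ -q namepeer -n $ARG1$ -u user -p passwd",
    }
    macros = {"$USER1$": "/usr/local/nagios/libexec", "$HOSTADDRESS$": "192.0.2.10"}
    print(expand_command("check_freepbx_extension!103", commands, macros))
```

One command definition can therefore serve any number of extensions; each service only changes the value after the exclamation mark.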

Go ahead and restart Nagios. If you have any typo problems, the service should alarm you with a warning and prompt you to fix it. If not, you have some things to test!

In the command file, I had some -w and -c listings, which are for warning and critical respectively. Note that the code requires a warning to be defined in order to support a critical, meaning that if you leave warning blank, you will never see a critical alarm. You will need to tune the numbers according to your environment.
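
The threshold behaviour can be pictured with a few lines of Python. This is a sketch of the logic as described in this post, not the plugin's own code: the function name is made up, and the use of "greater than or equal" for the comparison is an assumption.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def evaluate(value, warn=None, crit=None):
    """Map a count to a Nagios exit code; higher counts are treated as worse."""
    if warn is None:
        # Mirrors the behaviour described above: without -w, -c is never checked.
        return OK
    if crit is not None and value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    for offline in (3, 5, 9):
        print(offline, evaluate(offline, warn=5, crit=7))
```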

That’s how I did it. Feel free to try on your own.

Christian


That’s awesome.

A few additions - there are lots of pieces out there in the Nagios world that work with various versions of Asterisk. Some work better than others and some have been maintained better over time.

If you’re up for a really exciting weekend, you can implement SNMP on the server and use that as your Nagios data source. Fun will be had by all.

Before anyone jumps up to shout this down, Nagios can hammer your network if you set it up to be too aggressive. It can also flood your inbox with spurious “your system is still gronked” messages.

The advantage of this setup is that it’s a separate server, so you can monitor all of your Asterisk servers throughout the world, as well as mundane stuff like toner cartridge levels and web server stuff. In the early days (and I haven’t looked since), I added some code to “invert” the meaning of some of the checks. I used this for “SATAN” scans and other “bad ports” that I didn’t want open on some machines.

Thanks for the reasonably complete explanation. I’m sure people will be using it a lot.

Hello Dave,

Thank you for the code plugin sent to the Nagy plugins area. Appreciate your efforts in getting the commands hammered out so I could codify it into my box.

For the rest of you out there, yes, a Nagy box can monitor many things. Truly a remarkable piece of open source software. But it does take time to configure properly. Do not expect to get it right the first time. And yes it can generate a tremendous amount of email if you configure things wrong.

You can even send after-hours alarms to your boss by mistake. :)

Have a good evening,

Christian

Perhaps look to check-mk

https://mathias-kettner.de/checkmk_multisite.html

We use Nagios for monitoring all our FreePBX installs and it works great. I have one site that is constantly losing its connection to Asterisk. fwconsole restart fixes the problem but it would be nice to get an alert when this happens. I’m having to constantly check the GUI right now. This problem will be going away tonight after I re-install FreePBX but it would be nice to know if there is some way to have Nagios monitor the connection to Asterisk.

Any thoughts?

Maybe one of these

https://exchange.nagios.org/directory/Plugins/Telephony/Asterisk

Any AMI-based one would likely be the better choice (needs TCP 5038 open on your firewalls).

perhaps

https://mathias-kettner.com/

is a nice improvement to nagios.
