How to verify that an endpoint is down from dial plan

I want to be able to check from the dialplan whether my primary endpoint is up and running. If the primary is down, I want to detect that so I can route traffic to a backup endpoint.

For testing, I used an IP address I know won’t work, 1.1.1.1; I will refer to this endpoint as the ‘broken primary’. The ‘backup’ endpoint is an Asterisk PBX that I have verified works.

I set up OPTIONS pings for both the broken primary and the backup. As expected, I don’t receive a response from the broken primary, but I do receive one from the working backup. So far so good.

Once the qualify timeout is reached, the Asterisk CLI shows the following message, indicating that the broken primary is unreachable, as expected.

-- Contact X.X.X.X/sip:1.1.1.1:5060 is now Unreachable. RTT: 0.000 msec
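For context, a PJSIP qualify setup of this kind might look roughly like the fragment below. This is only a sketch: the section name `broken-primary`, the context `phone`, and the timing values are placeholders I picked for illustration, not my exact configuration.

```ini
; pjsip.conf (sketch; names and timings are illustrative assumptions)
[broken-primary]
type=aor
contact=sip:1.1.1.1:5060
qualify_frequency=30      ; send an OPTIONS ping every 30 seconds
qualify_timeout=3.0       ; mark the contact Unreachable after 3 seconds without a reply

[broken-primary]
type=endpoint
context=phone
disallow=all
allow=ulaw,alaw
aors=broken-primary
```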

The problem is that I haven’t found a good way to check the qualify status of the ‘broken primary’ endpoint in my dialplan.

I did some research and saw that there is a SIPPEER function, so I tried to define the working backup endpoint as a peer by adding the following to my sip.conf:

[test-peer]
type=friend
context=phone
allow=ulaw,alaw
secret=1234
host=X.X.X.X ; in my conf this is a valid IP

But when I open the Asterisk CLI, I get the following notice, indicating that the peer is unreachable.

[2019-07-31 14:24:48] NOTICE[5691]: chan_sip.c:30181 sip_poke_noanswer: Peer 'test-peer' is now UNREACHABLE! Last qualify: 0

So essentially I am asking a few things:

  1. Is there a dialplan function analogous to SIPPEER for endpoints? I didn’t find one after an extensive search of the documentation, but I could be wrong. Note that I specifically need something that tells me whether an endpoint is up, i.e. the “status” item of SIPPEER as I understand it.

  2. Where is the following message logged:
    -- Contact X.X.X.X/sip:1.1.1.1:5060 is now Unreachable. RTT: 0.000 msec
    Perhaps I can check this log and do some parsing to leverage it for my purposes, but I was unable to locate its source.

  3. Is it even possible in Asterisk to determine the status of a remote endpoint using OPTIONS pings?

*** This is my first post and I am fairly new to Asterisk, so please excuse any missteps in jargon.

Within Asterisk, the DEVICE_STATE dialplan function[1] can be used to determine whether an endpoint is available. There is also the ChanIsAvail dialplan application[2]. Alternatively, you can simply attempt the dial and, if it fails, go to the failover.

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+13+Function_DEVICE_STATE
[2] https://wiki.asterisk.org/wiki/display/AST/Asterisk+13+Application_ChanIsAvail
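As a rough sketch of the DEVICE_STATE approach, the dialplan below checks the primary before dialing. The context name, the extension pattern, and the endpoint names `primary-endpoint` and `backup-endpoint` are hypothetical, and I am assuming that an endpoint whose contacts are all unreachable reports UNAVAILABLE; log the value on your own system before relying on it.

```ini
; extensions.conf (sketch; context, pattern and endpoint names are assumptions)
[outbound-check]
exten => _+1NXXNXXXXXX,1,Set(PRIMARY_STATE=${DEVICE_STATE(PJSIP/primary-endpoint)})
 same => n,NoOp(Primary device state: ${PRIMARY_STATE})
 ; Treat UNAVAILABLE or INVALID as "primary is down" and skip straight to the backup
 same => n,GotoIf($["${PRIMARY_STATE}" = "UNAVAILABLE" | "${PRIMARY_STATE}" = "INVALID"]?backup)
 same => n,Dial(PJSIP/${EXTEN}@primary-endpoint,30)
 same => n,Hangup()
 same => n(backup),Dial(PJSIP/${EXTEN}@backup-endpoint,30)
 same => n,Hangup()
```

The NoOp line is there so the actual state string shows up on the CLI while testing.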

So just to clarify, can I use DEVICE_STATE on an endpoint (not just on a peer)?

I tried this on a working endpoint defined in my pjsip.conf file and got “INVALID”, but when I tried it with a peer that pointed to a nonsense IP I got “UNREACHABLE”.

It can be used on any device in the system. Without specifics (Asterisk version, configuration, what pjsip show contacts showed for it) I can’t say why one was INVALID, but this functionality is what powers BLFs and extension monitoring, so it is generally well maintained and checked to make sure it works as expected.

That being said, what people generally do is just try the call and then fail over. This is because there’s a potential race condition between your check and any subsequent action: in that window the remote may change from reachable to unreachable, or the reverse.

Okay makes sense, thank you!

So the issue I’m having is that once Dial is executed via the ‘broken primary’, +1xxxxxxxxx@brokenprimarypstn, Asterisk seems to crash, so I am unable to reach the failover. Is this behavior expected?

Define “crash”. What exactly happens? Does Asterisk itself stop working? Does the program actually crash? What version of Asterisk?

I get the following:

Disconnected from Asterisk server
Asterisk cleanly ending (0).
Executing last minute cleanups

I ran a core show version:
Asterisk certified/13.21-cert2

I have a bunch of handlers for the different DIALSTATUS values, but I am unable to handle the Dial failure because it just crashes Asterisk.

I am still new to Asterisk. What is an example of a failover?

Is this strictly Asterisk and no FreePBX? If so, in the future the Asterisk community forums[1] would be a better place.

As for your problem, that version of Asterisk is not supported, so it is entirely possible that whatever is causing the crash has been resolved in a currently supported version. I also don’t have a failover example handy, but I’m sure they exist on the internet.

[1] https://community.asterisk.org/
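
For anyone finding this thread later, the try-the-call-then-fail-over approach described earlier could be sketched roughly as follows. The context name, the extension pattern, and the endpoint names `primary-endpoint` and `backup-endpoint` are placeholders, and which DIALSTATUS values should trigger the backup is a choice to adapt to your setup.

```ini
; extensions.conf (sketch; context, pattern and endpoint names are assumptions)
[outbound-failover]
exten => _+1NXXNXXXXXX,1,Dial(PJSIP/${EXTEN}@primary-endpoint,30)
 ; Execution only continues here if the call on the primary was not answered
 same => n,NoOp(Primary DIALSTATUS: ${DIALSTATUS})
 same => n,GotoIf($["${DIALSTATUS}" = "CHANUNAVAIL" | "${DIALSTATUS}" = "CONGESTION"]?backup)
 same => n,Hangup()
 same => n(backup),Dial(PJSIP/${EXTEN}@backup-endpoint,30)
 same => n,Hangup()
```

Because the decision is based on the outcome of the actual dial attempt, this avoids the race between a separate status check and the call itself.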