BLF Warm Spare Failover not working

Hello Community,

I have a FreePBX Warm Spare setup running at two different locations. Each one has its own public IP.

The failover mechanism is an A record hostname with a TTL of 60 seconds.

Also, all phones re-register every 60 seconds.

So if I change the A record to point to the failover PBX, all phones and trunks are up within 1 minute and everything works except the BLF lights.

The BLF lights on all phones no longer show anything and are stuck.

After a reboot of all phones, the BLFs begin to work on the failover PBX.
I have no idea how I can automate this failover scenario with working hints.

I already tried dynamically generating hints, but this did not solve the problem.

We are using chan_sip.

Does anyone have an idea how to get this working, or know of another free failover solution which works across 2 different locations with 2 different IPs?

What phones are you using? Not all phones support this. I use Yealink now, used to use the Grandstream GXP2160, and I know that the BLFs on both of these work in the scenario you described.

You might be better served using prioritized SRV records in place of manipulating A records.
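
As a rough illustration of what prioritized SRV records mean for a client, here is a minimal Python sketch of the selection rule from RFC 2782: the lowest priority value is tried first, and a higher value is only used when nothing at the lower one is reachable. The hostnames, the `SrvRecord` type and the `reachable` set are hypothetical, and weights are ignored for simplicity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first (RFC 2782)
    weight: int    # load-sharing hint within one priority; ignored in this sketch
    port: int
    target: str

def pick_target(records, reachable):
    """Return the first reachable target in ascending priority order, or None."""
    for rec in sorted(records, key=lambda r: r.priority):
        if rec.target in reachable:
            return rec.target
    return None

# Hypothetical records for _sip._udp.pbx.example.com
records = [
    SrvRecord(priority=20, weight=0, port=5060, target="spare.example.com"),
    SrvRecord(priority=10, weight=0, port=5060, target="primary.example.com"),
]

both_up = pick_target(records, {"primary.example.com", "spare.example.com"})
primary_down = pick_target(records, {"spare.example.com"})
```

With both servers reachable the primary is chosen; the spare is only picked once the primary drops out of the reachable set.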

I use Yealink T46S, T56A and T58V, and none of them work in this scenario. That is, everything works perfectly except BLF until I reboot all phones.

I cannot do this because I need strict control over which PBX is used. A script changes the A record, disables my trunks on the primary and activates them on the failover machine when the primary is down or when I do some maintenance.

Also, most phones do not work out of the box with SRV records. Most phones use A records as the default setting.

I must be missing something then. If you prioritize, everything is handled by the highest-priority IP as long as it is there. Only when that IP disappears (“the server is down”, or “you do some maintenance and let DNS know you are”) will the “backup server” ever see traffic.

For my part, I prefer the A record failover. SRV records do not work with all phones and also need to be specially configured on every phone.
If I can solve the BLF problem I'm already really happy :heart_eyes::heart_eyes:

Well, good luck then.

Just as a point of record, the Yealink T46S, T56A and T58V are very happy with SRV records; you could even say they prefer them. Just change your transport type to DNS-NAPTR (one line in your provisioning setup) and you should be good to go.

A lot of phones only do their initial subscribe for hints on boot. Maybe the phone is caching the DNS resolution for the BLF buttons and not realizing the BLF IP has changed. Do a packet capture and verify the PBX is sending out the NOTIFYs for BLF. If it is, then you know it's a phone issue and you can take it up with Yealink.

Asterisk will only send NOTIFY packets to devices that have active subscriptions. You can get a list of active subscriptions from Asterisk with one of the following, depending on the channel driver (chan_sip or chan_pjsip):

sip show subscriptions
pjsip show subscriptions inbound
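
To check this from a script on the spare, something along these lines could work. It is only a sketch for chan_sip: it runs the command through `asterisk -rx` and reads the count from the summary line, which is assumed here to look like `3 active SIP subscriptions`. Verify that against the output of your own Asterisk version before relying on it.

```python
import re
import subprocess

# Assumed summary line, e.g. "3 active SIP subscriptions"; check your Asterisk version.
SUMMARY_RE = re.compile(r"^(\d+) active SIP subscriptions?\s*$", re.MULTILINE)

def count_subscriptions(cli_output):
    """Extract the subscription count from the CLI output, or None if not found."""
    match = SUMMARY_RE.search(cli_output)
    return int(match.group(1)) if match else None

def query_asterisk():
    """Run the chan_sip command through the Asterisk remote console."""
    result = subprocess.run(
        ["asterisk", "-rx", "sip show subscriptions"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the PBX itself, `count_subscriptions(query_asterisk())` returning 0 while peers are registered would point to phones that have re-registered but not re-subscribed.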

Thank you @lgaetz. I think I have found the problem with your help.

All peers are registered instantly, but there are no subscriptions on the backup machine.

Is there a way I can force a new subscription from the Asterisk side when a failover starts? I think the phones will subscribe again after 1800 seconds, as I can see on the primary machine, but that's a bit too long for this type of failover scenario.

Maybe you could have a trigger that checks whether the trunks are active on the spare. If they are, you could trigger a new subscription. You would probably need to add a few more pieces so it doesn't keep doing it, but that might work.
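
The "so it doesn't keep doing it" part can be sketched as an edge-triggered latch: act only when the trunk state on the spare changes from inactive to active. This is a minimal Python sketch, not a tested FreePBX integration. How the trunk state is polled is left out, and `notify_type` is a placeholder for whatever entry exists in your `sip_notify.conf`. Since Asterisk cannot make a phone re-subscribe by itself, the action shown is a `sip notify` to the phones (for example a check-sync style entry).

```python
import subprocess

class FailoverLatch:
    """Fires once each time the spare's trunks go from inactive to active."""

    def __init__(self):
        self.was_active = False

    def should_fire(self, trunks_active):
        fire = trunks_active and not self.was_active
        self.was_active = trunks_active
        return fire

def notify_phones(peers, notify_type):
    """Ask chan_sip to send a NOTIFY of the given sip_notify.conf type to each peer."""
    for peer in peers:
        subprocess.run(
            ["asterisk", "-rx", f"sip notify {notify_type} {peer}"], check=True
        )

latch = FailoverLatch()
# Simulated polls of the trunk state on the spare: down, up, up, down, up
fired = [latch.should_fire(state) for state in (False, True, True, False, True)]
```

In the simulated run the latch fires on the second and fifth poll only, so a cron job or loop that calls `notify_phones` when `should_fire` returns true would act once per failover rather than on every poll.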

Thanks, that's what I'm looking for. How can I trigger a new subscription to the phones from the server side via the CLI?
The only solution I found was setting the session timer in the Yealink and Snom config to a value like 60s.
But if the server can force a new subscription via the CLI, I can leave the setting on the phones at 3600s.

You can't have the PBX do it. The phone has to initiate it. You would have to get Yealink to support re-subscribing to all hints when the phone fails over.

I already suspected this… :cry:

The phone does not fail; my problem was the default setting of 3600s.

I tried it with Snom and Yealink:
If the phone just “re-registers”, it does not re-subscribe until the Session-Expire timer ends.

So when I switch to the failover PBX and wait about 1 hour, it should work.

I think I will override the session timers in the Yealink and Snom config to 60s.

Well, they should handle that better. It's crappy software design for them not to subscribe to all hints on failover. You are going to overload Asterisk if you resubscribe every 60 seconds. The Asterisk BLF system is not great to begin with, and that will only cause you more issues.
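
To put a rough number on that, here is a back-of-the-envelope calculation in Python. The 30 phones match the customer size mentioned in this thread; the 20 BLF keys per phone is purely an assumed figure. Each refresh SUBSCRIBE is also answered with a NOTIFY, and phones usually refresh somewhat before the timer expires, so the real message rate is higher than this.

```python
def subscribe_rate(phones, blf_keys_per_phone, expiry_seconds):
    """Average SUBSCRIBE refreshes per second the PBX has to answer."""
    return phones * blf_keys_per_phone / expiry_seconds

# Hypothetical site: 30 phones with 20 BLF keys each
rate_60 = subscribe_rate(30, 20, 60)       # 10.0 per second
rate_3600 = subscribe_rate(30, 20, 3600)   # about 0.17 per second
```

Dropping the timer from 3600s to 60s multiplies the subscription traffic by 60, and the load grows linearly with every additional phone or tenant on the same PBX.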

I just re-read your comments above. The phone is not dual registering but using an FQDN. Yeah, that won't work very well, as the phone doesn't know the PBX has changed as far as the BLFs are concerned. How are you going to make sure all phones use a DNS server that will honor a TTL as short as 60 seconds? Lots of DNS servers will ignore such a short TTL.

FreePBX HA, for instance, handles this because it syncs in real time: all the hints are synced, nothing gets missed, and on failover everything keeps working.

You are absolutely right :wink:

I have not tried it with Sangoma phones; maybe they handle this better. If so, I am not averse to using Sangoma S500 phones instead of the Yealink T46S for my next projects.

Thanks for the note that Asterisk will probably be overloaded.

I just thought about leaving things as they are and simply initiating a “check-sync” with the fwconsole epm command on the backup PBX. The phones would then reboot after they have re-registered on the failover PBX, and everything should work. I think that is okay for a failover scenario that I hopefully will never need. It is not good enough for a 24/7 customer requirement, but I don't accept such customers at the moment, as I do not have the resources for that.

I only sell cloud-hosted PBX systems with their own dedicated DSL line provided by our partner. So I can make sure everything works (short traceroutes, perfect ping, and the 60s TTL works). If the customer uses his own Internet access, then it is at his own risk.

I don't know if HA already supports multiple server locations without their own internal network (each server only has a public IP).
The normal customer with 30 extensions is not willing to pay the high price for HA. I know it is worth it, but the monthly fee for the customer would be about 6-10 times higher than without HA.

You delivering their internet doesn't solve DNS. Your phones, I am sure, are set up with DHCP, so they will get their DNS servers from their DHCP server.

The provider delivers the right modem, and this serves DHCP. The phones are not in the customer's primary company network, so the provider has full control of DNS. This works because he also offers SIP trunks and his own PBX solution, and he cares that this works. I only use their trunks and DSL lines, though; I prefer PBXact and FreePBX.