Fwconsole reload from Web GUI

FreePBX 13. For a remote site (50-year-old copper unable to provide acceptable Internet), I have a WiFi link to a building across an airfield that provides our main Internet service. It is (has been) astoundingly reliable and consistent (and affordable). But every once in a while, it will drop, then come back a few seconds later.

To protect against drops for business-critical data (including VOIP), a router, pre-WiFi, will fail-over to (expensive) LTE data, and then fail-back when the primary link has been recovered.

During a failover, VOIP continues. FreePBX reaches out to the VOIP supplier through the LTE IP address, both sides reconnect, and life continues.

After a fail-back, and depending on timing, we reconnect to the VOIP supplier, and we can make outgoing calls. But the VOIP supplier cannot reconnect to us, so we miss incoming calls. I suspect this blockage is a combination of timing, open ports, and firewalls closing ports which are perceived to be idle.

However, if I SSH in and issue “fwconsole reload” then two-way communication is re-established.

The Question (finally): Is there a way, through the web GUI, to do a Reload (other than an Apply Config, after updating modules)? I have admin staff at this site who can handle a Web interface and can trigger a reload, but who are not going to be able to SSH to a server and use a command-line interface.

Jim

the red reload button in the top right corner

For unstable ssh use screen
https://www.saltycrane.com/blog/2012/10/how-start-long-running-process-screen-and-detach-it/

You need to perform an action (such as submitting a page without changes) that triggers the ‘apply config’ button to appear. Clicking the apply config button is the GUI equivalent of doing fwconsole reload.

fwconsole reload also reloads Asterisk, which probably starts the registration again…

IMO: You should focus on getting the SIP Trunk configured properly (maybe it’s a ChanSIP limitation? Then convert to PJSIP)

But if that won’t work now, you can also implement a custom dialplan which reloads asterisk, so the users can do it themselves.

The SIP trunk is configured fine - the problem is a monster-mash of having double-gateway, double-NAT, double-firewall, inbound ports correctly opening because of outbound connections, with a layer of switching networks, then switching back after a long-enough period that a firewall closed an idle port.

The solution is to stop the WiFi link from failing. One way to handle this is to set some firewall rules to implement (secure) port forwarding on the Gateway, so the VOIP provider can initiate a connection.

But in the meantime, reload.

That said, I’m still on chan_sip. No pain, so no incentive to change to pjsip. So far.

I have a FreePBX 15 server spun up, but have not yet migrated. Same configuration (SIP, firewalls, etc). I do notice that it maintains the connection at times when FreePBX 13 does not. But it is not in production, so it has no inbound activity.

Interesting problem. I can think of four general approaches:

  1. Find out what is going wrong with the Wi-Fi and fix it. An airplane blocking the beam for a second or two should not cause a drop, just a few lost packets. What bridges are you using? External antennas, if any? Signal strength?

  2. Given the Wi-Fi drop, find out why the VoIP connection doesn’t recover and fix that.

  3. Write a simple continuously running script that issues an fwconsole reload whenever it detects trouble. If detecting the actual problem proves difficult, issue an fwconsole reload shortly after every public IP address change.

  4. Move the PBX to the cloud, where it will have a solid connection to the trunking provider. Use a site-to-site VPN to connect the extensions. The keep-alive mechanisms in the VPN should reestablish connection quickly on failover in either direction. If extensions become unreachable in spite of this, calls can be automatically forwarded to mobile phones.
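Approach 3 above could look roughly like the sketch below, which polls the public IP address and runs `fwconsole reload` a short while after it changes. Everything except the `fwconsole reload` command itself is an assumption for illustration: the `https://api.ipify.org` lookup service, the function names, and the 10 and 30 second timings are not from this thread. It also assumes the script runs as a user allowed to run `fwconsole`.

```python
import subprocess
import time
import urllib.request
from typing import Callable, Optional


def fetch_public_ip(url: str = "https://api.ipify.org") -> Optional[str]:
    """Return the current public IP as text, or None if the lookup fails.

    The lookup URL is an assumed example; any service that returns the
    caller's address as plain text would do.
    """
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.read().decode().strip()
    except OSError:
        # Covers URLError and socket timeouts while the link is down.
        return None


def reload_pbx() -> None:
    """Run the same command that is otherwise typed by hand over SSH."""
    subprocess.run(["fwconsole", "reload"], check=False)


def watch(
    fetch: Callable[[], Optional[str]],
    reload: Callable[[], None],
    settle_seconds: float = 30,   # hypothetical wait for the link to settle
    poll_seconds: float = 10,     # hypothetical polling interval
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the public IP and reload after every change. Returns reload count."""
    last_ip: Optional[str] = None
    polls = 0
    reloads = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        ip = fetch()
        if ip is not None:
            # Ignore the very first reading; only a change triggers a reload.
            if last_ip is not None and ip != last_ip:
                sleep(settle_seconds)
                reload()
                reloads += 1
            last_ip = ip
        sleep(poll_seconds)
    return reloads
```

To run it continuously, call `watch(fetch_public_ip, reload_pbx)` from a service or a screen/tmux session. Failed lookups are skipped rather than treated as a change, so a brief total outage does not by itself cause a reload.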

A few questions: How does the VoSP know where to send calls (registration, IP auth, some combination)? Is there explicit failover configured at the ISP (try A, if that fails try B)?

If using registration, what is the expiry? If long (20 minutes or more), does the provider allow shorter? With most providers, each successful registration overwrites any previous address of record. However, some will allow two or more unexpired AORs. When this is the case, you will see a 200 OK response to REGISTER with two or more Contact headers. Does your main connection have a static IP address? Is the LTE connection ‘up’ all the time? Does it have a static IP?
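To check for multiple bindings in a captured response, something like the following rough sketch could count the Contact entries in a 200 OK to REGISTER. The sample message is made up for illustration (documentation-range addresses, hypothetical user and expiry values), and the parsing is deliberately naive: it splits on commas and ignores folded header lines.

```python
from typing import List

# Hypothetical 200 OK to REGISTER, trimmed to the headers that matter here.
SAMPLE_RESPONSE = (
    "SIP/2.0 200 OK\r\n"
    "CSeq: 102 REGISTER\r\n"
    "Contact: <sip:user@203.0.113.10:5060>;expires=120\r\n"
    "Contact: <sip:user@198.51.100.20:5060>;expires=45\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def contact_bindings(sip_response: str) -> List[str]:
    """Return every Contact value found in the headers of a SIP response."""
    header_block = sip_response.split("\r\n\r\n", 1)[0]
    contacts: List[str] = []
    # Skip the status line, then look at each header line in turn.
    for line in header_block.splitlines()[1:]:
        name, sep, value = line.partition(":")
        # "m" is the compact form of the Contact header name.
        if sep and name.strip().lower() in ("contact", "m"):
            # Naive split: one header line may carry several bindings.
            contacts.extend(p.strip() for p in value.split(",") if p.strip())
    return contacts


bindings = contact_bindings(SAMPLE_RESPONSE)
print(len(bindings), "binding(s) on record")
```

Two or more bindings after a fail-back would suggest the provider still holds the LTE address alongside the main one.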

What does the Asterisk log show when the problem occurs (unreachable, registration timeout or failure, etc.)?

Have you tried any packet captures at the PBX or elsewhere in your network to troubleshoot?

What other debugging tools have you tried? What else did you observe that may be relevant?


Or tmux, our favorite, which is effectively screen on steroids:

Is there any way to add fwconsole restart from the GUI? It’s nice to avoid a complete reboot with this softer option.


No, but you can do an Asterisk restart, which would probably be the most common thing you’re trying to accomplish with an fwconsole restart. Browse to Admin -> Asterisk CLI and run either:

core restart now
core restart when convenient

trunkalert.agi might be helpful in this scenario - change the action from send email to fwconsole reload

Nope. trunkalert.agi provides notification of failed outgoing calls. The OP’s issue is affecting only incoming calls.

This advanced setting lets you do Reload any time you want:


First, thank you for the outpouring of comments and suggestions. Synopsis so far:

  • Outbound calls continue without problem - FreePBX has no trouble establishing an outbound connection regardless of which gateway I am using at the moment,
  • Solving the dropped WiFi connections is the obvious best solution. The connection is tested every 30 seconds by pinging Google (8.8.8.8) and CloudFlare (1.1.1.1). If both pings fail three consecutive times, the gateway concludes the link is down, and switches to LTE.
  • After switching, testing of the primary link continues. Most often, 5 seconds after switching, the connection switches back - so, after an outage of 95 seconds. I’ve changed the ping period to once every 10 seconds, to keep the total duration under 60 seconds, so that the firewall may no longer close ports it perceives as idle,
  • I am exploring the option of just leaving FreePBX on the LTE connection. LTE data is expensive, but the LTE link has never dropped, and our call volumes (thus our data traffic) are low,
  • I am exploring configuring the gateways/firewalls so inbound connections (IP/Protocol/Port) are port-forwarded to FreePBX. With this configuration, the inbound port is never closed to traffic. Proper security configuration is essential,
  • We are on chan_sip. I should flip over to pjsip, but only one change at a time,
  • The DEVRELOAD option is a great resource. Leave the button up, and someone other than me can reload when required using the GUI, rather than SSH and command line and havoc. This removes the requirement that I be available on demand.
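The outage arithmetic in the synopsis above can be sketched as follows: three consecutive failed pings at the configured interval, plus the typical 5 second delay before fail-back. The function name is made up for illustration, and the model is a simplification that follows the figures quoted above rather than the gateway's actual logic.

```python
def outage_seconds(ping_interval_s: int, failures_needed: int, failback_delay_s: int) -> int:
    """Approximate time spent off the primary link for one short drop.

    Detection takes `failures_needed` consecutive failed pings, one per
    `ping_interval_s`; fail-back follows `failback_delay_s` later.
    """
    return ping_interval_s * failures_needed + failback_delay_s


# Original settings: pings every 30 s, three failures, fail-back after ~5 s.
print(outage_seconds(30, 3, 5))  # 95

# Revised settings: pings every 10 s, same failure count and fail-back delay.
print(outage_seconds(10, 3, 5))  # 35
```

With the 10 second ping period the total comes to about 35 seconds, which is inside the 60 second target mentioned above.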

To answer questions:

  • The WiFi link is about 1,800 feet. I’m running it LOS, on 5 GHz, using outdoor flat-panel directional antennas. The APs are consumer-level (TP-Link), and ran for 20 months before the first glitch appeared - they still function fabulously. The connection protocol is WDS. Throughput is blazing - at time of installation, iPerf showed the WiFi link running at multiple hundreds of megabits per second. Our external Internet speed seems to be better than 50 Mbps end-to-end, once all the various gateways and firewalls and ISP are traversed.
  • I have a suspicion that the antenna at the Internet end is not fully functional - probably water incursion. But I’ll need to rent a man-lift to get to it.
  • Static IP addresses all around.

Thanks again to everyone!

Jim

A possible solution to your LOS network problem might be to upgrade your wireless terminals to something a little less consumer based. I’ve used this vendor’s equipment for long-haul (several mile) PTP connections and never had a unit fail. My oldest station is about 15 years old and should probably be replaced at some point.

http://www.tranzeo.com/


I think fixing the wireless link is the best solution. I’d try a Ubiquiti PTP wireless AP, and then monitor through UNMS to really see what’s happening. In these situations I find it’s better to get the internet connection stable than to struggle to find ways to overcome internet dropouts.

i.e. fix the core of the issue rather than find workarounds.

Use an outdoor TPLink AP that is integrated into the antenna instead of a separate antenna and AP. TPLink has a number of devices like that and I have never had trouble with them. Expensive, though.

Remember that tplink, like many vendors, has a “consumer-model” and a “business-model”. Ignore the people dissing tplink here; they have never used the tplink business gear.

I suspect that the problem is the antenna on the other side of the airfield - I had a problem last year and spent quite a bit of time debugging it - swapping antennas (that I could reach), swapping APs, etc, and the only thing resolving anything was to switch the direction of the WDS connection!!!

The only thing not swapped was the one antenna.

Ubiquiti makes great hardware. Well beyond budget (you can see what is happening to the airline industry right now - that trickles down to airports and flying schools). And a Ubiquiti with a bad antenna performs just as poorly as a TP-Link with a bad antenna.

I’m using a couple of TP-Link integrated bridges for shorter distances (one for a 150 foot link, one for a link of 2,000 feet). My experience has been good - but not as solid as I would like.

Tranzeo looks very interesting.

Tes: I agree, TP-Link is surprisingly good. I’ve had more headaches with Cisco gear than I can count. Netgear is also surprisingly good.

Again, thank you to everyone for your thoughts.

From multiple systems when I attempt to run this command via the GUI I see the following error:

fclose() expects parameter 1 to be resource, boolean given
File:/var/www/html/admin/libraries/php-asmanager.php:495

I see the same, but it appears to be only cosmetic; Asterisk actually does restart. Open a ticket with the error message if you wish.


Fair enough. As long as asterisk restarts I’m good. Thank you!