System keeps changing access ports. Reboot fixes it

I have a server that randomly seems to change/shut down access ports. I can access the system via HTTP and SSH, and then suddenly SSH is gone, and when I try to access HTTP I see:

Forbidden

You don’t have permission to access / on this server.

I physically reboot the server and everything is good. Standard HTTP and SSH access is restored. When I look at port management in System Admin, everything is good. Any ideas why this keeps happening?

This sounds like fail2ban blocking you. Something your IP is doing is triggering fail2ban.

I have actually run into this sporadically over the last 5 to 10 years of deploying FreePBX and was never able to narrow it down to something specific.

This has only happened to PBXs that are installed on bare metal (typically something like an Intel NUC) and on-prem with the client. Access to any of the typical management ports (web GUI or SSH) doesn’t work, even from the local network. Phones are unable to connect to Asterisk either. These systems are typically set up in a space where it’s not convenient to keep a monitor connected, so we can’t have the staff tell us what they see on the terminal.

Troubleshooting is further complicated by the fact that phones are time-sensitive in most of the businesses we support, so we simply ask them to power cycle the unit. Since these take about 30 seconds to boot, phones are back almost immediately.

What’s more, this happens RARELY on the same unit; some go a year or two without a restart before it happens to them again. We only noticed the pattern above because we have many deployed like this and a different unit does it every so often, so we can’t really be ready with a monitor at the same client when it happens next.

Looking through logs over the years also hasn’t surfaced a pattern or a cause.

My hope, just like the long standing cron bug with cert renewals and backups, is that eventually this issue will just fade away as FreePBX moves away from CentOS and outdated fail2ban instances.

TLDR: Have seen this same symptom pretty consistently over the years and we were never able to figure out the cause.

Sounds pretty close to my issue, except it happens on a particular system every couple of weeks. Local HTTP/SSH does not work and local phones cannot register. Perhaps fail2ban is blocking local traffic also?

Thanks, James. I’ll give that a shot. I’ve never experienced this before and, as in Igor’s response, when it happens local access to the system is shut down. Phones don’t register, SSH won’t respond, and HTTP shows the Forbidden message, which I typically only get when I change the admin port and Let’s Encrypt is using port 80.

Yea, it’s not a port change, it’s just that the system wholesale stops accepting any network requests even though local IPs are whitelisted in multiple spots within FreePBX and everything works just fine after a reboot.

If you have a device that this happens to often enough would you mind connecting a monitor to it and see if there are any messages that show up on the local terminal that would help us troubleshoot the issue when this happens?

As mentioned before I haven’t had luck parsing logs after an event either so coming by an explanation has been tricky.
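Since a monitor is rarely attached when it happens, one option is to have the box record its own state on a schedule (from cron, say) and read the last file written before the reboot. A minimal Python sketch of that idea follows. The function name, the output directory and the command list are my own assumptions, not anything FreePBX ships, and old snapshots would need pruning so the disk doesn’t fill up.

```python
import datetime
import pathlib
import subprocess
from typing import Sequence


def save_snapshot(commands: Sequence[Sequence[str]], out_dir: str) -> pathlib.Path:
    """Run each command and write its output to one timestamped file in out_dir."""
    directory = pathlib.Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    path = directory / f"snapshot-{stamp}.txt"
    with path.open("w") as fh:
        for cmd in commands:
            fh.write(f"$ {' '.join(cmd)}\n")
            try:
                result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
                fh.write(result.stdout + result.stderr + "\n")
            except (OSError, subprocess.TimeoutExpired) as exc:
                # A missing or hung tool should not stop the remaining commands.
                fh.write(f"[failed: {exc}]\n\n")
    return path


# Assumed command set (needs root); adjust to what is installed on the PBX.
DIAGNOSTICS = [
    ["iptables", "-S"],             # current firewall rules
    ["fail2ban-client", "status"],  # list of active jails
    ["ss", "-tln"],                 # listening TCP sockets
    ["dmesg"],                      # kernel messages
]
```

Calling `save_snapshot(DIAGNOSTICS, "/var/log/pbx-snapshots")` (a hypothetical path) every few minutes would leave a trail of firewall rules and listening sockets to compare against a known-good snapshot.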

I’ll give that a shot. What confuses me is that when I attempt to browse to the admin HTTP port I receive the Forbidden message, which tells me traffic is getting to the phone system but I’m trying to connect on what it thinks is the wrong port. I usually see that when I move 80 to Let’s Encrypt, which is not the case on this system.

Yea, I’ve seen that same exact message, just a fully white background with a bold Forbidden message. I can’t remember whether I ever determined if that comes back from the PBX or from the browser when it doesn’t receive a correct HTTP response on the HTTP or HTTPS port. We change our HTTPS GUI ports to something completely non-default and still receive the same error message on those custom ports.
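One way to settle the PBX-or-browser question is to make the request outside a browser and look at the raw status code and `Server` header. A page a browser invents because the connection failed has no HTTP status at all, whereas a 403 with a `Server` header means some web server answered. The wording quoted above also matches Apache’s stock 403 page, as far as I can tell. A minimal Python sketch, assuming plain HTTP; the function name is my own, and for an HTTPS GUI port you would swap in `http.client.HTTPSConnection` (probably with certificate checks relaxed for a self-signed cert):

```python
import http.client
from typing import Optional, Tuple


def fetch_status(host: str, port: int, timeout: float = 5.0) -> Tuple[int, Optional[str]]:
    """Return (HTTP status code, Server header) for a plain-HTTP GET of /."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Server")
    finally:
        conn.close()
```

If `fetch_status("192.0.2.10", 80)` (placeholder address) comes back as 403 with an Apache `Server` header, the page is coming from a web server and not from the browser, although a proxy or another device in the path could in principle be the one answering.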

Try using telnet to the web GUI port that you typically connect to and see what you get. If a firewall is silently dropping the connection, it’ll simply time out; if the connection is actively refused, telnet reports that right away. If it’s not blocked, it will allow you to connect and return something like:

Trying < some IP address >...
Connected to < FQDN or IP >.
Escape character is '^]'.
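For anyone who would rather script that check than run telnet by hand, here is a minimal Python sketch that makes the same TCP connection attempt and tells the three outcomes apart: connected, actively refused, or no answer at all. The function name and the 3-second default timeout are my own choices.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # Something answered with a reset: closed port or a REJECT rule.
        return "refused"
    except socket.timeout:
        # Nothing answered: typical of a firewall that drops packets.
        return "timeout"
    except OSError as exc:
        # Name resolution failure, no route to host, and similar.
        return f"error: {exc}"
    finally:
        sock.close()
```

Running `probe_tcp` against the SSH and web GUI ports from a machine on the same LAN during an incident would show whether the ports are being dropped, refused, or are in fact still accepting connections.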

If it was the firewall or fail2ban that was suddenly blocking all inbound connections, and that page was coming back from Apache on the PBX, then I am not sure how the connection got past the firewall for Apache to respond with the error message.

If it’s happening that frequently, a brute-force test would be to lock down access from the internet as tightly as possible and disable fail2ban for a period to see if the behavior stops. If the problems go away, then you have a pretty good indication that fail2ban is causing the issue. Then it just becomes a question of why.
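A less drastic first step, if you can get a local shell during an incident, is to check whether the affected addresses are actually sitting in a jail. `fail2ban-client status` lists the jails and `fail2ban-client status <jail>` prints a "Banned IP list:" line for one of them. The Python sketch below pulls the addresses out of that output. The helper names are mine, the jail name in the usage note is only an example, and the output layout may differ between fail2ban versions.

```python
import re
import subprocess
from typing import List


def banned_ips(status_text: str) -> List[str]:
    """Extract addresses from the 'Banned IP list:' line of a jail status report."""
    match = re.search(r"Banned IP list:[ \t]*(.*)", status_text)
    return match.group(1).split() if match else []


def jail_status(jail: str) -> str:
    """Run `fail2ban-client status <jail>` (needs root) and return its text output."""
    result = subprocess.run(
        ["fail2ban-client", "status", jail],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `"192.168.1.50" in banned_ips(jail_status("ssh-iptables"))` would tell you whether that (made-up) LAN address is banned in that (assumed) jail. If the list is empty while the box is unreachable, fail2ban is probably not the culprit.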

As an aside, we’ve been using APIBan for a few months now on all of our systems and have seen a dramatic reduction in fail2ban hits. Easy to implement and seems very effective.

Honestly though, even if I nailed it down and knew 100% that fail2ban is doing it, there would be nothing that could be done anyway, as we are not really interested in changing the packages that get natively installed with the Sangoma distribution.

FreePBX 17 is slated to go on Debian so the hope is that these weird rare bugs will just disappear once the distribution is more in line with what’s currently available.