Urgent Help Required (fail2ban failure)

I have just installed a new FPBX 15 system.

PBX Version:15.0.17.12
PBX Distro: 12.7.8-2012-1.sng7
Asterisk Version: 16.15.1

When first booted fail2ban works fine. System is behind a router so I haven’t been using the firewall. I have fail2ban set to ban after 4 failed logons and this has protected my system fine.

It logs and bans quite a few IPs; however, after about 10 hours fail2ban stops adding banned IP addresses to the firewall, and the constant login attempts start to become a DoS attack. Restarting the fail2ban service makes no difference; only a complete reboot seems to help.

I compared the fail2ban log files and settings with a sister system I have which is working fine, and I can’t see any difference.

Has anyone had similar issues, or could anyone give me a clue where to look next?

I don’t have specific guidance, but I’d check (and yes these are obvious, but it never hurts to go over the obvious):

  • is the fail2ban-server python process still running (has it crashed, what’s its state?)
  • do the entries in iptables match what /var/log/fail2ban.log has (i.e. iptables -L fail2ban-SIP -n)
  • is the volume of entries becoming absurd
  • is memory consumption high / available memory low
  • is disk i/o or free space becoming an issue?
  • are the log files fail2ban is consuming being updated?
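
For the iptables-versus-log comparison in that list, here is a minimal sketch of how the two could be cross-checked. It only parses text you paste in; the sample log lines and iptables output below are made up for illustration, and the exact line layouts are an assumption based on the `Ban <IP>` entries and the `iptables -L fail2ban-SIP -n` command mentioned in this thread. The function names are hypothetical.

```python
import re

# Matches the "... [jail] Ban 1.2.3.4" / "... [jail] Unban 1.2.3.4" part of a log line.
BAN_RE = re.compile(r"\]\s+(Ban|Unban)\s+(\d{1,3}(?:\.\d{1,3}){3})\b")
# Matches a DROP/REJECT rule row from "iptables -L <chain> -n" and captures the source IP.
RULE_RE = re.compile(r"^(?:DROP|REJECT)\s+\S+\s+--\s+(\d{1,3}(?:\.\d{1,3}){3})\s")


def banned_in_log(log_text):
    """Replay Ban/Unban entries in order and return the IPs still banned."""
    banned = set()
    for line in log_text.splitlines():
        match = BAN_RE.search(line)
        if not match:
            continue
        action, ip = match.groups()
        if action == "Ban":
            banned.add(ip)
        else:
            banned.discard(ip)
    return banned


def banned_in_iptables(iptables_output):
    """Return the source IPs that have a DROP or REJECT rule in the chain listing."""
    blocked = set()
    for line in iptables_output.splitlines():
        match = RULE_RE.match(line)
        if match:
            blocked.add(match.group(1))
    return blocked


# Invented sample data (documentation-range IPs), not taken from a real system.
SAMPLE_LOG = """\
2021-01-05 04:10:01,100 fail2ban.actions: WARNING [asterisk-iptables] Ban 203.0.113.7
2021-01-05 04:40:01,100 fail2ban.actions: WARNING [asterisk-iptables] Ban 198.51.100.23
2021-01-05 05:10:01,100 fail2ban.actions: WARNING [asterisk-iptables] Unban 203.0.113.7
"""

SAMPLE_IPTABLES = """\
Chain fail2ban-SIP (1 references)
target     prot opt source               destination
REJECT     all  --  198.51.100.23        0.0.0.0/0            reject-with icmp-port-unreachable
RETURN     all  --  0.0.0.0/0            0.0.0.0/0
"""

in_log = banned_in_log(SAMPLE_LOG)
in_firewall = banned_in_iptables(SAMPLE_IPTABLES)
print("logged as banned but missing from iptables:", sorted(in_log - in_firewall))
print("in iptables but not in the log:", sorted(in_firewall - in_log))
```

If the first set is non-empty, the log claims bans that never reached the firewall; if both are empty but attacks continue, the log itself has probably stopped being written.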

Assuming that you are referring to the new integration between fail2ban and the Sangoma firewall, then that might be a bug (or possibly it just has a display limit?).

Anyhow, you should look into why you are getting hammered with requests. Are these SIP, SSH or Apache auth failures?

If SIP, you may want to install API BAN.

Either way, you should be considering not exposing your PBX if you are behind a firewall…

Thanks for the reply

is the fail2ban-server python process still running (has it crashed, what’s its state?)

I assume this is the standard fail2ban service, in which case yes, it is. It shows as running both in the FPBX Dashboard and when I run systemctl, and both show the correct state when I stop and start the fail2ban service.

do the entries in iptables match what /var/log/fail2ban.log has (i.e. iptables -L fail2ban-SIP -n)

Yes, up to the point where fail2ban stops working, these match. Once fail2ban stops working, no IP addresses are logged in /var/log/fail2ban.log.

is the volume of entries becoming absurd

While it’s running and working, the entries look as I would expect: an odd one here and there.

is memory consumption high / available memory low

No, all looks good here.

is disk i/o or free space becoming an issue?

No, less than 5% disk space used.

are the log files fail2ban is consuming being updated?

Looking at this now.

Anyhow, you should look into why you are getting hammered with requests. Are these SIP, SSH or Apache auth failures?

They are SIP requests to register against an extension. I have fail2ban set to ban after 4 failed attempts; because of a time lag in entering the IP into the firewall tables, I sometimes see around 5 to 10 log entries where an attempt has been refused, usually because the extension does not exist. Occasionally they hit on the correct extension but the wrong password. Once fail2ban kicks in and the IP is banned, the logon attempts stop in the Asterisk log files. When I am having this fail2ban failure, the IP is not banned and the logon attempts just keep coming until system performance is compromised.

Either way, you should be considering not exposing your PBX if you are behind a firewall…

The only ports I have exposed are 5060 for PJSIP signalling and 5062 for Chan_Sip signalling (I have PBX to PBX trunks) and the RTP Port Ranges.

All the requests come in on 5060, so I have always assumed that the scammers scan for port 5060 and, when they get a positive result, attempt some registrations. fail2ban cuts in at 4 attempts, and within a second the IP is banned and all is well. It’s the failure to ban the IP that leads to the continuing registrations and, in effect, a DoS attack.

Have you thought of not using udp/5060, or perhaps using tcp instead? Even better, try tls/5061. The calls to udp/5060 will probably never stop; banning the ones you catch is like herding cats, it doesn’t work.

Having watched a failed attack / successful block, this is the log file flow as I can see it:

Failed logon recorded in /var/log/asterisk/full

Then the IP address of that failed login is transferred to /var/log/asterisk/fail2ban (by FreePBX?)

Logged in /var/log/fail2ban.log as WARNING [asterisk-iptables] Ban (relevant IP address)

and the IP address (relevant IP address) has been blocked as it should.

I will now have to wait until I get a failure again and see if the log trail breaks. If anyone can help me understand what generates /var/log/asterisk/fail2ban, I would appreciate it.

The file named in the ‘log file’ settings in FPBX is parsed by the F2B asterisk jail’s regexes for any matches; hosts over the limit are sent to iptables and the F2B log. Other jails work similarly if enabled.

FreePBX generates the Asterisk config for Asterisk to create this log. Its only purpose is for logging Asterisk security events for fail2ban to monitor.
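
To make the regex-and-limit idea concrete, here is a rough sketch of the matching and counting step. This is not fail2ban’s own code: the pattern is a simplified, hypothetical stand-in for a jail failregex, the sample log lines are invented, and the time window (findtime) that a real jail applies is left out. The limit of 4 comes from the setting described earlier in this thread.

```python
import re
from collections import defaultdict

# Hypothetical, simplified stand-in for a jail failregex: captures the offending host.
FAIL_RE = re.compile(r"failed for '(?P<host>\d{1,3}(?:\.\d{1,3}){3})")

MAX_RETRY = 4  # ban after 4 failed attempts, as configured in this thread


def hosts_to_ban(lines, max_retry=MAX_RETRY):
    """Return hosts in the order they reach max_retry matching lines.

    Simplification: every match counts, with no time window and no unban.
    """
    failures = defaultdict(int)
    to_ban = []
    for line in lines:
        match = FAIL_RE.search(line)
        if not match:
            continue
        host = match.group("host")
        failures[host] += 1
        if failures[host] == max_retry:
            to_ban.append(host)  # a real jail would now run its ban action
    return to_ban


# Invented log lines for illustration only.
SAMPLE_LINES = [
    "Registration from '<sip:100@pbx>' failed for '203.0.113.7' - No matching peer found",
    "Registration from '<sip:101@pbx>' failed for '203.0.113.7' - No matching peer found",
    "Registration from '<sip:200@pbx>' failed for '198.51.100.23' - Wrong password",
    "Registration from '<sip:102@pbx>' failed for '203.0.113.7' - No matching peer found",
    "Some unrelated log line",
    "Registration from '<sip:103@pbx>' failed for '203.0.113.7' - No matching peer found",
]

print(hosts_to_ban(SAMPLE_LINES))
```

Because the ban only fires after the matching lines have been read, any delay in reading the file shows up as extra failed attempts getting through before the block lands.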

So this morning I came to understand a little more about this issue.

At 5.45 fail2ban had failed again and was no longer adding IPs to the firewall.

From the logs I was able to see this had started at 5.15. With a bit more hunting I found a cron job, which I had put in years ago and carried over into this build, that restarts the fail2ban service at 5.15.

So it would appear the failure point was the service restart at 5.15. After that point the /var/log/asterisk/fail2ban file starts growing quickly, and there are no more entries in /var/log/fail2ban.log or IPs added to the firewall.

What I am not sure about is whether the growth of /var/log/asterisk/fail2ban happens only as a result of the continuous logon attempts from hackers,

or whether the pause in the fail2ban service allows the file to start growing, and it’s the growing size of the file that causes the continued failure.

In order to get fail2ban working I deleted /var/log/asterisk/fail2ban and restarted the Asterisk service, which generated a new file. All seems to be working fine now, and I have removed the cron entry and will see how things go over the next few days.

It’s a bit worrying, though, if a fail2ban service restart can cause it to fail like that.

I just checked my own install and it seems it’s still running a very old version of fail2ban (0.8.14)

Could be during fail2ban restart you’re getting hit with this

I went through this on a different (non freepbx) system, where after getting smashed by a distributed attack causing many thousands of entries, fail2ban would take >30 minutes to unload/load.
If entries are added to /var/log/asterisk/fail2ban.log near to the speed that fail2ban is processing them during a restart, it may take an immense amount of time to complete, possibly indefinitely, ensuring even more logged entries occur.
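
As a back-of-the-envelope illustration of that point (my own arithmetic, with invented numbers, not measurements from any system): if the backlog is B lines, fail2ban reads p lines per second and new entries arrive at a lines per second, the catch-up time is roughly B / (p - a), and it never finishes once a reaches p.

```python
def catch_up_seconds(backlog_lines, process_rate, arrival_rate):
    """Estimate the seconds needed to work through a log backlog.

    backlog_lines: lines already waiting in the file
    process_rate:  lines read per second
    arrival_rate:  new lines written per second during the catch-up
    Returns None when the reader can never catch up.
    """
    if process_rate <= arrival_rate:
        return None
    return backlog_lines / (process_rate - arrival_rate)


# Invented figures purely to show the shape of the problem.
print(catch_up_seconds(100_000, 100, 0))    # quiet system
print(catch_up_seconds(100_000, 100, 90))   # under attack, barely keeping up
print(catch_up_seconds(100_000, 100, 100))  # arrivals match processing: never
```

The closer the arrival rate gets to the processing rate, the faster the catch-up time blows up, which fits the >30 minute reloads described above.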

F2B is not a complete firewall, just a set of rules inserted into iptables. As such you should not be starting/stopping it yourself; leave that to the ‘master’ firewall to add those rules at its discretion and with the right precedence.

You are seeing a flood of registration failures, and the bigger the file, the longer it takes to read when F2B starts. It would be better to ‘logrotate’ your files, as just deleting them might prevent asterisk from adding new log lines.

F2B supports various methods to ‘parse’ logfiles. It is in your interest to install pyinotify, as it is the fastest and will be used if found.
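
Here is a small sketch of why deleting the file can bite. It assumes a POSIX system (Linux), uses a throwaway temp directory rather than the real log path, and the open file handle merely stands in for a long-running writer such as Asterisk. It contrasts deleting the file with emptying it in place.

```python
import os
import tempfile


def delete_vs_truncate():
    """Show what a long-running writer sees after delete versus truncate."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "fail2ban")  # stand-in for the real log file

        # Case 1: the file is deleted while the writer still has it open.
        writer = open(path, "a")
        writer.write("old entry\n")
        writer.flush()
        os.remove(path)
        writer.write("entry after delete\n")  # succeeds, but goes to the unlinked file
        writer.flush()
        exists_after_delete = os.path.exists(path)
        writer.close()

        # Case 2: the file is emptied in place; the writer's handle stays valid.
        writer = open(path, "a")
        writer.write("old entry\n")
        writer.flush()
        os.truncate(path, 0)
        writer.write("entry after truncate\n")
        writer.flush()
        with open(path) as reader:
            content_after_truncate = reader.read()
        writer.close()

    return exists_after_delete, content_after_truncate


exists, content = delete_vs_truncate()
print("file visible again after delete + write:", exists)
print("file content after truncate + write:", repr(content))
```

In the delete case nothing reappears at the path until the writer reopens its log, which is why a restart of Asterisk was needed to get a new file.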

Thanks for the information, I suspect this may be the issue. Now I have the fail2ban service running correctly and without any expected interruptions, I will monitor it for a few days and see if it’s stable.

I have never used the “master” firewall in FPBX. I run my system behind a router with only port 5060 open. The only security issues I have ever had are hackers trying to pass calls through, which my setup always refuses for non-registered clients, and hackers attempting to register on my extensions, for which fully passworded extensions and the use of fail2ban have always worked great.

I do miss the ability to lock extension registration to the local LAN, which was possible with Chan_Sip but seems not to be under PJSIP.

Maybe I need to look at using the “master” firewall going forward as an additional line of defence.

It’s a sad world really where, within a couple of hours of putting a phone system online, you are hammered by strangers who feel they have the right to use and abuse it.

Why do you have 5060 open through the firewall if you only have local devices registering?

I have another two PBXs that trunk in to facilitate extension-to-extension calls, and just one extension number that can be logged on to remotely for use on my iPhone.

I have considered a VPN solution, but my general preference is for simplicity, which I just find a more elegant solution.

I do hope that PJSIP allows for a local LAN lockdown going forward as it develops; I did enjoy the added confidence that provided in Chan_Sip.

Restrict the forwarding at your router to only the approved IP addresses.

Or use a DDNS.

Also, AFAIK, PJSIP allows you to permit an IP or network, with the result that everything else is blocked.

The iPhone would be on a dynamic address, as are the PBXs, although they do have dynu.net lookups, so that might be an option.

Yes, but if you apply a filter (192.168.61.0/255.255.255.0) to PJSIP at an extension level then it prevents any client in that range registering to any extension on the PBX.
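
For anyone unsure what range a filter written as 192.168.61.0/255.255.255.0 actually covers, here is a quick check using Python’s standard ipaddress module. It only tests membership in the range; it does not model how PJSIP itself applies a permit or deny rule, and the test addresses are arbitrary examples.

```python
import ipaddress

# The filter from the post above, written in address/netmask form.
LAN_FILTER = ipaddress.ip_network("192.168.61.0/255.255.255.0")


def in_filter(address, network=LAN_FILTER):
    """Return True if the address falls inside the filter's range."""
    return ipaddress.ip_address(address) in network


print(LAN_FILTER)                  # the same range in CIDR notation
print(in_filter("192.168.61.25"))  # a host on that LAN
print(in_filter("203.0.113.7"))    # an outside address (documentation range)
```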