Tuning Fail2ban
Fail2ban is an awesome tool for shunning attackers that are attempting to brute force accounts on your system. Many folks rely on it to protect SSH and HTTPS authentication. By default, fail2ban uses ICMP port unreachables to ban malicious source IPs. I went down the rabbit hole to figure out if this was really the most effective option, or if you would be better served returning ICMP host unreachables or even quietly dropping the packets. I also looked at other ways you can optimize the response of fail2ban.
Test Setup
My test setup consisted of three Ubuntu 24.04 servers, all on the same network. Fail2ban was configured slightly differently on each system. When triggered, one returned the default ICMP port unreachables, one returned ICMP host unreachables, and the third just quietly dropped the traffic. My goal was to have the same source scan all three systems to see if I could elicit different responses. I chose to monitor SSH authentication, as this is arguably the most common use case for fail2ban. I also enabled password authentication, as SSH returns a slightly different error when a bad password is used than when only key authentication is supported. I stuck with making only one configuration change at a time on each system. I collected data for 10 days. While this is not a statistically valid sample size, I expect it’s sufficient to identify any clear trends.
What’s Wrong With Port Unreachables?
What prompted this testing was noticing that fail2ban returns an error code that attackers can easily spot and ignore. ICMP port unreachables are normally returned when you attempt to connect to a UDP port that has no service listening. However, fail2ban is predominantly used to protect TCP-based services. With TCP, closed ports return a TCP reset. So as an attacker, you could simply ignore all ICMP port unreachables when performing TCP-based brute force attacks. ICMP host unreachables, on the other hand, are universal. You can potentially receive these when using any IP transport if the target host is currently offline.
Some General Stats
Over the 10 days of testing, across all three systems, I detected a total of 5,180 brute force attempts from 531 unique IP addresses. Note that this number would have been far larger if fail2ban had not been triggering. Here are the top source IPs, with the total number of attempts for each:
    140 218.92.0.162
    138 218.92.0.166
    135 218.92.0.114
    132 218.92.0.217
    128 218.92.0.156
    125 218.92.0.198
    124 218.92.0.225
    121 218.92.0.221
Hummm. I’m noticing a pattern here. 😉
    whois -h whois.cymru.com " -v 218.92.0.154"
    AS   | IP           | BGP Prefix    | CC | Registry | Allocated  | AS Name
    4134 | 218.92.0.154 | 218.92.0.0/16 | CN | apnic    | 2001-06-28 | CHINANET-BACKBONE No.31,Jin-rong
I’ve been seeing this network beat up on SSH for months now. What’s interesting is that the attack is coordinated. When fail2ban shuns one IP, the next one steps up and tries different account/password combinations. So this is clearly not the work of a single attacker.
One of the first takeaways is that if you don’t do business in China, blocking access to your network from 218.92.0.0/16 is a quick win for reducing brute force attacks. I would expect that there are other malicious patterns originating from this network as well.
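Before blocking a whole prefix, it is worth confirming that it really covers the offenders. The sketch below uses Python's standard ipaddress module to check the source addresses listed above against 218.92.0.0/16. The helper name is_blocked is my own, and the check says nothing about how the block itself gets enforced on your firewall.

```python
import ipaddress

# The prefix reported by the whois lookup above.
BLOCKED = ipaddress.ip_network("218.92.0.0/16")

# Top source addresses taken from the list earlier in this article.
top_sources = [
    "218.92.0.162", "218.92.0.166", "218.92.0.114", "218.92.0.217",
    "218.92.0.156", "218.92.0.198", "218.92.0.225", "218.92.0.221",
]

def is_blocked(addr, network=BLOCKED):
    """Return True if addr falls inside the blocked prefix (hypothetical helper)."""
    return ipaddress.ip_address(addr) in network

covered = [ip for ip in top_sources if is_blocked(ip)]
print(f"{len(covered)} of {len(top_sources)} top sources fall inside {BLOCKED}")
```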
As far as which accounts are being brute forced, here’s the top 10:
    1679 root
     221 admin
      22 user
      10 pi
       6 ubuntu
       6 ubnt
       6 teste
       6 config
       5 loginuse
       4 hadoop
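I didn't describe how the tallies above were produced, so treat the following as one possible approach rather than the method used here. It is a minimal Python sketch that counts sshd "Failed password" messages per source IP and per username. The sample lines are made up and use documentation addresses, and the function name tally_failures is hypothetical.

```python
import re
from collections import Counter

# Matches the sshd messages logged for a rejected password, for example:
#   Failed password for root from 203.0.113.5 port 4242 ssh2
#   Failed password for invalid user admin from 203.0.113.5 port 4243 ssh2
FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)

def tally_failures(lines):
    """Return (per-IP counts, per-username counts) for failed password lines."""
    by_ip, by_user = Counter(), Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            by_ip[match["ip"]] += 1
            by_user[match["user"]] += 1
    return by_ip, by_user

# Made-up sample input, not data from the test described in this article.
sample = [
    "sshd[811]: Failed password for root from 203.0.113.5 port 4242 ssh2",
    "sshd[812]: Failed password for invalid user admin from 203.0.113.5 port 4243 ssh2",
    "sshd[813]: Failed password for root from 198.51.100.7 port 5151 ssh2",
    "sshd[814]: Accepted publickey for alice from 192.0.2.10 port 6000 ssh2",
]

by_ip, by_user = tally_failures(sample)
print(by_ip.most_common(10))
print(by_user.most_common(10))
```

In practice you would feed it the lines of your own authentication log instead of the sample list.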
The above results should help to stress the importance of protecting the root account.
Port vs Host vs Drop
Comparing ICMP port unreachables, ICMP host unreachables, and quietly dropping the attack sessions produced some interesting results. These are shown in the chart below. Note that I tracked the total number of connections, the number of unique source IPs, and the number of failed authentication attempts. While connection attempts are interesting, what really matters is how many times an attacker was able to try different login/password combinations. The greater the number of “Failed Authentications”, the more likely they will eventually discover the correct combo.
I have to admit, I was left scratching my head when comparing port unreachables to drops. When shunning with port unreachables, there were more connections but fewer failed authentication attempts than with drops. On the surface, this didn’t make a whole lot of sense until I dug into the details of the data. I think the answer lies in how the attacking system perceives the target network.
By default, fail2ban watches for failed authentication attempts within a rolling 10-minute window (findtime = 10m). The default is to look for five failed attempts (maxretry = 5). If this is detected, the source IP is shunned for 10 minutes (bantime = 10m).
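To make the interaction between findtime and maxretry concrete, here is a minimal Python sketch of a rolling-window check. It is a simplified model written for this article, not fail2ban's actual code, and the function name first_ban_time is hypothetical. Timestamps are in seconds.

```python
from collections import deque

def first_ban_time(failure_times, maxretry=5, findtime=600):
    """Return the timestamp at which a ban would trigger, or None if it never does."""
    window = deque()
    for now in sorted(failure_times):
        window.append(now)
        # Discard failures that have aged out of the rolling window.
        while window and now - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            return now
    return None

# Five failures in quick succession trip the default settings.
burst = [0, 2, 4, 6, 8]
print(first_ban_time(burst))  # 8

# One failure every three minutes never puts five inside a 10-minute window.
slow = [i * 180 for i in range(20)]
print(first_ban_time(slow))  # None
```

The second case is why the pace of an attack matters as much as its volume: a source that spaces out its guesses can stay under the threshold indefinitely.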
There is a misconception that quietly dropping traffic “stealths” the firewall. With TCP, this could not be further from the truth. An open port answers with a SYN/ACK, a closed port answers with a reset, and an offline host typically causes an upstream router to return an ICMP host unreachable. In fact, about the only time no response of any kind comes back to the transmitting system is when a firewall is dropping the traffic. So returning no response to a TCP packet is pretty much advertising there is a firewall in the way.
So here’s how this all played out in the above testing. With either ICMP error code, the attacking system would make five quick attempts, get shunned, and then start banging away again 10 minutes later. Neither error seemed to convince the attacking system that it should move on to another IP address. When fail2ban was configured to drop the traffic, the attacking system adapted and slowed down its attempts at password guessing. This would usually result in six to eight attempts per connection before fail2ban would trigger. The result is that the attacking system was able to make a greater number of password attempts per connection, before being shunned. This explains why the “Drop” column in the above chart is so skewed.
Fail2ban Tuning Recommendations
If you are looking to optimize fail2ban and ultimately reduce the chances of a brute force attack, here are some recommendations.
Use Public/Private Keys for SSH Authentication
The best way to ensure that your SSH accounts do not get brute forced is to simply stop using passwords. Enable public/private key authentication only. This solves the problem, regardless of whether you are using fail2ban. For HTTPS authentication, digital certificates can be used. For other services that may not support public/private key authentication, see if there is a way to add a second authentication factor.
One minor caveat when using keys with SSH and fail2ban. If you have multiple keys stored in SSH Agent, the keys will be attempted in sequential order. So given enough keys and a small enough “maxretry” value, it’s possible to lock yourself out even though passwords are not being used. Remember that fail2ban is monitoring all failed authentication events, not just password attempts. This also holds true for automated jobs that kick off at system startup. If the key(s) are not yet loaded into SSH Agent, some number of initial attempts will fail authentication.
Use Fail2ban
If exclusively using public/private key authentication is not an option, installing fail2ban will go a long way towards frustrating attackers that attempt to gain access to the system. Combine it with something like PAM’s cracklib to ensure strong passwords are being used.
Don’t Drop Fail2ban Failures
As discussed above, having fail2ban “drop” hostile source IPs can cause them to slow their authentication attempts, letting them try more login/password combos before each ban triggers and thus increasing the chances that they will crack an account. Stick with the default ICMP port unreachables, or use ICMP host unreachables.
Use bantime.* to Increase the Bantime
By default, fail2ban sets the “bantime” value to 10 minutes in /etc/fail2ban/jail.conf. Each time a ban is triggered, it will last for a total of 10 minutes. However, there are a number of values that can be set which will cause the ban time to increase with each subsequent triggering from a source IP address. Since users who forget their password usually only trigger one or two bans, this may be a useful way to assign a longer ban interval to would-be attackers.
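These settings only take effect once “bantime.increment” is set to true. With that enabled and the default formula in place, the ban time doubles with each repeat ban from the same source IP (1x, 2x, 4x, 8x, and so on). “bantime.factor” is a coefficient applied on top of that growth and defaults to 1. “bantime.multipliers” replaces the formula with an explicit series of multipliers applied to the base ban time, so with a base “bantime” of one minute a series like “1 5 30 60” reads directly as minutes. “bantime.maxtime” caps how long a ban can grow. If you want to get really fancy, “bantime.formula” lets you specify the exact math you wish to use to increase each subsequent ban.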
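The escalation can be sketched in a few lines of Python. This is an illustration modeled on the default formula quoted in the comments of the stock jail.conf, as I understand it, and not fail2ban's own code. The function name next_ban_seconds and the idea of passing in the number of prior bans are assumptions made for the example.

```python
def next_ban_seconds(bantime, prior_bans, factor=1, multipliers=None, maxtime=None):
    """Estimate the length of the next ban in seconds (illustrative model only)."""
    if multipliers:
        # Explicit series: reuse the last multiplier once the series runs out.
        index = min(prior_bans, len(multipliers) - 1)
        seconds = bantime * factor * multipliers[index]
    else:
        # Default-style growth: double for every prior ban, capped at 2**20.
        seconds = bantime * (1 << min(prior_bans, 20)) * factor
    if maxtime is not None:
        seconds = min(seconds, maxtime)
    return seconds

# Base ban of 10 minutes, doubling on every repeat offense.
print([next_ban_seconds(600, n) for n in range(5)])
# [600, 1200, 2400, 4800, 9600]

# Base ban of 1 minute with an explicit multiplier series.
series = [1, 5, 30, 60]
print([next_ban_seconds(60, n, multipliers=series) for n in range(4)])
# [60, 300, 1800, 3600]
```

Under this model a user who fat-fingers a password once sits out the base ban time, while a source that keeps coming back is locked out for rapidly growing intervals.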
Increase the Time IPs Are Monitored
The “maxretry” value sets how many failed logins are allowed before a ban is put in place. The default value of 5 is usually a reasonable choice. There is an associated value called “findtime” which identifies how long a source IP should be monitored for failed logins. This defaults to 10 minutes, but it may be beneficial to increase this value to something larger like 60 minutes. Since a user getting their password wrong tends to be a quick event, not something they drag out over an hour, increasing this value should only impact attackers.
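A bit of arithmetic shows why a longer findtime hurts patient attackers. In an idealized model, a source that never wants to trigger a ban can fit at most maxretry minus one failures into any one window. The Python sketch below turns that into a guesses-per-hour ceiling. The function name is hypothetical, and the real threshold behavior in fail2ban may differ slightly at the window boundaries.

```python
def max_unbanned_rate_per_hour(maxretry, findtime_seconds):
    """Upper bound on steady guesses per hour that never reach maxretry in one window."""
    return (maxretry - 1) * 3600 / findtime_seconds

for label, findtime in [("10m", 600), ("60m", 3600)]:
    rate = max_unbanned_rate_per_hour(5, findtime)
    print(f"findtime={label}: roughly {rate:.0f} guesses per hour stay under the threshold")
```

With maxretry left at 5, moving findtime from 10 minutes to 60 minutes cuts that ceiling from roughly 24 guesses per hour to roughly 4.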
Increase the “bantime” Value
As mentioned, fail2ban will shun a banned source IP for 10 minutes. In the testing I’ve performed, increasing this value to one hour or more seems to convince most attacking systems to go bother someone else. The only downside of increasing this value is that it will impact legitimate users that inadvertently get their password wrong. With this in mind, it may make more sense to use one of the above-mentioned multipliers to keep the first ban time short, but all subsequent ones much longer.
Conclusion
In an ideal world, we would all stop using passwords and replace them with public/private keys or similar technology. Unfortunately, most of us do not live in an ideal world. When you must use passwords, fail2ban can be an awesome tool for thwarting brute force password attacks. With just a little bit of tweaking, fail2ban can be even more effective.
Chris has been a leader in the IT and security industry for over 20 years. He’s a published author of multiple security books and the primary author of the Cloud Security Alliance’s online training material. As a Fellow Instructor, Chris developed and delivered multiple courses for the SANS Institute. As an alumnus of Y Combinator, Chris has assisted multiple startups, helping them to improve their product security through continuous development and identifying their product-market fit.