Rate Limit Bot Traffic Using Fail2ban
We have configured Fail2ban on SUSE Linux Enterprise Server to rate limit bot traffic. Below is the configuration we added to the jail.local file.
[apache-badbots]
enabled = true
port = http,https
filter = apache-badbots
action = iptables-allports[name=apache-badbots, port="http,https" protocol=tcp]
logpath = /var/persistent/apache2/logs/site1-access.log
findtime = 60
bantime = 600
maxretry = 1
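For readers unfamiliar with these three settings, here is a minimal Python sketch of how findtime, bantime and maxretry interact. It is a simplified model written for illustration, not Fail2ban's actual implementation; the Jail class and its method names are hypothetical.

```python
from collections import defaultdict, deque

FINDTIME = 60   # seconds: window in which failures are counted
BANTIME = 600   # seconds: how long a ban lasts
MAXRETRY = 1    # failures within FINDTIME that trigger a ban


class Jail:
    """Hypothetical, simplified model of one jail (not Fail2ban code)."""

    def __init__(self):
        self.failures = defaultdict(deque)  # host -> recent failure timestamps
        self.banned_until = {}              # host -> time at which the ban expires

    def is_banned(self, host, now):
        return self.banned_until.get(host, 0) > now

    def record_failure(self, host, now):
        """Register one failregex hit; return True if it triggers a new ban."""
        if self.is_banned(host, now):
            return False  # comparable to the "already banned" log message
        window = self.failures[host]
        window.append(now)
        # Drop failures that have fallen out of the findtime window.
        while window and window[0] <= now - FINDTIME:
            window.popleft()
        if len(window) >= MAXRETRY:
            self.banned_until[host] = now + BANTIME
            window.clear()
            return True
        return False


jail = Jail()
print(jail.record_failure("192.0.2.10", now=0))   # first hit already bans, since maxretry = 1
print(jail.is_banned("192.0.2.10", now=599))      # still inside bantime
print(jail.is_banned("192.0.2.10", now=600))      # ban has expired
```

With maxretry = 1, a single matching log line is enough to ban the captured host for 10 minutes, so whatever the failregex captures as the host matters a great deal.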
Below is the regex configuration.
failregex = <HOST> -.*(EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|Googlebot/2\.1)
The log format is as follows:
[14/Jul/2020:11:38:09 +0000] 192.168.1.14 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET /sessionValueLink.action?crud=s&keyValue=JsMethodName&insertValue=submitShippingAddress();&dt=Tue%20Jul%2014%202020%2017:08:09%20GMT+0530%20(India%20Standard%20Time) HTTP/1.1" 200 44 [0/1894] "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/192.168.2.1 Safari/537.36"
When we run the regex test, we get the following result.
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-badbots.conf
Result
Failregex: 2438 total
|- #) [# of hits] regular expression
| 1) [2438] <HOST> -.*(Googlebot/2\.1)
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [113634] Day/MONTH/Year:Hour:Minute:Second
`-
Lines: 113634 lines, 0 ignored, 2438 matched, 111196 missed
Missed line(s): too many to print. Use --print-all-missed to print all 111196 lines
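The problem is that when we hit the site repeatedly as Googlebot, the Fail2ban log shows an IP address being banned, but it is not a valid IP address, and the bot traffic is not blocked. Please see the log below for reference.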
2020-07-14 14:17:18,330 fail2ban.filter [431]: WARNING Determined IP using DNS Lookup: 403 = ['0.0.1.147']
2020-07-14 14:17:18,330 fail2ban.filter [431]: WARNING Determined IP using DNS Lookup: 403 = ['0.0.1.147']
2020-07-14 14:17:18,612 fail2ban.actions[431]: INFO [apache-badbots] 0.0.1.147 already banned
2020-07-14 14:27:03,274 fail2ban.actions[431]: WARNING [apache-badbots] Unban 0.0.1.147
2020-07-14 14:38:40,817 fail2ban.filter [431]: WARNING Determined IP using DNS Lookup: 302 = ['0.0.1.46']
2020-07-14 14:38:41,073 fail2ban.actions[431]: WARNING [apache-badbots] Ban 0.0.1.46
2020-07-14 14:39:49,903 fail2ban.filter [431]: WARNING Determined IP using DNS Lookup: 403 = ['0.0.1.147']
2020-07-14 14:39:50,162 fail2ban.actions[431]: WARNING [apache-badbots] Ban 0.0.1.147
What mistake are we making here, and how can we fix it? I am new to Fail2ban, so any help would be much appreciated.
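We found the solution. The regex was the problem: it was not picking up the correct IP address from the log. Judging by the Fail2ban log above, <HOST> was capturing values such as 403 and 302, which look like HTTP status codes, rather than the client address. We changed the failregex to the one below, and it works fine.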
failregex = (?:\[\]\s+)?\<HOST> [^"]*"[^"]*" \d+ \d+ [^"]*"[^"]*\b(?:EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|Googlebot/2\.1)\b
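To make the difference between the two patterns concrete, here is a rough Python sketch that applies both to access-log lines. Several things in it are assumptions rather than Fail2ban internals: HOST is a simplified stand-in for the <HOST> tag (the real expansion depends on the Fail2ban version), the timestamp is assumed to be stripped so that only "[]" remains, the backslash before <HOST> in the posted fix is left out (assumed to be an escaping artifact), and both log lines are hypothetical variants of the sample line with a shortened request and user agent.

```python
import ipaddress
import re

# Simplified stand-in for Fail2ban's <HOST> tag (assumption, see above).
HOST = r"(?P<host>[\w\-.^_]*\w)"

BOTS = r"EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|Googlebot/2\.1"

# Original failregex from the question.
OLD = r"<HOST> -.*(" + BOTS + r")"
# Posted fix, without the backslash before <HOST>.
NEW = r'(?:\[\]\s+)?<HOST> [^"]*"[^"]*" \d+ \d+ [^"]*"[^"]*\b(?:' + BOTS + r")\b"


def captured_host(failregex, line):
    """Return the host captured by failregex, or None if the line does not match."""
    # Assumption: the leading timestamp is removed before matching, leaving "[]".
    stripped = re.sub(r"^\[[^\]]*\]", "[]", line)
    match = re.search(failregex.replace("<HOST>", HOST), stripped)
    return match.group("host") if match else None


UA = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
PREFIX = "[14/Jul/2020:11:38:09 +0000] 192.168.1.14 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "
# Hypothetical lines: one with a numeric response size, one with "-" as the size.
line_with_size = PREFIX + '"GET / HTTP/1.1" 200 44 [0/1894] ' + UA
line_with_dash = PREFIX + '"GET / HTTP/1.1" 403 - [0/1894] ' + UA

for name, rx in (("old", OLD), ("new", NEW)):
    print(name, captured_host(rx, line_with_size), captured_host(rx, line_with_dash))

# A bare number read as a 32-bit IPv4 address gives the addresses seen in the Fail2ban log.
print(ipaddress.IPv4Address(403), ipaddress.IPv4Address(302))
```

Under these assumptions the old pattern captures "403" from the line whose size field is "-" (the only place where a token is followed by " -") and does not match the other line at all, while the new pattern captures 192.168.1.14 from the line with a numeric size. The last line prints 0.0.1.147 and 0.0.1.46, the same addresses that appear in the ban messages. Note that, because of the `\d+ \d+` part, the new pattern in this sketch does not match lines that log "-" as the response size.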