Blocked bot IP in .htaccess still visiting website
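I am using .htaccess on my XenForo website to block the IP of a bot (crawler), because it is hammering the server.
I added three lines to make this change, but the bot keeps crawling my site.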
Order Deny,Allow
Deny from 93.158.178.201
RewriteCond %{HTTP_USER_AGENT} ^YandexBot [OR]
My entire .htaccess file looks like this:
# Mod_security can interfere with uploading of content such as attachments. If you
# cannot attach files, remove the "#" from the lines below.
#<IfModule mod_security.c>
# SecFilterEngine Off
# SecFilterScanPOST Off
#</IfModule>
ErrorDocument 401 default
ErrorDocument 403 default
ErrorDocument 404 default
ErrorDocument 500 default
<IfModule mod_rewrite.c>
RewriteEngine On
# If you are having problems with the rewrite rules, remove the "#" from the
# line that begins "RewriteBase" below. You will also have to change the path
# of the rewrite to reflect the path to your XenForo installation.
#RewriteBase /xenforo
# This line may be needed to enable WebDAV editing with PHP as a CGI.
#RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{HTTP_USER_AGENT} ^YandexBot [OR]
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(data/|js/|styles/|install/|favicon\.ico|crossdomain\.xml|robots\.txt) - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
</IfModule>
Order Deny,Allow
Deny from 93.158.178.201
Am I doing something wrong?
Edit: after your comments and answers, I still see the Yandex bot on my site...
93.158.178.201 - - [23/Sep/2015:13:56:23 +0200] "GET /threads/g%C3%BCnl%C3%BCk-blog2014.8514/page-87 HTTP/1.1" 403 521 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
Yandex robots send the following User-Agent string:
User-Agent: Mozilla/5.0 (compatible; Yandex...)
This string identifies Yandex robots. Robots can send GET (for example, YandexBot/3.0) and HEAD (YandexWebmaster/2.0) requests to a server. A reverse DNS lookup can be used to check the authenticity of Yandex robots.
Try this:
RewriteCond %{HTTP_USER_AGENT} compatible;\ yandex [NC]
RewriteRule ^ - [F,L]
The F flag returns 403 Forbidden to the bot, so it cannot crawl your pages.
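For context, here is a minimal sketch of how the pieces might fit together. It assumes mod_rewrite is enabled and that the block is placed above XenForo's own rewrite rules. The pattern in the question, ^YandexBot, is anchored to the start of the header, while the logged User-Agent starts with Mozilla/5.0, so that condition cannot match; an unanchored pattern can. The XenForo rules are copied from the question with the added YandexBot condition removed, since its trailing [OR] changed when the rule that follows it fires.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Refuse any request whose User-Agent contains "YandexBot" (case-insensitive).
# No ^ anchor: the real header begins with "Mozilla/5.0 (compatible; ...".
RewriteCond %{HTTP_USER_AGENT} YandexBot [NC]
RewriteRule ^ - [F]

# XenForo's rules from the question, without the extra YandexBot condition.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(data/|js/|styles/|install/|favicon\.ico|crossdomain\.xml|robots\.txt) - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
</IfModule>
```

A refused crawler still appears in the access log; what changes is the status code. The log line in the question already records status 403 for 93.158.178.201, which suggests the IP deny rule was taking effect even though the requests kept arriving.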