How do crawlers/bots work? Differentiating bot/crawler HTTP requests

Question

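I am working on a website.

I need to find out whether my website is being accessed by crawlers/bots from Google or any other search engine.

In my application I am intercepting HTTP requests, and I need to determine whether a given request was made by a crawler/bot that is crawling my site.

How can I do this?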

Answer 1

Check the user-agent string to see whether it belongs to a known bot. An example:

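    protected void Page_Load(object sender, EventArgs e)
    {
        // Request.UserAgent is null when the client sends no User-Agent header.
        string userAgent = Request.UserAgent ?? string.Empty;

        // Case-insensitive substring match on the bot's token.
        if (userAgent.IndexOf("Googlebot", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            // It's one of the Google robots.
        }
        else if (...)
        {
            ...
        }
    }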

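For Google, the list of user agents they use can be found here.

For the other search engines, you will have to look up their user agents yourself.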
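
To check several search engines at once, the same test can be moved into a small helper. The following is only a sketch: the `BotDetector` class, the `IsKnownBot` method, and the token list are hypothetical names and values chosen for illustration. The list is not exhaustive and should be filled in from each search engine's own documentation. Keep in mind that a client can send any user-agent string it likes, so a match only tells you what the request claims to be.

```csharp
using System;
using System.Linq;

public static class BotDetector
{
    // Illustrative, non-exhaustive tokens (an assumption, not an official list).
    private static readonly string[] KnownBotTokens =
    {
        "Googlebot", "bingbot", "Slurp", "DuckDuckBot", "Baiduspider", "YandexBot"
    };

    // Returns true when the user-agent string contains any known bot token.
    // The comparison is case-insensitive; a missing header counts as "not a known bot".
    public static bool IsKnownBot(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return false;
        }

        return KnownBotTokens.Any(token =>
            userAgent.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

Inside `Page_Load`, or wherever the request is intercepted, this would be called as `BotDetector.IsKnownBot(Request.UserAgent)`.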

c#

asp.net

seo

search-engine

google-search