Filter on Controller to check User Agent and then redirect if the result is true

------------ NOTE (EDIT) - I might be completely wrong about this, so if this is actually the wrong approach, any guidance would be appreciated (new to MVC)

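The solution contains a robots.txt file that blocks all crawlers from the site. The only problem is that the Facebook crawler/scraper does not obey it and keeps crawling/scraping the site, which causes an error to be logged and emailed every few minutes. The error being sent is "A public action method 'Customer' was not found on controller 'SolutionName.Web.Controllers.QuoteController'."

The fix I came up with is to create a filter on the controller that checks the user agent name. If the agent name belongs to Facebook, the request is redirected to a "no robots authentication" page. The filter has to be on the controller because the site caters for 3 different routes, each with a custom link, and customers reach the site through direct links shared on Facebook (so creating a route for this in the route config will not work).

The problem I am facing is that the solution does not redirect immediately from the controller filter. It goes on into the action methods (these action methods are partial pages) and then fails because it cannot redirect (the view has already started rendering, which is correct). Is there a way to redirect immediately the first time this filter is hit? Or is there a better solution?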

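To test and troubleshoot, I am changing the user agent in code to match what was logged. The error when redirecting from the filter: "Child actions are not allowed to perform redirect actions."

The error currently being logged because of Facebook's crawler: "A public action method 'Customer' was not found on controller 'SolutionName.Web.Controllers.QuoteController'."

User agent from the stack trace:

Here is what I did:

Custom filter: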

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Reflection;
    using System.Web;
    using System.Web.Mvc;

    namespace SolutionName.Web.Classes
    {
        public class UserAgentActionFilterAttribute : ActionFilterAttribute
        {
            public override void OnActionExecuting(ActionExecutingContext filterContext)
            {
                try
                {
                    List<string> Crawlers = new List<string>()
                    {
                        "facebookexternalhit/1.1","facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)","facebookexternalhit/1.1","Facebot"
                     };

                     string userAgent = HttpContext.Current.Request.UserAgent.ToLower();
                     bool iscrawler = Crawlers.Exists(x => userAgent.Contains(x));
                     if (userAgent != null && iscrawler)
                     {
                        filterContext.Result = new RedirectResult("~/Home/NoRobotsAuthentication");
                        return;
                     }
            
                    base.OnActionExecuting(filterContext);

                 }
                 catch (Exception errException)
                 {
                    LogHelper.LogException(Severity.Error, errException);
                    SessionHelper.PolicyBase = null;
                    SessionHelper.ClearQuoteSession();
                    filterContext.Result = new RedirectResult("~/Home/NoRobotsAuthentication");
                    return;
                }
            }
        }
    }

NoRobotsAuthentication.cshtml:

    @{
        ViewBag.PageTitle = "Robots not authorized";
        Layout = "~/Views/Shared/_LayoutClean2.cshtml";
    }

    <div class="container body-content">
        <div class="row">
            <div class="col-lg-12 col-md-12 col-sm-12 col-xs-12 container-solid">
                <div class="form-horizontal">
                    <h3>@ViewBag.NotAuthorized</h3>
                </div>
            </div>
        </div>
    </div>

NoRobots action method:

    #region Bot Detection
    public ActionResult NoRobotsAuthentication()
    {
        ViewBag.NotAuthorized = "Robots / Scrapers not authorized!";
        return View();
    }

    #endregion

One of the controllers I want to check:

    namespace SolutionName.Web.Controllers
    {
        [UserAgentActionFilter]
        public class QuoteController : Controller
        {

            public ActionResult Customer()
            {
                // Some logic
            }
        }
    }

The partial page ActionResult that is hit and errors when the filter runs:

    public ActionResult _Sidebar()
    {
        var model = SessionHelper.PolicyBase;
        return PartialView("_Sidebar", model);
    }

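That is because you are using an ActionFilterAttribute. If you look at the documentation here: https://docs.microsoft.com/en-us/aspnet/core/mvc/controllers/filters?view=aspnetcore-3.1 it explains the filter lifecycle. Basically, by the time you reach an action filter it is too late. You need an authorization filter or a resource filter so that you can short-circuit the request.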

Each filter type is executed at a different stage in the filter pipeline:

Authorization Filters

  • Authorization filters run first and are used to determine whether the user is authorized for the request.
  • Authorization filters short-circuit the pipeline if the request is not authorized.

Resource filters

  • Run after authorization.
  • OnResourceExecuting runs code before the rest of the filter pipeline. For example, OnResourceExecuting runs code before model binding.
  • OnResourceExecuted runs code after the rest of the pipeline has completed.

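The example below is taken from the documentation and is an implementation of a resource filter. Presumably an authorization filter could be implemented in a similar way, but I believe returning a valid HTTP status code after an authorization filter fails may be a bit of an anti-pattern.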

    // See that it's implementing IResourceFilter
    public class ShortCircuitingResourceFilterAttribute : Attribute, IResourceFilter
    {
        public void OnResourceExecuting(ResourceExecutingContext context)
        {
            context.Result = new ContentResult()
            {
                Content = "Resource unavailable - header not set."
            };
        }

        public void OnResourceExecuted(ResourceExecutedContext context)
        {
        }
    }
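
For comparison, here is a rough sketch of what the authorization-filter variant might look like in ASP.NET Core. The class name `BlockCrawlerAuthorizationFilterAttribute` and the choice of a 403 response are my own assumptions, not something from the docs or from your code, and I have not run this against your project.

```csharp
using System;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Filters;

// Hypothetical name, used for illustration only.
public class BlockCrawlerAuthorizationFilterAttribute : Attribute, IAuthorizationFilter
{
    public void OnAuthorization(AuthorizationFilterContext context)
    {
        // ToString() on the header value yields an empty string when the header is absent.
        string userAgent = context.HttpContext.Request.Headers["User-Agent"].ToString();

        // Case-insensitive substring check for the Facebook crawler token.
        if (userAgent.IndexOf("facebookexternalhit", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            // Setting Result short-circuits the pipeline before model binding and the action.
            // 403 is an assumption; a redirect result could be assigned here instead.
            context.Result = new StatusCodeResult(403);
        }
    }
}
```

Authorization filters run before resource filters, so this would reject the request at the earliest filter stage.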

I have tried to merge this with what you provided. Note that it may not work out of the box.

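    using System;
    using System.Collections.Generic;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.AspNetCore.Mvc.Filters;

    public class ShortCircuitingResourceFilterAttribute : Attribute, IResourceFilter
    {
        // You had duplicates in your list, try to use a HashSet for .Contains lookups.
        // The case-insensitive comparer lets "Facebot" match regardless of casing.
        // Note: this is an exact match on the whole header, whereas your code matched by substring.
        private static readonly HashSet<string> CrawlerSet =
            new HashSet<string>(StringComparer.OrdinalIgnoreCase)
            {
                "facebookexternalhit/1.1",
                "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
                "Facebot"
            };

        public void OnResourceExecuting(ResourceExecutingContext context)
        {
            try
            {
                // ASP.NET Core has no HttpContext.Current, so read the header from the filter context.
                string userAgent = context.HttpContext.Request.Headers["User-Agent"].ToString();

                // Your code checked userAgent for null only after calling .ToLower() on it,
                // which would already have thrown. Check it first, and only once.
                if (!string.IsNullOrEmpty(userAgent) && CrawlerSet.Contains(userAgent))
                {
                    // Some crawler: setting Result short-circuits the rest of the pipeline.
                    context.Result = new RedirectResult("~/Home/NoRobotsAuthentication");
                }
            }
            catch (Exception errException)
            {
                // LogHelper, Severity and SessionHelper are the helpers from your own solution.
                LogHelper.LogException(Severity.Error, errException);
                SessionHelper.PolicyBase = null;
                SessionHelper.ClearQuoteSession();
                context.Result = new RedirectResult("~/Home/NoRobotsAuthentication");
            }
        }

        public void OnResourceExecuted(ResourceExecutedContext context)
        {
        }
    }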
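
One caveat: your usings reference System.Web.Mvc, and if the project really is classic ASP.NET MVC rather than ASP.NET Core, `IResourceFilter` is not available there. In that case something along these lines might be closer to what you need. This is only a sketch under that assumption: the name `BlockCrawlerAttribute` and the fragment list are mine. It also skips child actions, on the guess that `_Sidebar` is rendered as a child action (for example via `Html.Action`), which is where "Child actions are not allowed to perform redirect actions" is raised.

```csharp
using System;
using System.Web.Mvc;

namespace SolutionName.Web.Classes
{
    // Hypothetical name; a sketch for System.Web.Mvc, not for ASP.NET Core.
    public class BlockCrawlerAttribute : FilterAttribute, IAuthorizationFilter
    {
        // Assumed fragments, derived from the user agents listed in the question.
        private static readonly string[] CrawlerFragments = { "facebookexternalhit", "Facebot" };

        public void OnAuthorization(AuthorizationContext filterContext)
        {
            // A child action cannot redirect, so leave it alone. This assumes the
            // parent action's controller carries the same attribute and was checked.
            if (filterContext.IsChildAction)
            {
                return;
            }

            string userAgent = filterContext.HttpContext.Request.UserAgent;
            if (string.IsNullOrEmpty(userAgent))
            {
                return;
            }

            foreach (string fragment in CrawlerFragments)
            {
                // Case-insensitive substring match against the raw header.
                if (userAgent.IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    filterContext.Result = new RedirectResult("~/Home/NoRobotsAuthentication");
                    return;
                }
            }
        }
    }
}
```

You would apply it the same way as your current filter, with `[BlockCrawler]` on `QuoteController`. If the Home controller also carried the attribute, the redirect target would itself be redirected, so it would need to be left off there.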