Selenium C# driver.PageSource - 'is too long, or a component of the specified path is too long.'

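I am trying to pass driver.PageSource from Selenium C# to HTML Agility Pack, but the line htmlDoc.Load(driver.PageSource); throws the error: '...' is too long, or a component of the specified path is too long.

P.S. When I do the same thing in Python instead of C#, using Selenium Python and Beautiful Soup, I don't get this error.

How can I fix this?

Full code: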

using System;
using System.Linq; // needed for the .ToList() call below
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

namespace SeleniumSharp
{
    public static class WebScraping
    {
        public static void GetPageData()
        {
            // initial setup
            IWebDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("<url>");

            // dropdown
            var dropdown1 = driver.FindElement(By.Id("cpMain_ucc1_ctl00_liResidentialFront"));
            dropdown1.Click();
            
            // enter search query
            var search = driver.FindElement(By.Id("cpMain_ucc1_ctl00_txtResidentialSearchBox"));
            search.Click();
            search.SendKeys("london");
            Thread.Sleep(3000);

            // submit search
            var submit = driver.FindElement(By.XPath("//div[@id='cpMain_ucc1_ctl00_pnlContentResidential']//a[@class='search-button']"));
            submit.Click();

            // Html Agility Pack
            HtmlDocument htmlDoc = new HtmlDocument();
            htmlDoc.Load(driver.PageSource);

            var address = htmlDoc.DocumentNode
                .SelectNodes("//div[@class='grid-address']")
                .ToList();

            foreach(var item in address)
            {
                Console.WriteLine(item.InnerText);
            }

        }

        
    }
}

This line throws the error:

htmlDoc.Load(driver.PageSource);

Error:

'<html source>'is too long, or a component of the specified path is too long.
at System.IO.PathHelper.GetFullPathName(ReadOnlySpan`1 path, ValueStringBuilder& builder)
   at System.IO.PathHelper.Normalize(String path)
   at System.IO.Path.GetFullPath(String path)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)  
   at System.IO.StreamReader..ctor(String path, Encoding encoding)
   at HtmlAgilityPack.HtmlDocument.Load(String path)

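This happens because you are calling Load instead of LoadHtml. Load takes the path of a file that contains HTML, not the HTML source itself (driver.PageSource). That is why the stack trace goes through System.IO.Path.GetFullPath: the entire page source is being treated as a file path.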

// From File
var doc = new HtmlDocument();
doc.Load(filePath);

// From String
var doc = new HtmlDocument();
doc.LoadHtml(html);

So try using:

htmlDoc.LoadHtml(driver.PageSource);
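
For illustration, here is a minimal sketch of the LoadHtml path that runs without a browser. The hard-coded HTML string stands in for driver.PageSource, and the class name LoadHtmlSketch, the helper ExtractAddresses and the sample address are hypothetical names made up for this example. It also guards against SelectNodes returning null, which is what HTML Agility Pack does when the XPath matches nothing, so the .ToList() call in the question would fail in that case.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LoadHtmlSketch
{
    // Hypothetical helper: parses an HTML string (not a file path)
    // and returns the trimmed text of every div with class 'grid-address'.
    public static List<string> ExtractAddresses(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // LoadHtml parses the string itself; Load would treat it as a path

        var results = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//div[@class='grid-address']");
        if (nodes == null)
        {
            return results;
        }

        foreach (var node in nodes)
        {
            results.Add(node.InnerText.Trim());
        }
        return results;
    }

    public static void Main()
    {
        // Stand-in for driver.PageSource (assumed sample markup).
        string html = "<html><body><div class='grid-address'>1 Example Street, London</div></body></html>";

        foreach (var address in ExtractAddresses(html))
        {
            Console.WriteLine(address);
        }
    }
}
```

In the original program, the same helper could be called as ExtractAddresses(driver.PageSource) once the results page has finished loading.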