Port BeautifulSoup to HtmlAgilityPack

I'm trying to port this program from Python to C#:

from __future__ import print_function

import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.forexfactory.com/calendar.php?day=nov18.2016')
soup = BeautifulSoup(r.text, 'lxml')

tables = soup.findAll("table", {'class':'calendar__table'})

for table in tables:
    for row in table.findAll("tr"):
        for cell in row.findAll("td"):
            print (cell.text, end = " ")
        print()

Here is my attempt in C# using HtmlAgilityPack, but it doesn't work:

HtmlWeb browser = new HtmlWeb();
string URI = "http://www.forexfactory.com/calendar.php?day=nov18.2016";

ServicePointManager.ServerCertificateValidationCallback += (sender, cert, chain, sslPolicyErrors) => true;
ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3 | SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;

HtmlDocument document = browser.Load(URI);

foreach (HtmlNode row in document.DocumentNode.Descendants("table").FirstOrDefault(_ => _.Id.Equals("calendar__table")).Descendants("tr"))
    Console.WriteLine(row);

You can query by id and select a single node with this code:

document.DocumentNode.SelectSingleNode("//table[@id='calendar_table']").Descendants("tr");

But I guess you need to query by class rather than by id, so the code would look like this:

document.DocumentNode.SelectSingleNode("//table[@class='calendar_table']").Descendants("tr");

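Also, the class name in the Python code has two underscores (__), but the C# XPath above uses only one (_). The name has to match the page's markup exactly, otherwise nothing is selected.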
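
To see why the underscore count matters, here is a minimal Python sketch of the same class-based XPath lookup. It is only an illustration under a few assumptions: the `SAMPLE` markup is a made-up, well-formed stand-in for the real page, `rows_for_class` is a hypothetical helper, and the standard library's ElementTree is used because it understands this small XPath subset (real-world HTML is often not well-formed enough for it).

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for the real page (assumption for illustration).
SAMPLE = """
<html><body>
  <table class="calendar__table">
    <tr><td>Fri Nov 18</td><td>USD</td></tr>
    <tr><td>8:30am</td><td>CAD</td></tr>
  </table>
</body></html>
"""


def rows_for_class(markup, class_name):
    """Return the cell texts, row by row, of the first table whose
    class attribute is exactly class_name (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    # Same idea as //table[@class='...'] in the C# snippet: an exact
    # string comparison on the attribute value.
    table = root.find(f".//table[@class='{class_name}']")
    if table is None:
        # No match: return nothing instead of failing on a missing node.
        return []
    return [
        [(td.text or "").strip() for td in tr.findall("td")]
        for tr in table.iter("tr")
    ]


print(rows_for_class(SAMPLE, "calendar__table"))  # two underscores: matches
print(rows_for_class(SAMPLE, "calendar_table"))   # one underscore: no match
```

With one underscore the lookup finds no table at all. In HtmlAgilityPack the equivalent situation is SelectSingleNode returning null, so it is worth checking the result before calling Descendants on it.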