Working with HtmlAgilityPack, nested List and Linq

List<List<string>> table = playerDoc.DocumentNode
    .SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[4]/table/tbody")
    .Descendants("tr")
    .Skip(1)
    .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
    .ToList();

I have this code block, which collects all the correct information from the table on a website. My problem is that the data looks like this:

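I'm trying to figure out how to search the data for 2 matching strings, e.g. S16 and Pre, and use the matching row to populate a class called CareerProperties (a class of properties; I can post it if needed). I've tried different variations of LINQ statements and foreach loops, but I either get an exception thrown or I get everything in the table.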

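I'm trying to simplify my code because retrieving the data with a foreach loop and XPaths takes about 3-4 seconds, whereas when I timed the LINQ statement it reported Elapsed: 00:00:00.0068306.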

Any help would be greatly appreciated, as I'm still learning C#. If I need to post a sample web page or any other part of the code, I will. Thanks.

Edit:

foreach (var careerStats in findCareerNode)
{
    if (careerStats
        .SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[1]").InnerText.Trim() != seasonId)
    {
        index++;
        continue;
    }
    else if (careerStats
       .SelectSingleNode(
           $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[2]")
       .InnerText.Trim() != "Reg")
    {
        index++;
        continue;
    }
    var type = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[2]")
        .InnerText;
    var record = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[3]")
        .InnerText;
    var amr = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[4]")
        .InnerText ?? "0.0";
    var goals = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[5]")
        .InnerText;
    var assists = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[6]")
        .InnerText;
    var sot = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[7]")
        .InnerText;
    var shots = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[8]")
        .InnerText;
    var passC = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[9]")
        .InnerText;
    var passA = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[10]")
        .InnerText;
    var keypass = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[11]")
        .InnerText;
    var interceptions = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[12]")
        .InnerText;
    var tac = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[13]")
        .InnerText;
    var tacA = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[14]")
        .InnerText;
    var blk = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[15]")
        .InnerText;
    var rc = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[16]")
        .InnerText;
    var yc = careerStats
        .SelectSingleNode(
            $"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[17]")
        .InnerText;
    ...
}

To filter the career stats table, you can use the LINQ method Where. The filtered rows can then be turned into a list of CareerProperties objects with the LINQ method Select.

Here is how we can get the career stats for the selected seasonId and "Reg":

// Now the return type is a List of CareerProperties.
List<CareerProperties> table = playerDoc.DocumentNode
    .SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[4]/table/tbody")
    .Descendants("tr")
    .Skip(1)
    // Up to here is your code. Here you select all rows from the table.
    // Each row is presented as List<string>.
    .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
    // Here we filter table rows by "seasonId" and "Reg".
    .Where(tr => tr[0] == seasonId && tr[1] == "Reg")
    // Here we create CareerProperties objects from the filtered rows.
    // List indexes are zero-based, so XPath td[2] in the question is tr[1] here.
    .Select(tr => new CareerProperties
        {
            Type = tr[1],
            Record = tr[2],
            Amr = tr[3],
            Goals = tr[4],
            Assists = tr[5],
            // Fill other properties.
            ...
        })
    .ToList();
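
Since you already have the rows as a List<List<string>>, the same Where/Select step can also be tried on its own, without HtmlAgilityPack. The following is only a minimal sketch under a few assumptions: the CareerProperties class was not posted, so the string properties here are a guess at its shape; the CareerStats.Filter helper name and the sample rows are made up for illustration; and the column order (season, type, record, AMR, goals, assists) is inferred from the td[1]..td[6] XPaths in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the class; the real CareerProperties was not posted.
public class CareerProperties
{
    public string Type { get; set; } = "";
    public string Record { get; set; } = "";
    public string Amr { get; set; } = "";
    public string Goals { get; set; } = "";
    public string Assists { get; set; } = "";
}

public static class CareerStats
{
    // Returns the rows whose first two cells match seasonId and type,
    // mapped to CareerProperties. Rows with fewer than 6 cells are skipped
    // so that the indexers below cannot throw.
    public static List<CareerProperties> Filter(
        IEnumerable<List<string>> table, string seasonId, string type)
    {
        return table
            .Where(row => row.Count >= 6 && row[0] == seasonId && row[1] == type)
            .Select(row => new CareerProperties
            {
                Type = row[1],
                Record = row[2],
                // An empty cell falls back to "0.0", mirroring the fallback in the question.
                Amr = string.IsNullOrEmpty(row[3]) ? "0.0" : row[3],
                Goals = row[4],
                Assists = row[5],
            })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample rows standing in for the scraped table.
        var table = new List<List<string>>
        {
            new List<string> { "S16", "Pre", "1-0-0", "7.1", "2", "1" },
            new List<string> { "S16", "Reg", "5-2-1", "", "9", "4" },
            new List<string> { "S15", "Reg", "3-3-2", "6.8", "4", "2" },
        };

        var stats = CareerStats.Filter(table, "S16", "Reg");

        // prints "1 row(s), goals: 9, AMR: 0.0"
        Console.WriteLine($"{stats.Count} row(s), goals: {stats[0].Goals}, AMR: {stats[0].Amr}");
    }
}
```

The row.Count check is there because a short row (a header or spacer row, for example) would otherwise make row[0] or row[1] throw an ArgumentOutOfRangeException, which may be one source of the exceptions you saw. Note also that SelectSingleNode returns null when the XPath matches nothing, so checking its result before calling Descendants avoids a NullReferenceException.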