Get Rows and Columns of an embedded table within another table via HTML Agility Pack

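VB 2012, using HTML Agility Pack. I have spent a few hours trying to figure this out, and it comes down to my ignorance of how the input is queried. Here is the situation. This is my input: a simple HTML table with two other tables embedded in it.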

<table cellpadding="0" cellspacing="0" border="0">
    <tr>
        <td width="100%">
            <table cellpadding="0" cellspacing="0" border="0" class="plan">
                <tr>
                    <td class="textBold" valign="bottom">XX&nbsp;<u>999</u></td>
                    <td class="centerText" valign="bottom">X1</td>
                    <td class="centerText" valign="bottom">X2</td>
                    <td class="centerText" valign="bottom">X3</td>
                    <td class="centerText" valign="bottom">X4</td>
                    <td class="centerText" valign="bottom">X5</td>
                    <td class="centerTextTotal" valign="bottom">TOTAL</td>
                </tr>
                <tr>
                    <td class="Text">PRIMARY</td>
                    <td class="centerText">4</td>
                    <td class="centerText">8</td>
                    <td class="centerText">&nbsp;</td>
                    <td class="centerText">1</td>
                    <td class="centerText">3</td>
                    <td class="centerTextTotal">16</td>
                </tr>
                <tr>
                    <td class="TextColor">SECONDARY</td>
                    <td class="centerTextColor">&nbsp;</td>
                    <td class="centerTextColor">&nbsp;</td>
                    <td class="centerTextColor">2</td>
                    <td class="centerTextColor">&nbsp;</td>
                    <td class="centerTextColor">2</td>
                    <td class="centerTextTotal">4</td>
                </tr>
                <tr>
                    <td class="TextTotal">TOTAL</td>
                    <td class="centerTextTotal">4</td>
                    <td class="centerTextTotal">8</td>
                    <td class="centerTextTotal">2</td>
                    <td class="centerTextTotal">1</td>
                    <td class="centerTextTotal">5</td>
                    <td class="centerTextTotal">20</td>
                </tr>
            </table>
        </td>
    </tr>
    <tr>
        <td width="100%">
            <table cellpadding="0" cellspacing="0" border="0" width="100%">
                <tr>
                    <td width="75%" class="" textcolorvalign="bottom">Number of fuelings:0</td>
                    <td width="25%" class="" textcolorvalign="bottom" align="right">Meals:2</td>
                </tr>
            </table>
        </td>
    </tr>
</table>

I am only interested in the data in the inner table "plan".

        Dim html As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument
        html.OptionOutputAsXml = False
        html.LoadHtml(htmlTable)

        Dim docNode As HtmlAgilityPack.HtmlNode = html.DocumentNode

        'parse the plan table if it exists
        If docNode IsNot Nothing Then
            Dim hTable As HtmlAgilityPack.HtmlNode = docNode.SelectSingleNode("//table[@class='plan']")
            If hTable IsNot Nothing Then
                For Each hRow As HtmlAgilityPack.HtmlNode In hTable.SelectNodes("//table[@class='plan']//tr") '"//tr"
                    Debug.Print("   InnerText=>[{0}] InnerHtml=>[{1}]", hRow.InnerText, hRow.InnerHtml)

                    For Each hCol As HtmlAgilityPack.HtmlNode In hRow.SelectNodes("//table[@class='plan']//tr//td") '"//td"
                        Debug.Print("      InnerText=>[{0}] InnerHtml=>[{1}]", hCol.InnerText, hCol.InnerHtml)
                    Next hCol
                Next hRow
            End If
        End If

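The trailing comments on the two For Each lines show the strings I originally used: //tr and //td. My logic was that, because I am working with the nodes hTable and hRow, I would get their respective child nodes. However, that seems to return all rows and all columns of all tables. After testing, it seems I have to fully qualify each loop with //table[@class='plan']//tr and //table[@class='plan']//tr//td. Why is that? It makes no sense to me, since I am explicitly using the child node objects hTable and hRow.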

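In XPath, an expression that starts with // searches from the document root, regardless of which node you call SelectNodes on. To search from the current context node you need .// instead. So try .//tr and .//td to search relative to the current element. Note that the fully qualified //table[@class='plan']//tr//td in your inner loop is also evaluated from the root, so it returns every cell of the plan table on each row iteration, not just the cells of hRow.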
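
For illustration, here is a minimal sketch of the loops rewritten with relative paths. It assumes the same htmlTable string as in the question; the module name PlanTableDemo and the helper PrintPlanTable are hypothetical names introduced only for this example. It also guards against SelectNodes returning Nothing, which it does by default when nothing matches.

```vb
Imports System.Diagnostics
Imports HtmlAgilityPack

Module PlanTableDemo
    ' Hypothetical helper (not from the question): prints each row and cell
    ' of the table with class "plan", using paths relative to the current node.
    Sub PrintPlanTable(htmlTable As String)
        Dim html As New HtmlDocument()
        html.LoadHtml(htmlTable)

        ' Absolute path is fine here: we do want to search the whole document.
        Dim hTable As HtmlNode = html.DocumentNode.SelectSingleNode("//table[@class='plan']")
        If hTable Is Nothing Then Return

        ' ".//tr" = descendants of hTable only, not every <tr> in the document.
        Dim hRows As HtmlNodeCollection = hTable.SelectNodes(".//tr")
        If hRows Is Nothing Then Return   ' no rows matched

        For Each hRow As HtmlNode In hRows
            Debug.Print("   InnerText=>[{0}]", hRow.InnerText)

            ' ".//td" = cells that belong to this row only.
            Dim hCols As HtmlNodeCollection = hRow.SelectNodes(".//td")
            If hCols Is Nothing Then Continue For

            For Each hCol As HtmlNode In hCols
                Debug.Print("      InnerText=>[{0}]", hCol.InnerText)
            Next hCol
        Next hRow
    End Sub
End Module
```

If a cell of the plan table could itself contain another table, "./td" (direct children only) may be a safer choice than ".//td" in the inner loop, since .// also descends into any nested tables.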