Retrieve part of the text inside <li>

I have HTML like this:

<li class="in-ttl-b">(a) kanji; a Chinese character [ideograph]
    <ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">漢字で書く</span></li><li class="text-jeen text-c">write in <i>kanji</i> [<i>Chinese characters</i>]</li></ul>
    <ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">常用漢字</span></li><li class="text-jeen text-c"><i>Chinese characters</i> for everyday use (in Japan)</li></ul>
</li>

How can I get only the text kanji; a Chinese character [ideograph]?

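You can get it by selecting the first text node that is a direct child of the outer <li> element. For example, assuming there can be multiple instances of <li class="in-ttl-b">: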

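Dim lis = HTMLDoc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
For Each li As HtmlNode In lis
    'select the first text node that is a direct child of <li>:
    Dim txt = li.SelectSingleNode("text()[1]")
    'print only that text node, not li.InnerText (which would include the nested lists)
    Console.WriteLine(txt.InnerText.Trim())
Next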

Output:

(a) kanji; a Chinese character [ideograph]
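
For completeness, here is a self-contained sketch of the same idea with a couple of defensive checks. It assumes the HtmlAgilityPack NuGet package is referenced. The module name Program and the inline html string (shortened from the question) are only illustrative. SelectNodes returns Nothing when no node matches, so that case is checked before looping, and the normalize-space() predicate is added here as an assumption that you want to skip whitespace-only text nodes if the markup starts with a line break.

```vbnet
Imports System
Imports HtmlAgilityPack

Module Program
    Sub Main()
        ' Sample markup shortened from the question (illustrative only).
        Dim html As String =
            "<li class=""in-ttl-b"">(a) kanji; a Chinese character [ideograph]" &
            "<ul class=""list-data-b-in"">" &
            "<li class=""text-jejp text-c""><span class=""ex"">漢字で書く</span></li>" &
            "</ul></li>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim lis = doc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If lis Is Nothing Then Return

        For Each li As HtmlNode In lis
            ' First direct-child text node that is not whitespace-only.
            Dim txt = li.SelectSingleNode("text()[normalize-space()][1]")
            If txt IsNot Nothing Then
                ' Decode entities such as &amp; and strip surrounding whitespace.
                Console.WriteLine(HtmlEntity.DeEntitize(txt.InnerText).Trim())
            End If
        Next
    End Sub
End Module
```

Because the XPath is relative to each matched <li> and uses text() rather than .//text(), text inside the nested <ul> elements is never considered.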