VB.NET: get all attribute values using HtmlAgilityPack
This is the HTML:
<div id="catlist-listview" class="cat-listview cat-listbsize">
<ul>
<li><a href="http://wantedlink1" rel="bookmark" title="sometitel1" class="sonra">title1</a></li>
<li><a href="http://wantedlink2" rel="bookmark" title="sometitel2" class="sonra">title2</a></li>
<li><a href="http://wantedlink3" rel="bookmark" title="sometitel3" class="sonra">title3</a></li>
<li><a href="http://wantedlink4" rel="bookmark" title="sometitel4" class="sonra">title4</a></li>
<li><a href="http://wantedlink5" rel="bookmark" title="sometitel5" class="sonra">title5</a></li>
<li><a href="http://wantedlink6" rel="bookmark" title="sometitel6" class="sonra">title6</a></li>
<li><a href="http://wantedlink7" rel="bookmark" title="sometitel7" class="sonra">title7</a></li>
<li><a href="http://wantedlink8" rel="bookmark" title="sometitel8" class="sonra">title8</a></li>
<li><a href="http://wantedlink9" rel="bookmark" title="sometitel9" class="sonra">title9</a></li>
<li><a href="http://wantedlink10 " rel="bookmark" title="sometitel10" class="sonra">title10</a></li>
</ul>
</div>
My code is:
Dim htmldoc As New HtmlDocument
htmldoc.LoadHtml(source)
For Each link As HtmlNode In htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul")
    TextBox3.Text = link.InnerHtml
Next
The output is:
<li><a href="http://wantedlink1" rel="bookmark" title="sometitel1" class="sonra">title1</a></li>
<li><a href="http://wantedlink2" rel="bookmark" title="sometitel2" class="sonra">title2</a></li>
<li><a href="http://wantedlink3" rel="bookmark" title="sometitel3" class="sonra">title3</a></li>
<li><a href="http://wantedlink4" rel="bookmark" title="sometitel4" class="sonra">title4</a></li>
<li><a href="http://wantedlink5" rel="bookmark" title="sometitel5" class="sonra">title5</a></li>
<li><a href="http://wantedlink6" rel="bookmark" title="sometitel6" class="sonra">title6</a></li>
<li><a href="http://wantedlink7" rel="bookmark" title="sometitel7" class="sonra">title7</a></li>
<li><a href="http://wantedlink8" rel="bookmark" title="sometitel8" class="sonra">title8</a></li>
<li><a href="http://wantedlink9" rel="bookmark" title="sometitel9" class="sonra">title9</a></li>
<li><a href="http://wantedlink10 " rel="bookmark" title="sometitel10" class="sonra">title10</a></li>
I only want to get the links themselves, from http://wantedlink1 to http://wantedlink10.
I tried Attributes("href"), but I only get one link.
I want to list all the links like this:
http://wantedlink1
http://wantedlink2
http://wantedlink3
.
.
.
http://wantedlink10
Any help?
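Basically, you can change the XPath passed to SelectNodes() so that it selects the individual <a> elements instead of the <ul>. From there, it is easy to loop through the result and read the href attribute of each element, one by one. Alternatively, you can use LINQ to achieve the same effect, for example: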
' Select the <a> elements
Dim links = htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")
' Project to an IEnumerable of href attribute values
Dim hrefs = links.Cast(Of HtmlNode)().Select(Function(x) x.GetAttributeValue("href", ""))
' Join the hrefs, separated by newlines, into one string
TextBox3.Text = String.Join(Environment.NewLine, hrefs)
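The answer mentions a plain loop but only shows the LINQ version, so here is a minimal sketch of the loop. The module name `HrefExtractor` and the helper `ExtractHrefs` are hypothetical, introduced only for illustration. The `Trim()` call is also an assumption, added because one `href` in the sample HTML has a trailing space. A likely reason only one link showed up in the original attempt is that assigning to `TextBox3.Text` inside the loop overwrites the previous value on each iteration. This sketch collects the values first instead.

```vbnet
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module HrefExtractor
    ' Hypothetical helper (not from the original post): returns the href of
    ' every <a> inside the #catlist-listview list, in document order.
    Function ExtractHrefs(source As String) As List(Of String)
        Dim htmldoc As New HtmlDocument()
        htmldoc.LoadHtml(source)

        Dim hrefs As New List(Of String)()
        Dim links = htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")

        ' By default SelectNodes returns Nothing, not an empty collection,
        ' when the XPath matches no nodes.
        If links Is Nothing Then Return hrefs

        For Each link As HtmlNode In links
            ' Assumption: trim whitespace, since one sample href ends with a space.
            hrefs.Add(link.GetAttributeValue("href", "").Trim())
        Next

        Return hrefs
    End Function
End Module
```

The result can then be written to the text box in a single assignment, for example `TextBox3.Text = String.Join(Environment.NewLine, ExtractHrefs(source))`.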