How can I use VB.NET to read and print the inner HTML text of all HTML labels on a webpage?

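So I have the HTML Agility Pack.

I'm trying to read a webpage's HTML. I need the content of a label, but I'm not sure how to get it.

I know what the for attribute is, but I don't know how to use it to get the label's inner HTML.

Can anyone help?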

Private Sub SetTextboxText(ByVal Text As String)
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_userID"), mshtml.HTMLInputElement).value = ""
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_pwd"), mshtml.HTMLInputElement).value = ""
    ClickNormalButton()
    Memorable_Reader()
End Sub

'Gets and Sets Memorable Information
Private Sub Memorable_Reader()
    'Read Label 'For' Attribute
    'Display Innerhtml Text in msgbox
End Sub

'CLICKS THE SUBMIT BUTTON
Private Sub ClickNormalButton()
    GetCurrentWebForm.submit()
End Sub

Update:

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    WebBrowser1.Navigate("https://online.lloydsbank.co.uk/personal/logon/login.jsp?WT.ac=PLO0512")
    Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
    htmlDoc.LoadHtml(WebBrowser1.DocumentText)
    Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='frmLogin:strCustomerLogin_userID']")
    Dim labelText = ""
    If labelElement IsNot Nothing Then
        labelText = labelElement.InnerText
    End If

    MsgBox(labelText) ' <-- shows an empty string ("")
    MsgBox(labelElement.InnerText) ' <-- same as above
End Sub

First, look at this simple example:

Dim htmlString = "<form><label for='something'>text text</label></form>"
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
'parse the string; without this call the document is empty and nothing is found
htmlDoc.LoadHtml(htmlString)
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If

Now the labelText variable contains "text text".

Here is an example that loads the HTML from a given link using WebClient:

Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
Dim webClient As New System.Net.WebClient
Dim html As String = ""
'add your web page link here
html = webClient.DownloadString("http://yourlink.com/")
htmlDoc.LoadHtml(html)
'and here add your for attribute value for that label instead of something
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If

Update: Since you said you already have the page open in a WebBrowser control, use the DocumentText property to get the HTML text, as follows:

Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
htmlDoc.LoadHtml(WebBrowser1.DocumentText)
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If

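**Update:** An example of how to get the HTML string from the WebBrowser control. Navigate is asynchronous, so DocumentText is still empty right after the call; read it in the DocumentCompleted handler, once the page has finished loading: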

Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        WebBrowser1.Navigate("https://www.google.com")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show(WebBrowser1.DocumentText)
    End Sub
End Class
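
Since the question asks for the text of all labels, the pieces above can be combined roughly as follows. This is only a sketch, assuming the HTML Agility Pack is referenced in the project; the module name LabelReader and the function name GetLabelTexts are made up for illustration and are not part of any library. The types are fully qualified because a WinForms project also has its own HtmlDocument class.

```vb
Imports System.Collections.Generic

' Hypothetical helper module (the name is an assumption, not a library type).
Public Module LabelReader
    ' Returns the text of every <label> that has a "for" attribute,
    ' keyed by the value of that attribute.
    Public Function GetLabelTexts(html As String) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.LoadHtml(html)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim labels As HtmlAgilityPack.HtmlNodeCollection =
            htmlDoc.DocumentNode.SelectNodes("//label[@for]")
        If labels Is Nothing Then Return result

        For Each label As HtmlAgilityPack.HtmlNode In labels
            Dim target As String = label.GetAttributeValue("for", "")
            ' Decode entities such as &amp; and trim surrounding whitespace.
            ' If two labels share a "for" value, the later one wins.
            result(target) = HtmlAgilityPack.HtmlEntity.DeEntitize(label.InnerText).Trim()
        Next
        Return result
    End Function
End Module
```

Calling LabelReader.GetLabelTexts(WebBrowser1.DocumentText) from the WebBrowser1_DocumentCompleted handler and looping over the result with MessageBox.Show would then display each label's text. Note that DocumentCompleted can fire more than once on pages with frames, and DocumentText holds the page source as downloaded, so labels that the page creates later with script may not show up this way.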