How can I use VB.NET to read and print the inner HTML text of all HTML labels on a webpage?
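So I have HTML Agility Pack.
I am trying to read a web page's HTML. I need the content of a label, but I'm not sure how to get it.
I know what the label's for attribute is, but I don't know how to use it to get the label's inner HTML.
Can anyone help?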
Private Sub SetTextboxText(ByVal Text As String)
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_userID"), mshtml.HTMLInputElement).value = ""
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_pwd"), mshtml.HTMLInputElement).value = ""
    ClickNormalButton()
    Memorable_Reader()
End Sub

'Gets and Sets Memorable Information
Private Sub Memorable_Reader()
    'Read Label 'For' Attribute
    'Display Innerhtml Text in msgbox
End Sub

'CLICKS THE SUBMIT BUTTON
Private Sub ClickNormalButton()
    GetCurrentWebForm.submit()
End Sub
Update:
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    WebBrowser1.Navigate("https://online.lloydsbank.co.uk/personal/logon/login.jsp?WT.ac=PLO0512")
    Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
    htmlDoc.LoadHtml(WebBrowser1.DocumentText)
    Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='frmLogin:strCustomerLogin_userID']")
    Dim labelText = ""
    If labelElement IsNot Nothing Then
        labelText = labelElement.InnerText
    End If
    MsgBox(labelText) ' <---- Comes out with nothing, i.e. ""
    MsgBox(labelElement.InnerText) ' <---- same as above
End Sub
First, look at this simple example:
Dim htmlString = "<form><label for='something'>text text</label></form>"
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
'load the string into the document before querying it
htmlDoc.LoadHtml(htmlString)
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If
The labelText variable now contains "text text".
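Since the question asks about all labels rather than a single one, the same idea can be extended with SelectNodes. The following is only a minimal sketch: the markup and the names LabelDump and labelNode are made up for illustration, and it assumes the HtmlAgilityPack assembly is referenced by a console project.

```vbnet
Imports System

Module LabelDump
    Sub Main()
        ' Hypothetical markup standing in for a real page.
        Dim htmlString As String = "<form>" &
            "<label for='user'>User ID</label>" &
            "<label for='pwd'>Password</label>" &
            "</form>"

        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.LoadHtml(htmlString)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim labels As HtmlAgilityPack.HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//label")
        If labels Is Nothing Then
            Console.WriteLine("No label elements found.")
            Return
        End If

        ' Print each label's for attribute next to its inner text.
        For Each labelNode As HtmlAgilityPack.HtmlNode In labels
            Dim target As String = labelNode.GetAttributeValue("for", "")
            Console.WriteLine(target & ": " & labelNode.InnerText.Trim())
        Next
    End Sub
End Module
```

The XPath //label matches every label element in the document, so no for value needs to be known in advance. Use labelNode.InnerHtml instead of InnerText if nested markup inside the label should be kept.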
Here is an example that uses WebClient to load the HTML from a given link:
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
Dim webClient As New System.Net.WebClient
Dim html As String = ""
'add your web page link here
html = webClient.DownloadString("http://yourlink.com/")
htmlDoc.LoadHtml(html)
'and here add your for attribute value for that label instead of something
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If
Update: Since you said you already have the page open in a WebBrowser control, use its DocumentText property to get the HTML text, as follows:
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
htmlDoc.LoadHtml(WebBrowser1.DocumentText)
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If
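**Update:** An example of how to get the HTML string from the WebBrowser control. Navigate is asynchronous, so DocumentText is still empty right after the call; read it in the DocumentCompleted event instead: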
Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        WebBrowser1.Navigate("https://www.google.com")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show(WebBrowser1.DocumentText)
    End Sub
End Class
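Putting the pieces together, the label lookup can be moved into the DocumentCompleted handler so that it runs only after the page has loaded. This is a rough sketch rather than a tested solution: it assumes a Windows Forms project that references HtmlAgilityPack and has a WebBrowser control named WebBrowser1 on the form, the URL is a placeholder, and the names labelNode and report are made up for illustration.

```vbnet
Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' Placeholder URL; substitute the page you want to read.
        WebBrowser1.Navigate("http://yourlink.com/")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Fully qualified, because System.Windows.Forms also defines an HtmlDocument type.
        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.LoadHtml(WebBrowser1.DocumentText)

        ' SelectNodes returns Nothing when the page has no label elements.
        Dim labels As HtmlAgilityPack.HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//label")
        If labels Is Nothing Then
            MessageBox.Show("No label elements found.")
            Return
        End If

        ' Collect "for value: inner text" for every label, one per line.
        Dim report As New System.Text.StringBuilder()
        For Each labelNode As HtmlAgilityPack.HtmlNode In labels
            report.AppendLine(labelNode.GetAttributeValue("for", "") & ": " & labelNode.InnerText.Trim())
        Next
        MessageBox.Show(report.ToString())
    End Sub
End Class
```

DocumentCompleted can fire more than once on pages that contain frames, so the message box may appear several times. Labels that a script adds after the page loads may not show up in DocumentText.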