Using HtmlAgilityPack in VB.Net to Get Text from a Website
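I'm writing a program for my girlfriend: when she opens it, it should automatically gather her horoscope from a horoscope website and display that line of text in a TextBox.
What I have right now basically displays the HTML of the entire page, which is not what I want. This is the HTML element I need to grab: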
<div class="fontdef1" style="padding-right:10px;" id="textline">
"You might have the desire for travel, perhaps to visit a friend who lives far away, Gemini. You may actually set the wheels in motion to make it happen. Social events could take up your time this evening, and you could meet some interesting people. A friend might need a sympathetic ear. Today you're especially sensitive to others, so be prepared to hear a sad story. Otherwise, your day should go well.
</div>
My current code is:
    Imports System.Net
    Imports System.IO
    Imports HtmlAgilityPack

    Public Class Form1
        Private Function getHTML(ByVal Address As String) As String
            Dim rt As String = ""
            Dim wRequest As WebRequest
            Dim wResponse As WebResponse
            Dim SR As StreamReader
            wRequest = WebRequest.Create(Address)
            wResponse = wRequest.GetResponse
            SR = New StreamReader(wResponse.GetResponseStream)
            rt = SR.ReadToEnd
            SR.Close()
            Return rt
        End Function

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            Label2.Text = Date.Now.ToString("MM/dd/yyyy")
            TextBox1.Text = getHTML("http://my.horoscope.com/astrology/free-daily-horoscope-gemini.html")
        End Sub
    End Class
Thanks for any help you can provide. Honestly, I don't know where to take this program from here. It's been 3 days with no progress.
Learn XPath or LINQ to extract specific information from an HTML document with HtmlAgilityPack. Here is an example console application that uses an XPath selector:
    Imports System
    Imports System.Xml
    Imports HtmlAgilityPack

    Public Module Module1
        Public Sub Main()
            Dim link As String = "http://my.horoscope.com/astrology/free-daily-horoscope-gemini.html"
            'download the page from the link into an HtmlDocument'
            Dim doc As HtmlDocument = New HtmlWeb().Load(link)
            'select the <div> whose class attribute equals fontdef1'
            Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='fontdef1']")
            'if the div is found, print its inner text'
            If div IsNot Nothing Then
                Console.WriteLine(div.InnerText.Trim())
            End If
        End Sub
    End Module
Output:
You might have the desire for travel, perhaps to visit a friend who lives far away, Gemini. You may actually set the wheels in motion to make it happen. Social events could take up your time this evening, and you could meet some interesting people. A friend might need a sympathetic ear. Today you're especially sensitive to others, so be prepared to hear a sad story. Otherwise, your day should go well.
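To wire the same idea into the form from the question, one option is a small helper that parses the HTML string `getHTML` already returns. This is only a sketch under a few assumptions: the module name `HoroscopeParser` and the function name `ExtractHoroscope` are made up for illustration, and the selector uses the `id="textline"` attribute from the HTML snippet in the question rather than the class, on the guess that the id is the more specific hook. If the site changes its markup, the selector would need to change too.

```vbnet
Imports HtmlAgilityPack

Public Module HoroscopeParser
    ' Hypothetical helper (not part of HtmlAgilityPack): returns the text of the
    ' <div id="textline"> element, or an empty string if no such element exists.
    Public Function ExtractHoroscope(ByVal html As String) As String
        Dim doc As New HtmlDocument()
        ' Parse the HTML string that was already downloaded
        doc.LoadHtml(html)

        ' Select the <div> by its id attribute (taken from the question's snippet)
        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@id='textline']")
        If div Is Nothing Then
            Return String.Empty
        End If

        ' Decode HTML entities such as &amp; and trim surrounding whitespace
        Return HtmlEntity.DeEntitize(div.InnerText).Trim()
    End Function
End Module
```

With that in place, the line in `Form1_Load` could become `TextBox1.Text = ExtractHoroscope(getHTML("http://my.horoscope.com/astrology/free-daily-horoscope-gemini.html"))`, so the TextBox shows only the horoscope text instead of the whole page.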