Can't fetch the titles from a webpage
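I've written a script in VBA, in combination with IE, to fetch the titles of the different charts on a webpage, but I can't get them. It seems I've used the right class name and tag name to reach the content, but no dice. It doesn't throw any error either.
This is my current approach: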
Sub GetTitle()
    Const Url As String = "https://www.fbatoolkit.com/"
    Dim IE As New InternetExplorer, Html As HTMLDocument
    Dim post As Object

    With IE
        .Visible = True
        .navigate Url
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    Application.Wait Now + TimeValue("00:00:05")

    For Each post In Html.getElementsByClassName("chart")
        With post.getElementsByTagName("text")
            If .Length Then R = R + 1: Cells(R, 1) = .item(0).innerText
        End With
    Next post
End Sub
The titles look like the following and are visible above each chart:
Toys & Games
Health & Household
I'm not looking for any Selenium-based solution. Thanks.
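Honestly, this is a bit of a cheat. Treat it as a placeholder until I find a better approach, since I'm guessing you specifically want to access those titles.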
Option Explicit

Public Sub GetInfo()
    Dim ie As New InternetExplorer, html As HTMLDocument, titles(), i As Long
    With ie
        .Visible = True
        .navigate "https://www.fbatoolkit.com/"
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set html = .document
        ' Capture whatever follows "visualization" in each id attribute
        titles = GetTitles(html.body.innerHTML, "id=""visualization([^""]*)")
        For i = LBound(titles) To UBound(titles)
            Debug.Print titles(i)
        Next
        .Quit '<== Remember to quit application
    End With
End Sub

Public Function GetTitles(ByVal inputString As String, ByVal sPattern As String) As Variant
    Dim Matches As Object, iMatch As Object, arrMatches(), i As Long
    With CreateObject("vbscript.regexp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = sPattern
        If .test(inputString) Then
            Set Matches = .Execute(inputString)
            For Each iMatch In Matches
                If iMatch.SubMatches(0) <> vbNullString Then
                    ReDim Preserve arrMatches(i)
                    ' Underscores become spaces, a double space becomes " & ",
                    ' and Trim$ drops the space left by the leading underscore
                    arrMatches(i) = Trim$(Replace$(Replace$(iMatch.SubMatches(0), "_", " "), "  ", " & "))
                    i = i + 1
                End If
            Next iMatch
        End If
    End With
    GetTitles = arrMatches
End Function
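Although this answer is entirely influenced by QHarr's, I thought I'd post it for future readers. Using the IDs is the best strategy. The output of the following solution is almost identical to the category names.
Here it is: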
Sub GetChartInfo()
    Const Url As String = "https://www.fbatoolkit.com/"
    Dim IE As New InternetExplorer, Html As HTMLDocument
    Dim itemvisibility As Object, otitle As Object, I&

    With IE
        .Visible = False
        .navigate Url
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    ' Wait until at least one chart container has been rendered
    Do: Set itemvisibility = Html.querySelectorAll("div[class='chart-container']"): DoEvents: Loop While itemvisibility.Length = 0

    With Html.querySelectorAll("div[class='chart-container']")
        For I = 0 To .Length - 1
            ' Wait for the chart element inside this container
            Do: Set otitle = .Item(I).querySelector(".chart"): DoEvents: Loop While otitle Is Nothing
            ' Strip the "visualization_" prefix, turn underscores into spaces, then title-case
            Cells(I + 1, 1) = Application.WorksheetFunction.Proper(Replace(Replace(Split(otitle.getAttribute("id"), "visualization_")(1), "__", " "), "_", " "))
        Next I
    End With

    IE.Quit ' close the hidden IE instance so it does not linger in the background
End Sub
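Since the output above is only almost identical to the category names (the "&" is lost), here is a minimal sketch of a helper that might restore it. It assumes, based only on the ids used in the answers here, that an id looks like "visualization_toys__games" and that a double underscore stands for " & ". The function name IdToTitle and the sample ids are hypothetical, not taken from the site.

```vba
' Hypothetical helper: turns a chart id into a display title.
' Assumption: ids look like "visualization_toys__games", where "__" stands for " & ".
Public Function IdToTitle(ByVal chartId As String) As String
    Const PREFIX As String = "visualization_"
    Dim s As String

    s = chartId
    ' Drop the prefix when it is present
    If Left$(s, Len(PREFIX)) = PREFIX Then s = Mid$(s, Len(PREFIX) + 1)

    ' Replace the double underscore first, so it is not consumed as two single ones
    s = Replace$(s, "__", " & ")
    ' Any remaining single underscore is treated as a space
    s = Replace$(s, "_", " ")

    ' vbProperCase capitalises the first letter of every word
    IdToTitle = StrConv(s, vbProperCase)
End Function

Public Sub DemoIdToTitle()
    Debug.Print IdToTitle("visualization_toys__games")        ' prints Toys & Games
    Debug.Print IdToTitle("visualization_health__household")  ' prints Health & Household
End Sub
```

In GetChartInfo, the Cells(I + 1, 1) assignment could then become Cells(I + 1, 1) = IdToTitle(otitle.getAttribute("id")), provided the real ids follow the assumed pattern.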