Can't fetch the titles from a webpage

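I've written a script in VBA, in combination with IE, to fetch the different chart titles from a webpage, but I can't get them. It seems I've used the right class name and tag name to reach the content, but no dice. It doesn't throw any error either.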

This is my approach so far:

Sub GetTitle()
    Const Url As String = "https://www.fbatoolkit.com/"
    Dim IE As New InternetExplorer, Html As HTMLDocument
    Dim post As Object, R As Long

    With IE
        .Visible = True
        .navigate Url
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    Application.Wait Now + TimeValue("00:00:05")

    For Each post In Html.getElementsByClassName("chart")
        With post.getElementsByTagName("text")
          If .Length Then R = R + 1: Cells(R, 1) = .item(0).innerText
        End With
    Next post
End Sub

The titles look like the following and are visible above each chart:

Toys & Games
Health & Household

I'm not looking for any Selenium-related solution. Thanks.

Honestly, this is a bit of a cheat. Treat it as a placeholder until I find a better way, since I guess you specifically want to access those titles.

Option Explicit
' Requires references to Microsoft Internet Controls and Microsoft HTML Object Library.
Public Sub GetInfo()
    Dim ie As New InternetExplorer, html As HTMLDocument, titles(), i As Long
    With ie
        .Visible = True
        .navigate "https://www.fbatoolkit.com/"
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set html = .document
        titles = GetTitles(html.body.innerHTML, "id=""visualization([^""]*)")
        For i = LBound(titles) To UBound(titles)
            Debug.Print titles(i)
        Next
        .Quit '<== Remember to quit application
    End With
End Sub

Public Function GetTitles(ByVal inputString As String, ByVal sPattern As String) As Variant
    Dim Matches As Object, iMatch As Object, arrMatches(), i As Long
    With CreateObject("vbscript.regexp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = sPattern
        If .test(inputString) Then
            Set Matches = .Execute(inputString)
            For Each iMatch In Matches
                If iMatch.SubMatches(0) <> vbNullString Then
                    ReDim Preserve arrMatches(i)
                    ' Underscores become spaces; the double space left by "__" then becomes " & "
                    arrMatches(i) = Replace$(Replace$(iMatch.SubMatches(0), Chr$(95), Chr$(32)), Chr$(32) & Chr$(32), Chr$(32) & Chr$(38) & Chr$(32))
                    i = i + 1
                End If
            Next iMatch
        End If
    End With
    GetTitles = arrMatches
End Function
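
To see what a pattern like this captures without launching IE, here is a minimal sketch that runs a similar regex against a hard-coded string. The sample markup and the `visualization_toys__games` style ids are assumptions inferred from the code above, not copied from the live page, and `ExtractIdSuffixes` and `DemoIdPattern` are hypothetical names. Unlike the pattern above, this one also consumes the underscore after `visualization`, so the captured text has no leading separator.

```vba
Option Explicit

' Hypothetical helper: returns the text that follows "visualization_"
' in each id attribute found in the given markup.
Public Function ExtractIdSuffixes(ByVal markup As String) As Collection
    Dim result As New Collection, m As Object
    With CreateObject("VBScript.RegExp")
        .Global = True          ' find every match, not just the first
        .IgnoreCase = True
        .Pattern = "id=""visualization_([^""]*)"""
        For Each m In .Execute(markup)
            result.Add m.SubMatches(0)
        Next m
    End With
    Set ExtractIdSuffixes = result
End Function

Public Sub DemoIdPattern()
    Dim sample As String, suffix As Variant
    ' Assumed markup; the real page may be structured differently.
    sample = "<div id=""visualization_toys__games"" class=""chart""></div>" & _
             "<div id=""visualization_health__household"" class=""chart""></div>"
    For Each suffix In ExtractIdSuffixes(sample)
        Debug.Print suffix
    Next suffix
End Sub
```

Running `DemoIdPattern` prints the two captured suffixes, `toys__games` and `health__household`, to the Immediate window. Returning a `Collection` also avoids the unallocated-array case when nothing matches, since an empty collection can still be iterated safely.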

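Although this answer is entirely influenced by QHarr's, I'd like to post it for future readers. Using the ids is the best strategy. The following solution yields titles that closely resemble the category names.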

Here it is:

Sub GetChartInfo()
    Const Url As String = "https://www.fbatoolkit.com/"
    Dim IE As New InternetExplorer, Html As HTMLDocument
    Dim itemvisibility As Object, otitle As Object, I&

    With IE
        .Visible = False
        .navigate Url
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    ' Poll until at least one chart container has been rendered
    Do: Set itemvisibility = Html.querySelectorAll("div[class='chart-container']"): DoEvents: Loop While itemvisibility.Length = 0

    With Html.querySelectorAll("div[class='chart-container']")
        For I = 0 To .Length - 1
            Do: Set otitle = .Item(I).querySelector(".chart"): DoEvents: Loop While otitle Is Nothing
            Cells(I + 1, 1) = Application.WorksheetFunction.Proper(Replace(Replace(Split(otitle.getAttribute("id"), "visualization_")(1), "__", " "), "_", " "))
        Next I
    End With
    IE.Quit ' close the hidden IE instance
End Sub
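
The id-to-title conversion in the routine above can also be pulled out into a standalone function, which makes it easy to check in the Immediate window without opening IE. This is only a sketch: `TitleFromChartId` is a hypothetical name, and it assumes the ids look like `visualization_toys__games`, with a double underscore standing for " & " and a single underscore for a space. That mapping is inferred from the two answers, not confirmed against the page.

```vba
Option Explicit

' Hypothetical helper: turns an id such as "visualization_toys__games"
' into "Toys & Games". Returns an empty string for ids without the prefix.
Public Function TitleFromChartId(ByVal chartId As String) As String
    Const Prefix As String = "visualization_"
    Dim raw As String

    ' Bail out unless the id starts with the expected prefix
    If InStr(1, chartId, Prefix, vbTextCompare) <> 1 Then Exit Function

    raw = Mid$(chartId, Len(Prefix) + 1)    ' drop the prefix
    raw = Replace(raw, "__", " & ")         ' assumed: "__" encodes " & "
    raw = Replace(raw, "_", " ")            ' remaining "_" are plain spaces
    TitleFromChartId = StrConv(raw, vbProperCase)
End Function
```

For example, `?TitleFromChartId("visualization_health__household")` in the Immediate window gives `Health & Household`. Inside the loop, it could be called as `Cells(I + 1, 1) = TitleFromChartId(otitle.getAttribute("id"))`.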