VBA web scraping: data showing backwards in cell

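I'm pulling some data from the web and everything works, except for one set of data that shows up backwards in the cell when it is extracted.

I can't see why it shows backwards, since everything else is fine.

Q) Could someone explain why this happens?

This is the method I use to extract the data. It works for everything else; only this class shows backwards in Excel:
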
        Set doc = NewHTMLDocument(CStr(link))
        ' IF statement; change the class to suit your needs ("bscd")
        ' On Error Resume Next
        If doc.getElementsByClassName("bscd")(0) Is Nothing Then
            wsSheet.Cells(StartRow + Counter, 5).Value = "-"
        Else
            ' On Error Resume Next
            wsSheet.Cells(StartRow + Counter, 5).Value = doc.getElementsByClassName("bscd")(0).Children(1).InnerText
        End If

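This is the class:

Result showing backwards in Excel:

Could "Complete information" be a JavaScript dropdown?

I just pressed Ctrl+U as suggested, and this is what the HTML looks like: the text is backwards here, but it displays correctly on the website.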

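You need to click the link to reach the content. Here is one way to do it. I used an explicit wait in the script instead of a hard-coded delay, so the script waits up to 10 seconds for the content to become visible.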

Public driver As ChromeDriver

Sub ScrapeContent()
    Const URL$ = "https://www.ebay.co.uk/itm/Metal-Floor-Fan-High-velocity-chrome-free-stand-fan-industrial-fan-3-8-Speed-UK/333664038024"
    Dim oElem As Object, oItem As Object
    Set driver = New ChromeDriver
    driver.Get URL

    ' Open the "Complete information" section, waiting up to 10 s for the link.
    driver.FindElementByXPath("//span/a[contains(.,'Complete information')]", Timeout:=10000).Click
    ' Each value is the span that follows its label.
    Set oElem = driver.FindElementByXPath("//span[contains(.,'Phone:')]/following::span", Timeout:=10000)
    Set oItem = driver.FindElementByXPath("//span[contains(.,'Email:')]/following::span", Timeout:=10000)

    Debug.Print oElem.Text, oItem.Text
End Sub

Output:

13025438495   eshijiali@outlook.com

If you use an XMLHTTP request instead, the values you get back may be reversed. I used a function to put them back in the right order:

Function reverseString(ByVal inputStr As String) As String
    Dim myString$, I&

    ' Walk the input from the last character to the first.
    For I = Len(inputStr) To 1 Step -1
        myString = myString & Mid(inputStr, I, 1)
    Next I

    reverseString = myString
End Function

Sub FetchData()
    Const Url$ = "https://www.ebay.co.uk/itm/Metal-Floor-Fan-High-velocity-chrome-free-stand-fan-industrial-fan-3-8-Speed-UK/333664038024"
    Dim HTML As New HTMLDocument, oPost As Object

    ' Fetch the page source with a synchronous GET request.
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36"
        .send
        HTML.body.innerHTML = .responseText
    End With

    Set oPost = HTML.getElementsByClassName("bsi-lbl")

    ' VBA's And does not short-circuit, so only Length is tested here;
    ' the collection is empty, not Nothing, when no element matches.
    If oPost.Length >= 1 Then
        Debug.Print reverseString(oPost(0).NextSibling.innerText)
    End If

    If oPost.Length >= 2 Then
        Debug.Print reverseString(oPost(1).NextSibling.innerText)
    End If
End Sub

Output:

13025438495   eshijiali@outlook.com
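
The same idea can be folded into the worksheet code from the question. What follows is only a minimal sketch, assuming the reversed text sits in the second child of the first `bscd` element, as in the question's snippet. `SellerFieldText` is a hypothetical helper name, not something from the posts above, and it swaps the hand-written loop for VBA's built-in `StrReverse`. It also assumes the project references the Microsoft HTML Object Library.

```vba
' Hypothetical helper (not from the original posts): returns the text of the
' second child of the first element with the given class, un-reversed.
' Assumes a reference to "Microsoft HTML Object Library" (MSHTML).
Function SellerFieldText(ByVal doc As MSHTML.HTMLDocument, _
                         ByVal className As String) As String
    Dim nodes As Object
    Set nodes = doc.getElementsByClassName(className)

    ' Fall back to "-" when the class or its second child is missing.
    SellerFieldText = "-"
    If nodes.Length = 0 Then Exit Function
    If nodes(0).Children.Length < 2 Then Exit Function

    ' StrReverse is built into VBA and flips the character order.
    SellerFieldText = StrReverse(nodes(0).Children(1).innerText)
End Function
```

With this in place, the If/Else block in the question would shrink to a single assignment such as `wsSheet.Cells(StartRow + Counter, 5).Value = SellerFieldText(doc, "bscd")`. Only fields that actually arrive reversed should go through it, since it reverses whatever text it finds.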