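VBA WebScraping Data Showing Backwards in cell
I am pulling some data from the web and everything works, except for one set of data that shows up backwards in the cell once it is extracted.
I cannot work out why it is reversed, since everything else comes through fine.
Q) Could someone explain why this happens?
This is the method I use to extract the data. It works for everything else; only this class shows up backwards in Excel: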
Set doc = NewHTMLDocument(CStr(link))
' IF statement; change the class name to suit your needs ("bscd" here)
' On Error Resume Next
If doc.getElementsByClassName("bscd")(0) Is Nothing Then
    wsSheet.Cells(StartRow + Counter, 5).Value = "-"
Else
    ' On Error Resume Next
    wsSheet.Cells(StartRow + Counter, 5).Value = doc.getElementsByClassName("bscd")(0).Children(1).InnerText
End If
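This is the class:
The result shows up backwards in Excel:
Could "Complete information" be a JavaScript drop-down?
I just pressed Ctrl+U as suggested, and this is what the HTML looks like: it is backwards there as well, but it displays correctly on the website.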
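You need to click the link to get to the content. Here is one way to do it. The script uses an explicit wait instead of a hardcoded delay, so it waits up to 10 seconds for the content to become visible.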
Public driver As ChromeDriver
Sub ScrapeContent()
    ' Assumes the SeleniumBasic library is referenced (Tools > References > Selenium Type Library)
    Const URL$ = "https://www.ebay.co.uk/itm/Metal-Floor-Fan-High-velocity-chrome-free-stand-fan-industrial-fan-3-8-Speed-UK/333664038024"
    Dim oElem As Object, oItem As Object
    Set driver = New ChromeDriver
    driver.Get URL
    ' Click the "Complete information" link, waiting up to 10 seconds for it to appear
    driver.FindElementByXPath("//span/a[contains(.,'Complete information')]", Timeout:=10000).Click
    ' Read each value from the first span that follows its label
    Set oElem = driver.FindElementByXPath("//span[contains(.,'Phone:')]/following::span", Timeout:=10000)
    Set oItem = driver.FindElementByXPath("//span[contains(.,'Email:')]/following::span", Timeout:=10000)
    Debug.Print oElem.Text, oItem.Text
End Sub
Output:
13025438495 eshijiali@outlook.com
If you go with an xmlhttp request instead, the values may come back reversed. I used a function to put them back in the right order:
Function reverseString(inputStr As String) As String
    Dim myString$, I&
    ' Walk the input from its last character to its first, appending each one
    For I = Len(inputStr) To 1 Step -1
        myString = myString & Mid(inputStr, I, 1)
    Next I
    reverseString = myString
End Function
Sub FetchData()
    ' Assumes a reference to "Microsoft HTML Object Library" (Tools > References)
    Const Url$ = "https://www.ebay.co.uk/itm/Metal-Floor-Fan-High-velocity-chrome-free-stand-fan-industrial-fan-3-8-Speed-UK/333664038024"
    Dim HTML As New HTMLDocument, oPost As Object
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36"
        .send
        HTML.body.innerHTML = .responseText
    End With
    Set oPost = HTML.getElementsByClassName("bsi-lbl")
    ' VBA's And does not short-circuit, so only the Length check is kept here.
    ' The raw HTML holds these values reversed, so flip them back before printing.
    If oPost.Length >= 1 Then
        Debug.Print reverseString(oPost(0).NextSibling.innerText)
    End If
    If oPost.Length >= 2 Then
        Debug.Print reverseString(oPost(1).NextSibling.innerText)
    End If
End Sub
Output:
13025438495 eshijiali@outlook.com
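The hand-written loop in reverseString can also be swapped for VBA's built-in StrReverse function. The following is a minimal sketch of how the cell-writing code from the question might use it. UnreverseText and SellerFieldText are hypothetical helper names, and the "bscd" class and the Children(1) index are taken from the question and assumed to still match the page.

```vba
' Hypothetical helper: flips a reversed string back into reading order
' and trims any surrounding spaces left over from the markup.
Function UnreverseText(ByVal reversedText As String) As String
    ' StrReverse is built into VBA, so no character loop is needed
    UnreverseText = Trim$(StrReverse(reversedText))
End Function

' Hypothetical helper: returns the corrected text of the second child of the
' first element carrying the given class, or "-" when no such element exists.
Function SellerFieldText(ByVal doc As Object, ByVal className As String) As String
    Dim nodes As Object
    Set nodes = doc.getElementsByClassName(className)
    If nodes.Length = 0 Then
        SellerFieldText = "-"
    Else
        ' Assumption from the question: the value sits in Children(1)
        SellerFieldText = UnreverseText(nodes(0).Children(1).innerText)
    End If
End Function
```

With these in place, the If block in the question could shrink to a single line such as wsSheet.Cells(StartRow + Counter, 5).Value = SellerFieldText(doc, "bscd"). Only apply the reversal to fields that actually arrive backwards; fields that are already in the right order would be scrambled by it.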