Grab data from a webpage with Excel VBA when an element contains multiple innerText values
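I am trying to get some data from a webpage and have partly succeeded. However, my HTML and JavaScript knowledge is not the best. I can grab the data and fill it into my sheet, but I would like to split the data up more if possible.
Here is my code: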
    Sub get_data_2()
        'Source for this code is:
        '
        Dim sht As Worksheet
        Dim SKU As String
        Dim RowCount As Long
        Dim ie As Object   ' late-bound Internet Explorer instance

        Set sht = Sheet8
        Set ie = CreateObject("InternetExplorer.Application")
        RowCount = 1

        'This just gives the columns a title in row number 1.
        sht.Range("a" & RowCount) = "SKU"
        sht.Range("b" & RowCount) = "Own titel"
        sht.Range("c" & RowCount) = "EMO titel"
        sht.Range("d" & RowCount) = "Product info"
        sht.Range("e" & RowCount) = "Weight"
        sht.Range("f" & RowCount) = "Volum"
        sht.Range("g" & RowCount) = "EAN"
        sht.Range("h" & RowCount) = "Originalnumber"
        sht.Range("i" & RowCount) = "Price"
        sht.Range("j" & RowCount) = "Stock"
        sht.Range("k" & RowCount) = "Units"

        'Look up one product page per SKU in column A, starting at row 2.
        Do
            RowCount = RowCount + 1
            SKU = sht.Range("a" & RowCount).Value ' **SKU is 491215 in this example**
            With ie
                .Visible = False
                .navigate "https://www.emo.no/web/ePortal/ctrl?action=showiteminfo&itemNo=" & SKU

                'Wait until the page has finished loading.
                Do While .Busy Or .readyState <> 4
                    DoEvents
                Loop

                sht.Range("c" & RowCount).Value = .document.getElementById("itemDetail_heading").innerText
                sht.Range("d" & RowCount).Value = .document.getElementById("itemDetail_textBox").innerText
                sht.Range("e" & RowCount).Value = .document.getElementById("itemDetail_technicalDataBox").innerText
                sht.Range("j" & RowCount).Value = .document.getElementById("itemDetail_deliveryBox").innerText
                sht.Range("k" & RowCount).Value = .document.getElementById("itemDetail_unitsbox").innerText
            End With
        Loop While sht.Range("a" & RowCount + 1).Value <> ""

        ie.Quit            ' close the hidden browser window
        Set ie = Nothing
    End Sub
The HTML source of the webpage (excerpt) looks like this:
    <div id="itemDetail_container">
      <div id="itemDetail_heading">
        <div class="xxLarge extraBold">Papir ubleket kraft 60g 40cm 5kg/rull</div>
        <div class="item_itemNumberBox">
          <span class="darkGray medium">Varenr : 491215</span>
        </div>
      </div>
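I only want the text "Papir ubleket kraft 60g 40cm 5kg/rull" to appear in my Excel sheet, but I also get "Varenr : 491215". The same goes for the other columns. I tried to post a picture of the scraped result, but was not allowed to. You can run the code and see for yourself, or I can email you a screenshot.
How can I put the data into separate columns?
Thanks a lot for your help! :-)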
To get "Papir ubleket kraft 60g 40cm 5kg/rull", change this:
    .document.getElementById("itemDetail_heading").innerText
to:
    .document.getElementById("itemDetail_heading").getElementsByTagName("div")(0).innerText
or (less specific):
    .document.getElementById("itemDetail_heading").firstChild.innerText
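One caveat with the `firstChild` variant: depending on the document mode Internet Explorer uses for the page, the first child of the heading may be a whitespace text node rather than the inner `<div>`, and a text node has no `innerText`. The sketch below is one way to guard against that. `FirstElementChild` is a hypothetical helper name, not something from the original code; it walks the siblings until it finds an element node (`nodeType = 1`).

```vba
' Hypothetical helper: returns the first child of parent that is an
' element node, skipping text and comment nodes. Returns Nothing when
' parent has no element children.
Function FirstElementChild(ByVal parent As Object) As Object
    Dim node As Object

    Set node = parent.firstChild
    Do While Not node Is Nothing
        If node.nodeType = 1 Then        ' 1 = element node
            Set FirstElementChild = node
            Exit Function
        End If
        Set node = node.nextSibling
    Loop
End Function
```

Inside the `With ie` block this could be used as `FirstElementChild(.document.getElementById("itemDetail_heading")).innerText`, assuming the heading element exists on the page.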
To get "Varenr : 491215":
    .document.getElementById("itemDetail_heading").getElementsByTagName("span")(0).innerText
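The two selectors can be tried without hitting the website by loading the HTML excerpt from the question into an in-memory document. This is only a sketch under a few assumptions: `ChildText` and `DemoSplitHeading` are hypothetical names introduced here, the `htmlfile` object is assumed to be available (it ships with Windows), and the live page is assumed to have the same structure as the excerpt.

```vba
Option Explicit

' Hypothetical helper: returns the innerText of the index-th <tagName>
' descendant of parent, or "" when parent or that descendant is missing.
Function ChildText(ByVal parent As Object, ByVal tagName As String, _
                   Optional ByVal index As Long = 0) As String
    Dim nodes As Object

    If parent Is Nothing Then Exit Function
    Set nodes = parent.getElementsByTagName(tagName)
    If index < nodes.Length Then ChildText = nodes.Item(index).innerText
End Function

Sub DemoSplitHeading()
    Dim doc As Object
    Dim heading As Object

    ' Build an in-memory HTML document from the excerpt in the question.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = _
        "<div id=""itemDetail_heading"">" & _
        "<div class=""xxLarge extraBold"">Papir ubleket kraft 60g 40cm 5kg/rull</div>" & _
        "<div class=""item_itemNumberBox"">" & _
        "<span class=""darkGray medium"">Varenr : 491215</span>" & _
        "</div></div>"

    Set heading = doc.getElementById("itemDetail_heading")

    ' First <div> inside the heading: the product title only.
    Debug.Print ChildText(heading, "div")
    ' First <span> inside the heading: the item number line only.
    Debug.Print ChildText(heading, "span")
End Sub
```

In the scraping loop, the same helper could fill separate columns, for example `sht.Range("c" & RowCount).Value = ChildText(.document.getElementById("itemDetail_heading"), "div")`. Returning an empty string for a missing element keeps the loop running when a SKU page lacks one of the boxes, at the cost of silently leaving that cell blank.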