Adjusting a function which pulls a table from a web page to pull only a single element + repeat
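I have a weird project to complete.
Essentially, I have a tool that creates an extensive spreadsheet of the entire inventory of a specific section of a warehouse. It lists each item's location, inventory status, and item ID ("ASIN", which is basically the barcode and virtual reference in the internal system). The problem is that it doesn't list the "velocity" of these items (a metric of how many we sell in a week). I want to print that metric next to each item ID so I can sort out the things that aren't selling and send them to the long-term storage section of the warehouse. I've found another tool which pulls table information about a single item ID from our internal wiki ("FCresearch"), and that table happens to contain this metric. I only want to pull the velocity of an item from this table (basically the number at this location on the web page):
/html/body/div[2]/div/div[1]/div/div[1]/div/div[2]/div/div/div[2]/table/tbody/tr[19]/td
Then I want to adjust this macro so it acts on the ASINs in the table created by the first tool, prints the velocity to the adjacent cell, moves down a row, and repeats for all ~4000 entries until it reaches an empty cell.
Here is the full function in question: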
Sub getFCresearch()
    Dim A As Object, H As Object, D As Object, C As Object, asin$, B$, F$
    Dim x&, t&
    Set C = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
    Set D = CreateObject("HTMLFile")
    Set A = CreateObject("New:{00000566-0000-0010-8000-00AA006D2EA4}")
    Set H = CreateObject("WinHTTP.WinHTTPRequest.5.1")
    H.SetAutoLogonPolicy 0
    ''passes badge
    H.Open "GET", "https://hrwfs.amazon.com/?Operation=empInfoByUid&ContentType=JSON&employeeUid=" & Environ("USERNAME")
    H.send
    DoEvents
    B = Split(Split(H.ResponseText, "employeeBarcode"":""")(1), Chr(34))(0)
    H.Open "POST", "http://fcmenu-iad-regionalized.corp.amazon.com/do/login"
    H.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    H.setRequestHeader "Content-Length", Len("badgeBarcodeId=" & B)
    H.send "badgeBarcodeId=" & B
    DoEvents
    H.Open "GET", "http://fcmenu-iad-regionalized.corp.amazon.com/" & F
    H.send
    DoEvents
    ''Needs to derive "asin" variable from adjacent cell
    asin = Sheets("Sheet1").[A1]
    ''This gathers the specific item's page on the wiki "FCresearch"
    H.Open "GET", "http://fcresearch-na.aka.amazon.com/DEN3/results/inventory?s=" & asin, False
    H.send
    '''This gets the whole table,where I only need one specific element called "velocity" at: /html/body/div[2]/div/div[1]/div/div[1]/div/div[2]/div/div/div[2]/table/tbody/tr[19]/td
    D.body.InnerHTML = H.ResponseText
    C.SetText D.GetElementById("table-inventory").OuterHTML
    C.PutInClipboard
    ''This pastes the table to a different sheet, but needs to paste to a cell adjacent to the "asin" variable of each row
    ''Before moving down to the next row and repeating the process
    Sheet2.[C:Z].Cells.ClearContents
    Sheet2.[C1].PasteSpecial
    Sheet2.[C:N].WrapText = False
    Sheet2.Columns("C:N").AutoFit
End Sub
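Any help you can provide would be great. Sorry this is such a broad request. I'm pretty new to this and can only tweak small details of the code, and I can't find help documentation anywhere that goes deeper than the .GetElementById function, which doesn't work for HTML elements that have no ID.
Table HTML (plaintext):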
<table data-row-id="1579657885" class="a-keyvalue"><tbody><tr><th>ASIN</th><td><a href="/DEN3/results?s=1579657885">1579657885</a></td></tr><tr><th>Title</th><td><a target="_blank" href="http://www.amazon.com/gp/product/1579657885">1,000 Places to See Before You Die (Deluxe Edition): The World as You've Never Seen It Before</a></td></tr><tr><th>Binding</th><td>Hardcover</td></tr><tr><th>Publisher</th><td></td></tr><tr><th>Vendor Code</th><td>ATSAN</td></tr><tr><th>Weight</th><td>6.45 pounds</td></tr><tr><th>Dimensions</th><td>1.50 x 13.00 x 9.80 IN</td></tr><tr><th>List Price</th><td>USD 50.00</td></tr><tr><th>Expiration Date</th><td class=""></td></tr><tr><th>Asin Demand</th><td><a target="_blank" href="https://ufo.amazon.com/srw14na/asins/place_in_line/1579657885?warehouse=DEN3">Demand for 1579657885</a></td></tr><tr><th>Sortable</th><td>true</td></tr><tr><th>Conveyable</th><td>true</td></tr><tr><th>Very High Value</th><td>false</td></tr><tr><th>Master Case</th><td>false</td></tr><tr><th>FCSku Scope</th><td>FNSKU</td></tr><tr><th>Sales Forecast</th><td>4.0</td></tr><tr><th>Sales History (approx)</th><td>5.0</td></tr><tr><th>Sales Override</th><td>0.0</td></tr><tr><th>ASIN Velocity (approx)</th><td>5.0</td></tr><tr><th>Provenance Value</th><td>UNTRACKED</td></tr><tr><th>Provenance IOG</th><td>Info Not Found</td></tr></tbody></table>
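OK, here are two ways of getting the info you need. If you understand the logic, I believe any combination of these approaches should be enough to adjust the code to your needs.
For the sake of simplicity, I will assume the HTML is already loaded into an HTMLDocument object named D. For demonstration purposes, the value of interest is printed in the Immediate window.
First, you need to add a reference to the Microsoft HTML Object Library (VBE>Tools>References>...).
I will be using the following variables: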
Dim table As HTMLTable
Dim tableOfInterest As HTMLTable
Dim row As HTMLTableRow
Dim rowOfInterest As HTMLTableRow
Dim cell As HTMLTableCell 'not using it but you could in a For-Each
Dim cellOfInterest As HTMLTableCell
Assuming the index of the table, the index of the row and the index of the cell are always the same and you know them:
Set tableOfInterest = D.getElementsByTagName("table")(0) 'Assuming the table of interest is the first table to appear in the HTML document. Keep in mind indexing starts at zero!
Set rowOfInterest = tableOfInterest.getElementsByTagName("tr")(18) 'Assuming the row of interest is the 19th row in the table.
Set cellOfInterest = rowOfInterest.getElementsByTagName("td")(0) 'Assuming the cell of interest is the 1st cell in the row.
Debug.Print cellOfInterest.innerText
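Assuming you don't know the indexes of the table and the row explicitly, but you do know other information such as an attribute or the inner text:
For Each table In D.getElementsByTagName("table")
    'getAttribute returns Null when the attribute is absent, so tables without it are simply skipped.
    'In the sample HTML this attribute equals the ASIN, so compare against the ASIN you requested.
    If table.getAttribute("data-row-id") = "1579657885" Then
        Set tableOfInterest = table
        Exit For 'stop at the first match
    End If
Next table
For Each row In tableOfInterest.getElementsByTagName("tr")
    If row.innerText Like "*ASIN Velocity (approx)*" Then 'assuming that's the text you're looking for
        Set rowOfInterest = row
        Exit For 'stop at the first match
    End If
Next row
Debug.Print rowOfInterest.Cells(1).innerText 'in this case the "th" element is also considered a cell so the cell you're interested in is the 2nd one.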
Another method to use when the ID is missing is .getElementsByClassName(). It follows the same logic as .getElementsByTagName().
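To tie this back to the "+ repeat" part of the question, here is a minimal sketch of how the label-based lookup could be wrapped in a loop over the ASIN column. It is not a tested implementation: the helper name GetValueByLabel and the sub name FillVelocities are hypothetical, and it assumes the ASINs sit in column A of Sheet1 starting at A1, that column B is free for the result, and that the badge login from getFCresearch() only needs to run once on the same request object.

```vba
' Hypothetical helper: returns the text of the second cell in the first row
' whose first cell contains rowLabel, or "" if no such row exists.
Function GetValueByLabel(doc As Object, rowLabel As String) As String
    Dim row As Object
    For Each row In doc.getElementsByTagName("tr")
        If row.Cells.Length > 1 Then
            If InStr(1, row.Cells(0).innerText, rowLabel, vbTextCompare) > 0 Then
                GetValueByLabel = row.Cells(1).innerText
                Exit Function
            End If
        End If
    Next row
End Function

Sub FillVelocities()
    Dim H As Object, D As Object
    Dim ws As Worksheet, r As Long

    Set H = CreateObject("WinHTTP.WinHTTPRequest.5.1")
    H.SetAutoLogonPolicy 0
    ' Assumption: the badge login requests from getFCresearch() go here, once.

    Set ws = Sheets("Sheet1")
    r = 1 ' assumption: the first ASIN is in A1

    ' Walk down column A until the first empty cell.
    Do While Len(Trim$(CStr(ws.Cells(r, 1).Value))) > 0
        H.Open "GET", "http://fcresearch-na.aka.amazon.com/DEN3/results/inventory?s=" & ws.Cells(r, 1).Value, False
        H.send

        ' Use a fresh document for every page so rows from earlier items cannot leak in.
        Set D = CreateObject("HTMLFile")
        D.body.innerHTML = H.responseText

        ' Write the velocity next to the ASIN (column B is an assumption).
        ws.Cells(r, 2).Value = GetValueByLabel(D, "ASIN Velocity")

        r = r + 1
        DoEvents ' keep Excel responsive over ~4000 requests
    Loop
End Sub
```

Matching on the row label instead of the row index means the lookup should keep working if rows are added or reordered in the table. A cell left empty in column B would indicate that no "ASIN Velocity" row was found for that ASIN.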