Extracting text from an HTML table
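I am trying to extract various elements from this page:
http://partsurfer.hp.com/Search.aspx?searchText=4CE0460D0G
I want to start with ctl00_BodyContentPlaceHolder_lblSerialNumber.
If you know the ID, surely there is a simple way to extract the element you want from an HTML page? I assumed something like getElementsByName, getElementById, or even getElementsByTagName would work, but try as I might, I cannot get it to extract what I want.
This doesn't work: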
Function GetHPModelName()
    Dim ie As Object
    Dim Oelement As Object
    Dim Ohtml As New MSHTML.HTMLDocument
    Dim lrow As Integer
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://partsurfer.hp.com/Search.aspx?searchText=" & Worksheets("HP_Lookup").Range("A2").Value, False
        .send
        Ohtml.body.innerHTML = .responseText
    End With
    FetchHPInfo "ctl00_BodyContentPlaceHolder_lblSerialNumber", "A", Oelement, Ohtml
End Function
which calls:
Public Function FetchHPInfo(tablename As String, thiscolumn As String, Oelement As Object, Ohtml As MSHTML.HTMLDocument)
    lrow = 1
    For Each Oelement In Ohtml.getElementsById(tablename)
        Worksheets("HP_main").Range(thiscolumn & lrow).Value = Oelement.innerText
        lrow = lrow + 1
    Next Oelement
    Worksheets("HP_main").Columns(thiscolumn).cells.HorizontalAlignment = xlHAlignLeft
    Worksheets("HP_main").Columns(thiscolumn).AutoFit
End Function
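getElementById() should be all you need, since the node has an ID attribute. You are probably running into trouble because you are assigning responseText to the document body, but the document doesn't have a <body> node yet. Just use write() to write the entire response into the empty document. Here's an example I threw together that returns the correct value: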
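' Fetch the page synchronously (third argument False = not async)
Dim objHttp
Set objHttp = CreateObject("MSXML2.XMLHTTP")
objHttp.Open "GET", "http://partsurfer.hp.com/Search.aspx?searchText=4CE0460D0G", False
objHttp.Send

' Write the whole response into an empty HTML document so it gets parsed
Dim doc
Set doc = CreateObject("htmlfile")
doc.write objHttp.responseText

' Look the node up by its ID and show its text
MsgBox doc.getElementById("ctl00_BodyContentPlaceHolder_lblSerialNumber").innerText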
Output:
4CE0460D0G
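
To tie this back to the code in the question, here is a minimal sketch of a helper that wraps the same approach. The function name GetElementTextById and its parameters are hypothetical, not taken from the original post. Note that getElementById returns a single element rather than a collection, so the For Each loop in the question is not needed (and getElementsById, with an "s", is not a DOM method). The sketch assumes late binding, so no reference to the MSHTML library is required.

```vba
' Hypothetical helper: download a page and return the inner text of the
' element with the given ID, or an empty string if no such element exists.
Public Function GetElementTextById(ByVal url As String, ByVal elementId As String) As String
    Dim http As Object
    Dim doc As Object
    Dim node As Object

    ' Synchronous GET request
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False
    http.send

    ' Parse the full response by writing it into an empty document
    Set doc = CreateObject("htmlfile")
    doc.write http.responseText
    doc.Close

    ' getElementById yields one element (or Nothing), not a collection
    Set node = doc.getElementById(elementId)
    If node Is Nothing Then
        GetElementTextById = vbNullString
    Else
        GetElementTextById = node.innerText
    End If
End Function
```

Assuming the worksheet names from the question, it could be called like this:

```vba
Worksheets("HP_main").Range("A1").Value = GetElementTextById( _
    "http://partsurfer.hp.com/Search.aspx?searchText=" & Worksheets("HP_Lookup").Range("A2").Value, _
    "ctl00_BodyContentPlaceHolder_lblSerialNumber")
```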