HTML 从上一个网页而不是重定向网页填充的元素集合 VBA
HTML Element Collection filled from Previous Webpage Rather than Redirected Webpage VBA
下面的代码导航到一个网页,用查询填充搜索框,然后提交到结果页面。但是,脚本中的最终元素集合 tdtags 是在重定向之后定义的,它从原始搜索页面而不是结果页面中提取数据。我目前在脚本中有 while ie.busy 循环和定时延迟,两者都不起作用。我也尝试过等到只出现在结果页面中的元素在 html 中变得可用,但这也不起作用。
Dim twb As Workbook
Dim ie As Object
Set twb = ThisWorkbook
twb.Activate
Set ie = CreateObject("internetexplorer.application")
'church = Sheets("Control").Range("A2").Value
'minister = Sheets("Control").Range("A4").Value
location = "London" 'Sheets("Control").Range("A6").Value
'denomination = Sheets("Control").Range("A8").Value
With ie
.navigate "http://www.ukchurch.org/index.php"
.Visible = True
Do While .Busy Or .ReadyState <> 4
DoEvents
Loop
End With
Application.Wait (Now + TimeValue("00:00:02"))
Set intags = ie.document.getelementsbytagname("input")
For Each intag In intags
If intag.getattribute("name") = "name" Then
If church <> "" Then
intag.Value = church
End If
ElseIf intag.getattribute("name") = "minister" Then
If minister <> "" Then
intag.Value = minister
End If
ElseIf intag.getattribute("name") = "location" Then
If location <> "" Then
intag.Value = location
End If
Else
End If
Next intag
Set dropopt = ie.document.getelementsbytagname("select")
For Each dropo In dropopt
If dropo.classname = "DenominationDropDown" Then
Set opttags = dropo.getelementsbytagname("option")
For Each opt In opttags
If opt.innertext = denomination Then
opt.Selected = True
End If
Next opt
End If
Next dropo
On Error Resume Next
For Each intag In intags
If intag.getattribute("src") = "images/ukchurch/button-go.jpg" Then
intag.Click
Do While ie.Busy Or ie.ReadyState <> 4
DoEvents
Loop
Application.Wait (Now + TimeValue("00:00:03"))
Exit For
End If
Next intag
Application.Wait (Now + TimeValue("00:00:03"))
Set tdtags = ie.document.getelementsbytagname("td")
For Each td In tdtags
If td.classname = "pText" Then
Debug.Print td.innertext
Debug.Print ie.locationURL
pagecount = Right(td.innertext, InStr(td.innertext, ":"))
End If
Next td
Debug.Print pagecount
End Sub
任何诊断将不胜感激。
自动化 IE 很痛苦,所以要避免它。
以下函数直接请求结果页面。
Public Function GetSearchResult(Optional ByVal ResultPage As Integer = 0, Optional ByVal ChurchName As String = "", Optional ByVal Minister As String = "", Optional ByVal ChurchLocation As String = "", Optional ByVal Denomination As String = "") As Object
Dim Request As Object: Set Request = CreateObject("MSXML2.serverXMLHTTP")
Dim Result As Object: Set Result = CreateObject("htmlfile")
Request.Open "POST", "http://www.ukchurch.org/searchresults1.php", False
Request.setRequestHeader "content-type", "application/x-www-form-urlencoded"
Request.send IIf(ResultPage = 0, "", "page=" & ResultPage & "&") & "name=" & ChurchName & "&minister=" & Minister & "&location=" & ChurchLocation & "&denomination=" & Denomination
Result.body.innerHTML = Request.responseText
Set GetSearchResult = Result
End Function
一个示例,它在包含搜索结果
的table 中打印具有类名pText
的td
的内容
Sub Main()
Dim Document As Object
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows as Object
Dim ResultRow As Object
Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
For Each ResultRow in ResultRows
If ResultRow.Classname = "pText" Then
Debug.print ResultRow.innerText
End If
Next
End Sub
更新
您需要向 VBA 项目添加几个引用,才能使以下代码正常工作。
在 VBA 编辑器中,转到“工具”菜单,单击“引用”,然后在打开的对话框中,在以下两项旁边添加一个勾:Microsoft XML, v6.0
和 Microsoft HTML Object Library
(
Public Function GetChurchDetails(ByVal ChurchID As String) As MSHTML.HTMLDocument
Dim Request As New MSXML2.ServerXMLHTTP60
Dim Result As New MSHTML.HTMLDocument
Request.Open "GET", "http://www.ukchurch.org/churchdetails.php?churchid=" & ChurchID, False
Request.send
Result.body.innerHTML = Request.responseText
Set GetChurchDetails = Result
End Function
Sub Main2()
Dim Document As MSHTML.HTMLDocument
Dim Church As MSHTML.HTMLDocument
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows As MSHTML.IHTMLElementCollection
Dim ResultRow As MSHTML.IHTMLElement
Dim ChurchID As String
'Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
' all result links on searchresults1.php have a classname of resultslink which makes getting them much easier
Set ResultRows = Document.getElementsByClassName("resultslink")
For Each ResultRow In ResultRows
ChurchID = ResultRow.getAttribute("href")
ChurchID = Mid(ChurchID, InStr(1, ChurchID, "=") + 1)
Set Church = GetChurchDetails(ChurchID)
' code to read data from the page using Church as the Document
' eg: Church.getElemenetsByTagName("td").....
Next
End Sub
提交数据时只需要使用"post"模式,其他都可以使用"get"
下面的代码导航到一个网页,用查询填充搜索框,然后提交到结果页面。但是,脚本中的最终元素集合 tdtags 是在重定向之后定义的,它从原始搜索页面而不是结果页面中提取数据。我目前在脚本中有 while ie.busy 循环和定时延迟,两者都不起作用。我也尝试过等到只出现在结果页面中的元素在 html 中变得可用,但这也不起作用。
Dim twb As Workbook
Dim ie As Object
Set twb = ThisWorkbook
twb.Activate
Set ie = CreateObject("internetexplorer.application")
'church = Sheets("Control").Range("A2").Value
'minister = Sheets("Control").Range("A4").Value
location = "London" 'Sheets("Control").Range("A6").Value
'denomination = Sheets("Control").Range("A8").Value
With ie
.navigate "http://www.ukchurch.org/index.php"
.Visible = True
Do While .Busy Or .ReadyState <> 4
DoEvents
Loop
End With
Application.Wait (Now + TimeValue("00:00:02"))
Set intags = ie.document.getelementsbytagname("input")
For Each intag In intags
If intag.getattribute("name") = "name" Then
If church <> "" Then
intag.Value = church
End If
ElseIf intag.getattribute("name") = "minister" Then
If minister <> "" Then
intag.Value = minister
End If
ElseIf intag.getattribute("name") = "location" Then
If location <> "" Then
intag.Value = location
End If
Else
End If
Next intag
Set dropopt = ie.document.getelementsbytagname("select")
For Each dropo In dropopt
If dropo.classname = "DenominationDropDown" Then
Set opttags = dropo.getelementsbytagname("option")
For Each opt In opttags
If opt.innertext = denomination Then
opt.Selected = True
End If
Next opt
End If
Next dropo
On Error Resume Next
For Each intag In intags
If intag.getattribute("src") = "images/ukchurch/button-go.jpg" Then
intag.Click
Do While ie.Busy Or ie.ReadyState <> 4
DoEvents
Loop
Application.Wait (Now + TimeValue("00:00:03"))
Exit For
End If
Next intag
Application.Wait (Now + TimeValue("00:00:03"))
Set tdtags = ie.document.getelementsbytagname("td")
For Each td In tdtags
If td.classname = "pText" Then
Debug.Print td.innertext
Debug.Print ie.locationURL
pagecount = Right(td.innertext, InStr(td.innertext, ":"))
End If
Next td
Debug.Print pagecount
End Sub
任何诊断将不胜感激。
自动化 IE 很痛苦,所以要避免它。
以下函数直接请求结果页面。
Public Function GetSearchResult(Optional ByVal ResultPage As Integer = 0, Optional ByVal ChurchName As String = "", Optional ByVal Minister As String = "", Optional ByVal ChurchLocation As String = "", Optional ByVal Denomination As String = "") As Object
Dim Request As Object: Set Request = CreateObject("MSXML2.serverXMLHTTP")
Dim Result As Object: Set Result = CreateObject("htmlfile")
Request.Open "POST", "http://www.ukchurch.org/searchresults1.php", False
Request.setRequestHeader "content-type", "application/x-www-form-urlencoded"
Request.send IIf(ResultPage = 0, "", "page=" & ResultPage & "&") & "name=" & ChurchName & "&minister=" & Minister & "&location=" & ChurchLocation & "&denomination=" & Denomination
Result.body.innerHTML = Request.responseText
Set GetSearchResult = Result
End Function
一个示例,它在包含搜索结果
的table 中打印具有类名pText
的td
的内容
Sub Main()
Dim Document As Object
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows as Object
Dim ResultRow As Object
Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
For Each ResultRow in ResultRows
If ResultRow.Classname = "pText" Then
Debug.print ResultRow.innerText
End If
Next
End Sub
更新 您需要向 VBA 项目添加几个引用,才能使以下代码正常工作。
在 VBA 编辑器中,转到“工具”菜单,单击“引用”,然后在打开的对话框中,在以下两项旁边添加一个勾:Microsoft XML, v6.0
和 Microsoft HTML Object Library
(
Public Function GetChurchDetails(ByVal ChurchID As String) As MSHTML.HTMLDocument
Dim Request As New MSXML2.ServerXMLHTTP60
Dim Result As New MSHTML.HTMLDocument
Request.Open "GET", "http://www.ukchurch.org/churchdetails.php?churchid=" & ChurchID, False
Request.send
Result.body.innerHTML = Request.responseText
Set GetChurchDetails = Result
End Function
Sub Main2()
Dim Document As MSHTML.HTMLDocument
Dim Church As MSHTML.HTMLDocument
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows As MSHTML.IHTMLElementCollection
Dim ResultRow As MSHTML.IHTMLElement
Dim ChurchID As String
'Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
' all result links on searchresults1.php have a classname of resultslink which makes getting them much easier
Set ResultRows = Document.getElementsByClassName("resultslink")
For Each ResultRow In ResultRows
ChurchID = ResultRow.getAttribute("href")
ChurchID = Mid(ChurchID, InStr(1, ChurchID, "=") + 1)
Set Church = GetChurchDetails(ChurchID)
' code to read data from the page using Church as the Document
' eg: Church.getElemenetsByTagName("td").....
Next
End Sub
提交数据时只需要使用"post"模式,其他都可以使用"get"