Web scraping: Run-time error 91 Object variable not set
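I am trying to update my species database with information from this website: https://www.afcd.gov.hk/english/conservation/hkbiodiversity/database/search.php
I have an xlsm with a list of species in column A. I search for each of them in the website's search engine, which leads to a page showing a link to another page dedicated to that particular species. Each species has a unique ID, and that is the information I want.
For example, if I type "Mnais mneme" into the "Scientific Name" search box, a page appears showing a table that contains the species, with a link attached to its name (https://www.afcd.gov.hk/english/conservation/hkbiodiversity/database/popup_record.php?id=781&lang=en). "781" would be the species ID.
I want to copy this link into column B of my xlsm and extract the ID in Excel: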
Sub SearchBot()
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    'link will be the <a> carrying the href with the species id
    Dim link As HTMLAnchorElement
    'define y as integer counter
    Dim y As Integer
    'initiating a new instance of Internet Explorer and assigning it to objIE
    Set objIE = New InternetExplorer
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
    'navigate IE to this web page
    objIE.navigate "https://www.afcd.gov.hk/english/conservation/hkbiodiversity/database/search.php"
    'wait here while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    objIE.document.getElementById("s1").Click
    'in the search box put cell "A2" value
    objIE.document.all.Item("scientific_name").Value = Sheets("Sheet1").Range("A2").Value
    'click the 'Search' button
    objIE.document.getElementsByClassName("btn_3")(1).Click
    'wait again for the browser
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    'select the species name link
    Set link = objIE.document.getElementsByTagName("td")(4).getElementsByTagName("a")(0)
    y = 2
    'print the link to column B in Sheet1
    Sheets("Sheet1").Range("B" & y).Value = link.href
End Sub
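Debugging shows
run-time error 91
and stops at the last line:
Sheets("Sheet1").Range("B" & y).Value = link.href
Is there a problem with declaring link as HTMLAnchorElement? I tried declaring it as Object, but the error still occurs.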
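Below is some code that uses a web request to find the data you are interested in. I looked at the page and found that the checkbox also contains the species ID, so I used the name of that control to find the input's value.
Code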
Public Function GetSpeciesID(ScientificName As String) As String
    Dim requestURL As String
    Const InvalidValue As String = "-1"
    'If the Scientific Name is blank, return a default value
    If (Trim$(ScientificName) = vbNullString) Then
        GetSpeciesID = InvalidValue
        Exit Function
    End If
    requestURL = "https://www.afcd.gov.hk/english/conservation/hkbiodiversity/database/doSearch.php?" & _
                 "entity_id=0&family_name=&scientific_name=" & WorksheetFunction.EncodeURL(ScientificName) & _
                 "&common_name=&chinese_name=&hk_protection_status_val=&chinared_status_val=&iucn_status_val="
    With CreateObject("MSXML2.ServerXMLHTTP.6.0")
        .Open "GET", requestURL
        .send
        Dim html As Object: Set html = CreateObject("htmlfile")
        html.body.innerhtml = .responseText
    End With
    GetSpeciesID = html.getElementsByName("check")(0).Value
End Function
'Run this method
Public Sub Runner()
Debug.Print GetSpeciesID("Mnais mneme")
End Sub
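Since the question is about a whole list of species in column A, here is a rough sketch of how GetSpeciesID could be applied to every row. FillSpeciesIDs and SafeGetSpeciesID are hypothetical names of my own, and I am assuming the names are plain text starting in A2 on "Sheet1". The wrapper is there because a name with no search result would presumably leave no "check" element to read, which would raise an error again, so a failed lookup is written as "-1" instead of stopping the loop.

```vba
'Hypothetical helper: writes the species ID for each name in column A into column B.
'Assumes the names are plain text starting in A2 on "Sheet1" and that
'GetSpeciesID lives in the same VBA project.
Public Sub FillSpeciesIDs()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    'Last used row in column A
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 2 To lastRow
        ws.Cells(r, "B").Value = SafeGetSpeciesID(CStr(ws.Cells(r, "A").Value))
    Next r
End Sub

'Hypothetical wrapper: returns "-1" when the lookup raises any run-time error
'(for example when the response contains no "check" element).
Private Function SafeGetSpeciesID(ByVal ScientificName As String) As String
    On Error GoTo Failed
    SafeGetSpeciesID = GetSpeciesID(ScientificName)
    Exit Function
Failed:
    SafeGetSpeciesID = "-1"
End Function
```

Each row triggers one request to the site, so a long list may take a while to finish.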