How to solve a "BAD REQUEST" error when importing data from a hyperlink in Excel using macros?

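I want to import some data from the website https://www.amfiindia.com/nav-history-download. On this page there is a link, "Download Complete NAV Report in Text Format", that gives me the data I need. However, the link is not static, so I cannot hard-code it in VBA to download the data. How can I download data from a hyperlink on a web page using Excel?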

My approach is to first store the hyperlink in a variable and then use that variable to fetch the data:

  1. First, get the hyperlink from the page with a DOM lookup such as getElementsByTagName (the code below uses querySelector).
  2. Then use it as the URL to fetch the data.

However, I receive a "BAD REQUEST" response when I send a request to this hyperlink, and I don't know why. The code I am using is:

Sub GrabLastNames()

    'dimension (set aside memory for) our variables
    Dim objIE As InternetExplorer
    Dim ele As Object
    Dim y As Integer
    Dim mtbl As String
    Dim request As Object
    Dim response As String
    Dim html As New HTMLDocument
    Dim website As String
    Dim price As Variant
    Dim cellAddress As String
    Dim rowNumber As Long



    'start a new browser instance
    Set objIE = New InternetExplorer
    'make browser visible
    objIE.Visible = True

    'navigate to page with needed data
    objIE.navigate "https://www.amfiindia.com/nav-history-download"
    'wait for page to load
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    'we will output data to Excel, starting on row 1
    y = 1
    'read the URL of the download link from the loaded page
    mtbl = objIE.document.querySelector(".nav-hist-dwnld a").href

    ' Create the object that will make the webpage request.
    Set request = CreateObject("MSXML2.XMLHTTP")

    ' Where to go and how to go there - probably don't need to change this.
    request.Open "GET", mtbl, False


    ' Get fresh data.
    request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"

    ' Send the request for the webpage.
    request.send

    ' Get the webpage response data into a variable.
    response = request.responseText

    ' Debug marker: confirms the request completed.
    MsgBox "Hi"

    ' Placeholder write to the sheet (row y + 1).
    Sheets("Sheet1").Range("A" & y + 1).Value = "Hi"

    ' Show the response body; this is where "Bad Request" appears.
    MsgBox response

    ActiveWorkbook.Save

End Sub


The response variable should contain all the data from the website, but instead the MsgBox displays "Bad Request".
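
To tell which HTTP status is behind that message, the status code and the response headers can be logged right after request.send. The following is a minimal sketch; DescribeResponse is a hypothetical helper name, not part of the original code, and it assumes the request object is the MSXML2.XMLHTTP instance created above.

```vba
' Hypothetical helper: summarises an MSXML2.XMLHTTP response after .send.
Function DescribeResponse(ByVal request As Object) As String
    ' Status is the numeric HTTP code (for example 200, 400 or 403);
    ' statusText is the reason phrase sent by the server.
    DescribeResponse = request.Status & " " & request.statusText & vbCrLf & _
                       request.getAllResponseHeaders
End Function
```

Calling Debug.Print DescribeResponse(request) immediately after request.send prints the result to the Immediate window, which shows whether the server answered with 400, 403, or something else.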

You set request headers with the .setRequestHeader method.

These are the request headers I see:

GET /spages/NAVAll.txt?t=06052020095056 HTTP/1.1
Host: www.amfiindia.com
Connection: keep-alive
Cache-Control: max-age=0
DNT: 1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
Cookie: __utma=57940026.1471746098.1588710696.1588710696.1588710696.1; __utmc=57940026; __utmz=57940026.1588710696.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
If-None-Match: "0d8e9bad223d61:0"
If-Modified-Since: Wed, 06 May 2020 18:18:24 GMT 

You are unlikely to need all of these, but for a 403 you will most likely need the User-Agent. For example: .setRequestHeader "User-Agent", "Mozilla/5.0"
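
A minimal sketch of that suggestion, assuming the URL has already been read from the page (as mtbl is in the question's code). FetchText is a hypothetical helper name. The User-Agent value is the one suggested above, and the If-Modified-Since date is written with a two-digit day, as the HTTP-date format specifies. Whether either header is what this particular server requires has not been verified.

```vba
' Hypothetical helper: fetches a URL as text with a User-Agent header set.
' Returns an empty string and logs the status when the reply is not 200 OK.
Function FetchText(ByVal url As String) As String
    Dim request As Object
    Set request = CreateObject("MSXML2.XMLHTTP")

    With request
        ' Third argument False makes the call synchronous.
        .Open "GET", url, False

        ' Header suggested in the answer; the value is a generic browser-like string.
        .setRequestHeader "User-Agent", "Mozilla/5.0"

        ' Ask for a fresh copy; note the two-digit day ("01", not "1").
        .setRequestHeader "If-Modified-Since", "Sat, 01 Jan 2000 00:00:00 GMT"

        .send

        If .Status = 200 Then
            FetchText = .responseText
        Else
            ' Log the failure instead of returning the error page as data.
            Debug.Print "Request failed: " & .Status & " " & .statusText
            FetchText = vbNullString
        End If
    End With
End Function
```

In the question's code, the block from CreateObject("MSXML2.XMLHTTP") through request.responseText could then be reduced to response = FetchText(mtbl), with an empty result signalling that the status line in the Immediate window should be checked.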