Protocol Error / 404 reading JSON from API with HttpClient or WebClient

I am trying to use this service to fetch JSON data for further processing locally:

https://www.sec.gov/edgar/sec-api-documentation

Test example (from the documentation link above):

https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json

This works fine in a browser, but every approach I have tried (listed below) fails, mostly with a protocol error / 404.

Following the documentation at https://www.sec.gov/os/webmaster-faq#developers

I have included the required header entries (correctly, I think).

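I keep reading that the preferred method is HttpClient, but every simple example I have come across is in C# and/or uses lambda expressions that I don't understand. I did my best to translate them to VB, but I am not sure I got it right.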

I have also added this to App.config:

<system.net>
  <settings>
    <httpWebRequest useUnsafeHeaderParsing = "true"/>
  </settings>
</system.net>

The application is a WinForms app targeting .NET Framework 4.7.2. The OS is Windows 10 Pro, 20H2, 64-bit operating system, x64-based processor.

Imports System.IO
Imports System.Net
Imports System.Net.Http

Public Module SECJSON

    Public sourceUrl As String = "https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json"

    Public ua As String = "My Edgar Reader Admin@Example.com"
    Public accept As String = ""
    'Public accept As String = "application/json"
    'Public accept As String = "text/plain"
    'Public accept As String = "*/*"

    Public Sub RunAttempts()
        Debug.WriteLine("HttpWebRequest:")
        Debug.WriteLine(Fetch1(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

        Call Fetch2(sourceUrl)

        Debug.WriteLine("WebClient:")
        Debug.WriteLine(Fetch3(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

        Debug.WriteLine("WebClient File:")
        Debug.WriteLine(Fetch4(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

    End Sub

    Public Function Fetch1(url As String) As String
        Dim request As HttpWebRequest
        Dim response As HttpWebResponse
        Dim reader As StreamReader

        'request = WebRequest.Create(url)
        request = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.ContentType = "application/json"
        'request.Method = "POST"
        request.Method = "GET"

        request.UserAgent = ua
        request.Host = "www.sec.gov"
        request.Headers("Accept-Encoding") = "gzip, deflate"

        If Not (accept = "") Then request.Accept = accept

        Dim result As String = ""

        Try
            response = DirectCast(request.GetResponse(), HttpWebResponse)
            If response Is Nothing = False Then
                reader = New StreamReader(response.GetResponseStream())
                result = reader.ReadToEnd()
            Else
                result = "No Data"
            End If
        Catch ex As Exception
            result = ex.Message
        End Try

        Return result
    End Function

    Public Async Sub Fetch2(page As String)

        Dim request As New HttpRequestMessage(HttpMethod.Get, New Uri(page))
        If Not (accept = "") Then request.Headers.Accept.Add(New Headers.MediaTypeWithQualityHeaderValue(accept))
        request.Headers.TryAddWithoutValidation("User-Agent", ua)
        request.Headers.TryAddWithoutValidation("Host", "www.sec.gov")
        request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate")

        Dim client As HttpClient = New HttpClient

        Using response As HttpResponseMessage = Await client.SendAsync(request)
            Using content As HttpContent = response.Content
                'content.Headers.Remove("Content-Type")
                'content.Headers.Add("Content-Type", "application/json")
                Dim result As String = Await content.ReadAsStringAsync()
                If result Is Nothing Then
                    Debug.WriteLine("HttpClient:No result")
                Else
                    Debug.WriteLine("HttpClient:")
                    Debug.WriteLine(result)
                    Debug.WriteLine("-----------")
                    Debug.WriteLine("")
                End If
            End Using
        End Using
    End Sub

    Public Function Fetch3(url As String) As String

        Dim client As New WebClient()
        'ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12

        client.Headers.Add("User-Agent", ua)
        client.Headers.Add("Host", "www.sec.gov")
        client.Headers.Add("Accept-Encoding", "gzip, deflate")

        If Not (accept = "") Then client.Headers.Add("accept", accept)

        Dim ans As String = ""
        Try
            ans = client.DownloadString(New Uri(url))
        Catch ex As WebException
            If ex.Response Is Nothing = False Then
                'Dim dataStream As Stream = ex.Response.GetResponseStream
                'Dim reader As New StreamReader(dataStream)
                'ans = reader.ReadToEnd()
                ans = ex.Message & "|" & ex.Status.ToString
            Else
                ans = "No Response Stream"
            End If
        End Try

        Return ans
    End Function

    Public Function Fetch4(url As String) As String

        Dim client As New WebClient()
        Dim tmpFile As String = Path.Combine(Path.GetTempPath(), "edgar.txt")
        Debug.WriteLine(tmpFile)
        Try
            client.DownloadFile(New Uri(url), tmpFile)
            Return File.ReadAllText(tmpFile)
        Catch ex As WebException
            If ex.Response Is Nothing = False Then
                Return ex.Message & "|" & ex.Response.ContentType & " " & ex.Response.ContentLength.ToString
            Else
                Return ex.Message
            End If
        End Try

    End Function

End Module

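The problem is that your Host header is set to 'www.sec.gov', so the server handles the request as if it were for 'https://www.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json' instead of 'https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json'. That path does not exist on www.sec.gov, hence the 404. The same Host header is also set in Fetch2 and Fetch3.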

The fix is simple: change the Host header from:

request.Host = "www.sec.gov"

to:

request.Host = "data.sec.gov"
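
For the HttpClient attempt, a minimal sketch of the same fix might look like the following. It assumes that leaving the Host header out entirely is acceptable, since HttpClient fills it in from the request URL. `SecFetchSketch`, `BuildRequest` and `FetchJsonAsync` are hypothetical names, not part of the question's code. Enabling `AutomaticDecompression` is also an assumption on my part: it lets the handler negotiate gzip/deflate and unpack the body, rather than setting Accept-Encoding by hand.

```vbnet
Imports System.Net
Imports System.Net.Http
Imports System.Net.Http.Headers
Imports System.Threading.Tasks

Public Module SecFetchSketch

    ' Hypothetical helper: builds the GET request without an explicit Host header.
    ' HttpClient derives Host from the URL, so it will be data.sec.gov here.
    Public Function BuildRequest(url As String, userAgent As String) As HttpRequestMessage
        Dim request As New HttpRequestMessage(HttpMethod.Get, New Uri(url))
        ' The User-Agent contains an e-mail address, so skip strict header validation.
        request.Headers.TryAddWithoutValidation("User-Agent", userAgent)
        request.Headers.Accept.Add(New MediaTypeWithQualityHeaderValue("application/json"))
        Return request
    End Function

    ' Hypothetical helper: returns the response body as a string,
    ' or throws HttpRequestException on a non-success status code.
    Public Async Function FetchJsonAsync(url As String, userAgent As String) As Task(Of String)
        ' Assumption: let the handler request and decompress gzip/deflate itself.
        Dim handler As New HttpClientHandler With {
            .AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate
        }
        Using client As New HttpClient(handler)
            Using request As HttpRequestMessage = BuildRequest(url, userAgent)
                Using response As HttpResponseMessage = Await client.SendAsync(request)
                    response.EnsureSuccessStatusCode()
                    Return Await response.Content.ReadAsStringAsync()
                End Using
            End Using
        End Using
    End Function

End Module
```

From an Async event handler in the form, this could be called as `Dim json As String = Await FetchJsonAsync(sourceUrl, ua)`, reusing the `sourceUrl` and `ua` values from the question. Returning a `Task(Of String)` instead of using `Async Sub` lets the caller await the result and catch any exception.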

Next time you can't pin down a problem like this, try using Fiddler to inspect what is wrong with your request.