WebClient: 404 error on redirect when downloading zip files

I have 5 zip files to download from a website:

    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part1_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part2_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part3_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part4_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part5_5.zip

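However, when I use the following code I get a 404 error. I think this is because the http:// is dropped when I navigate to the page in a browser, but not when I use my code.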

    Try
        Dim reg As String = """.*zip"""
        Dim list As New List(Of String)()
        Dim list2 As New List(Of String)()
        Dim myRegex As New Regex(reg, RegexOptions.None)
        TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower
        For Each myMatch As Match In myRegex.Matches(TextBox1.Text) 
            list.Add(myMatch.Value)
        Next
        Dim temp As String
        For Each i In list
            temp = i.Remove(0, 1)
            temp = temp.Remove(temp.Length - 1, 1)
            list2.Add(temp)
        Next
        Dim x As Integer = 1
        For Each i In list2
            Dim address As String = "http://download.companieshouse.gov.uk/" + i
            Dim des As String = Application.StartupPath + "\" + x.ToString + ".zip"
            Dim client As New System.Net.WebClient()
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")
            client.DownloadFile(address, des)
            x = x + 1
        Next

        ' x is one past the last downloaded file, so stop at x - 1 and extract file i
        For i As Integer = 1 To x - 1
            Shell(Application.StartupPath + "\za.exe e " + Application.StartupPath + "\" + i.ToString + ".zip")
        Next
        list.Clear()
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try

Any ideas?

*Update: I have included the full code rather than a snippet.

It may be the contents of the file name data, or the way you are storing it. There are also one or two other issues in your code:

    Private filList As New List(Of String) From {"BasicCompanyData-2015-02-01-part1_5.zip",
                                                 "BasicCompanyData-2015-02-01-part2_5.zip",
                                                 "BasicCompanyData-2015-02-01-part3_5.zip",
                                                 "BasicCompanyData-2015-02-01-part4_5.zip",
                                                 "BasicCompanyData-2015-02-01-part5_5.zip"}

Then somewhere else, such as a button click:

    ' Requires Imports System.IO and Imports System.Net at the top of the file
    Dim destPath As String = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments)
    Dim destFile As String
    Dim baseURL As String = "http://download.companieshouse.gov.uk/"
    Dim thisURL As String

    Using wc As New WebClient
        wc.Headers.Add("user-agent",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")

        For Each f As String In filList
            thisURL = baseURL & f
            ' Keep the original file name in the destination folder
            destFile = Path.Combine(destPath, f)
            wc.DownloadFile(thisURL, destFile)
        Next

    End Using
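
  1. The Using block ensures that the WebClient is closed and disposed, and that its resources are released.
  2. Application.StartupPath works when running from VS, but in a deployed application it may fail once the app is installed under Program Files..., because your application may not be allowed to write there. Use Environment.GetFolderPath to get a folder such as MyDocuments.
  3. This version keeps the original file names, so if you are processing other files they will not overwrite each other (another possible problem with using Application.StartupPath).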
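
The three points above can be combined into one small helper. The following is only a sketch, not part of the original answer: the module name, the DownloadAll helper and the "CompanyData" subfolder are hypothetical. It also catches WebException per file and prints the URL that failed along with its HTTP status code, which tends to make a 404 caused by a malformed URL easier to spot.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Net

Module DownloadSketch

    ' Hypothetical helper: downloads each file name from baseUrl into destFolder
    ' and reports failures per file instead of aborting the whole batch.
    Sub DownloadAll(baseUrl As String, fileNames As IEnumerable(Of String), destFolder As String)
        ' Does nothing if the folder already exists.
        Directory.CreateDirectory(destFolder)

        Using wc As New WebClient()
            For Each f As String In fileNames
                Dim url As String = baseUrl & f
                Try
                    ' Keep the original file name so downloads do not overwrite each other.
                    wc.DownloadFile(url, Path.Combine(destFolder, f))
                Catch ex As WebException
                    ' ex.Response is Nothing when no HTTP response was received at all.
                    Dim resp As HttpWebResponse = TryCast(ex.Response, HttpWebResponse)
                    Dim status As String = If(resp IsNot Nothing,
                                              CInt(resp.StatusCode).ToString(),
                                              "no response")
                    Console.WriteLine("Failed: " & url & " (" & status & ")")
                End Try
            Next
        End Using
    End Sub

    Sub Main()
        ' "CompanyData" is a hypothetical subfolder name under My Documents.
        Dim dest As String = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "CompanyData")

        DownloadAll("http://download.companieshouse.gov.uk/",
                    {"BasicCompanyData-2015-02-01-part1_5.zip"}, dest)
    End Sub

End Module
```

Printing the exact URL that was requested is the useful part here: a file name whose casing differs from the one shown on the page shows up immediately in the output.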

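The problem was in the way I was finding the links on the page. I was reading the web page into a text box and applying .ToLower to it, so the URLs I extracted were all lowercase and no longer matched the mixed-case file names (BasicCompanyData-...) on the server.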

The offending line:

    TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower
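
One way to avoid the problem is to leave the page text untouched and make the pattern itself case-insensitive. The sketch below is an illustration under assumptions: the ExtractZipLinks helper is hypothetical, and the sample HTML is a made-up stand-in for the real page, whose markup I have not checked. It matches href values ending in .zip using RegexOptions.IgnoreCase and returns them with their original casing.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkExtractionSketch

    ' Hypothetical helper: returns href values ending in .zip, casing preserved.
    Function ExtractZipLinks(html As String) As List(Of String)
        Dim links As New List(Of String)()
        ' IgnoreCase lets the pattern match href/HREF and .zip/.ZIP
        ' without lowercasing the page text itself.
        Dim zipLink As New Regex("href\s*=\s*""([^""]+\.zip)""", RegexOptions.IgnoreCase)
        For Each m As Match In zipLink.Matches(html)
            ' Group 1 is the value between the quotes, so no manual trimming is needed.
            links.Add(m.Groups(1).Value)
        Next
        Return links
    End Function

    Sub Main()
        ' Made-up stand-in for the text returned by DownloadString.
        Dim sampleHtml As String =
            "<a href=""BasicCompanyData-2015-02-01-part1_5.zip"">Part 1</a>" &
            "<a HREF=""BasicCompanyData-2015-02-01-part2_5.zip"">Part 2</a>"

        For Each link As String In ExtractZipLinks(sampleHtml)
            Console.WriteLine("http://download.companieshouse.gov.uk/" & link)
        Next
    End Sub

End Module
```

Capturing the value inside the quotes with a group also replaces the two Remove calls in the original code that stripped the surrounding quote characters.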