WebClient downloading zip files: 404 error on redirect
I have 5 zip files to download from a website:
http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part1_5.zip
http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part2_5.zip
http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part3_5.zip
http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part4_5.zip
http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part5_5.zip
However, if I use the following code I get a 404 error. I think it is because the http:// is dropped when I navigate to the page in my browser, but not when I use my code.
    Try
        Dim reg As String = """.*zip"""
        Dim list As New List(Of String)()
        Dim list2 As New List(Of String)()
        Dim myRegex As New Regex(reg, RegexOptions.None)
        TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower
        For Each myMatch As Match In myRegex.Matches(TextBox1.Text)
            list.Add(myMatch.Value)
        Next
        Dim temp As String
        For Each i In list
            temp = i.Remove(0, 1)
            temp = temp.Remove(temp.Length - 1, 1)
            list2.Add(temp)
        Next
        Dim x As Integer = 1
        For Each i In list2
            Dim address As String = "http://download.companieshouse.gov.uk/" + i
            Dim des As String = Application.StartupPath + "\" + x.ToString + ".zip"
            Dim client As New System.Net.WebClient()
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")
            client.DownloadFile(address, des)
            x = x + 1
        Next
        For i As Integer = 1 To x Step 1
            Shell(Application.StartupPath + "za.exe e " + Application.StartupPath + "\" + x + ".zip")
        Next
        list.Clear()
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try
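Any ideas?
*Update: I have included the full code rather than a snippet.
It may be the contents of the file name data, or the way you are storing it. There are also one or two other issues in your code: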
    Private filList As New List(Of String) From {"BasicCompanyData-2015-02-01-part1_5.zip",
                                                 "BasicCompanyData-2015-02-01-part2_5.zip",
                                                 "BasicCompanyData-2015-02-01-part3_5.zip",
                                                 "BasicCompanyData-2015-02-01-part4_5.zip",
                                                 "BasicCompanyData-2015-02-01-part5_5.zip"}
Then elsewhere, for example in a button click:
    ' Needs Imports System.Net (WebClient) and Imports System.IO (Path) at the top of the file.
    Dim destPath As String = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments)
    Dim destFile As String
    Dim baseURL As String = "http://download.companieshouse.gov.uk/"
    Dim thisURL As String
    ' One WebClient is reused for every download and disposed when the block ends.
    Using wc As New WebClient
        wc.Headers.Add("user-agent",
                       "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")
        For Each f As String In filList
            thisURL = baseURL & f
            ' Keep the original file name in the destination folder.
            destFile = Path.Combine(destPath, f)
            wc.DownloadFile(thisURL, destFile)
        Next
    End Using
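- The Using block ensures that the WebClient is closed and disposed and that its resources are released.
- Using Application.StartupPath will work in VS, but as a deployed application it may fail when the app is installed to Program Files..., because your app may not be able to write there. Use Environment.GetFolderPath to get a folder such as MyDocuments.
- This version keeps the original file names, so if you are processing other files they will not overwrite each other (another possible problem when using Application.StartupPath).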
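The problem was in the way I was finding the links on the page. I was reading the web page into a text box and applying .ToLower to it, so the URL was wrong by the time I extracted it: the lowercased path no longer matched the mixed-case file names (BasicCompanyData-...), and the server returned 404.
The problem line:
    TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower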
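For anyone who hits the same thing, here is a minimal sketch of pulling the links out without lowercasing the page. It assumes the links appear as href="....zip" attributes, which I have not checked against the real page. ExtractZipNames, the module name and the sample HTML are made up for illustration. RegexOptions.IgnoreCase makes the match case-insensitive but leaves the matched text alone, and the capture group drops the surrounding quotes, so the two Remove calls are not needed either.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkExtraction
    ' Hypothetical helper: returns the .zip names found in href attributes,
    ' with their original casing intact. The href="..." shape is an
    ' assumption about the page markup, not something taken from the site.
    Function ExtractZipNames(html As String) As List(Of String)
        Dim names As New List(Of String)()
        ' IgnoreCase matches HREF/href and .ZIP/.zip alike without changing
        ' the text, so the page never has to be lowercased.
        Dim rx As New Regex("href=""([^""]+\.zip)""", RegexOptions.IgnoreCase)
        For Each m As Match In rx.Matches(html)
            ' Group 1 is the part between the quotes.
            names.Add(m.Groups(1).Value)
        Next
        Return names
    End Function

    Sub Main()
        ' Made-up sample standing in for the downloaded page.
        Dim sample As String =
            "<a HREF=""BasicCompanyData-2015-02-01-part1_5.zip"">part 1</a>" &
            "<a href=""BasicCompanyData-2015-02-01-part2_5.zip"">part 2</a>"
        For Each name As String In ExtractZipNames(sample)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

Each name can then be appended to the base URL as in the answer above. If the lowercased text is still wanted for display in the text box, keep the original string in a separate variable and run the regex against that one.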