Python: download multiple files within a for loop
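I have a list of URLs that point to SEC filings (for example, https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm).
My goal is to write a for loop that opens each URL, requests the document, and saves it to a folder.
However, I need to be able to identify the files later. That is why I want to use the filing-specific number in "https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm" as the document name.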
    directory = r"\Desktop\10ks"
    for url in url_list:
        response = requests.get(url).content
        path = (directory + str(url)[40:-5] +".txt")
        with open(path, "w") as f:
            f.write(response)
            f.close()
But every time I run it, I get the following error message: FileNotFoundError: [Errno 2] No such file or directory:
I really hope you can help me!
Thanks
This works:
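    for url in url_list:
        # Decode the downloaded bytes so they can be written in text mode
        response = requests.get(url).content.decode('utf-8')
        # Turn the forward slashes taken from the URL into backslashes (Windows);
        # open() does not create folders, so every folder in the path must already exist
        path = (directory + str(url)[40:-5] + ".txt").replace('/', '\\')
        with open(path, "w+") as f:
            f.write(response)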
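The path you are building looks like this: \Desktop\10ks18651/000119312509042636/d10.txt
Judging by the backslashes, I assume you are on Windows. Either way, you only need to replace the forward slashes that come from the URL with backslashes.
One more thing: in text mode, write expects a string, so you need to decode the response, which arrives as bytes, into a string.
Hope this helps!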
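To illustrate the bytes-versus-string point above, here is a minimal sketch of the two ways to write a downloaded body. In requests, `.content` is bytes and `.text` is a decoded string; the helper names `save_bytes` and `save_text` and the stand-in payload are hypothetical, chosen only for this example, and UTF-8 is an assumed encoding.

```python
import tempfile
from pathlib import Path


def save_bytes(payload: bytes, path: Path) -> None:
    """Write the raw bytes unchanged; binary mode needs no decoding step."""
    with open(path, "wb") as f:
        f.write(payload)


def save_text(payload: bytes, path: Path, encoding: str = "utf-8") -> None:
    """Decode the bytes first, then write the resulting string in text mode."""
    with open(path, "w", encoding=encoding) as f:
        f.write(payload.decode(encoding))


# Demo with a stand-in payload instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"<html>10-K</html>"
    save_bytes(payload, Path(tmp) / "raw.htm")
    save_text(payload, Path(tmp) / "decoded.txt")
    print(sorted(p.name for p in Path(tmp).iterdir()))
```

Writing in binary mode avoids guessing the encoding, which may be the safer choice when the files are only archived and parsed later.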
    import os
    import requests

    url_list = ["https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"]
    # Build the path Desktop/10ks (the folder must already exist)
    directory = os.path.join(os.path.expanduser("~/Desktop"), "10ks")
    for url in url_list:
        # Get the content as a string instead of as bytes
        response = requests.get(url).text
        # Replace the slashes in the filename with underscores
        filename = str(url)[40:-5].replace("/", "_")
        # Print the filename to check that it is correct
        print(filename)
        path = os.path.join(directory, filename + ".txt")
        # The with block closes the file, so no explicit f.close() is needed
        with open(path, "w") as f:
            f.write(response)
See the comments in the code.
I guess backslashes are not allowed in the file name, because
    filename = str(url)[40:-5].replace("/", "\\")
gives me
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\user/Desktop\10ks\18651\000119312509042636\d10.txt'
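The error above is consistent with how Windows treats a backslash: it is a path separator, so the name is read as nested folders, and open() does not create folders that are missing. A minimal sketch of the two ways around this follows: flatten the name with underscores, or create the folders first with os.makedirs. The helper names `flat_path` and `nested_path` are hypothetical, and a temporary directory stands in for Desktop/10ks.

```python
import os
import tempfile


def flat_path(base_dir: str, relative_name: str) -> str:
    """Turn 'a/b/c' into base_dir/a_b_c.txt, a single file with no subfolders."""
    os.makedirs(base_dir, exist_ok=True)
    return os.path.join(base_dir, relative_name.replace("/", "_") + ".txt")


def nested_path(base_dir: str, relative_name: str) -> str:
    """Turn 'a/b/c' into base_dir/a/b/c.txt, creating the folders on the way."""
    path = os.path.join(base_dir, *relative_name.split("/")) + ".txt"
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


# Demo: both paths can be opened for writing without FileNotFoundError.
with tempfile.TemporaryDirectory() as tmp:
    name = "18651/000119312509042636/d10"
    for path in (flat_path(tmp, name), nested_path(tmp, name)):
        with open(path, "w") as f:
            f.write("placeholder")
        print(os.path.relpath(path, tmp))
```

The flat variant keeps every filing in one folder, while the nested variant mirrors the layout of the URL; which one is preferable depends on how the files are looked up later.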
See also:
https://docs.python.org/3/library/os.path.html#os.path.expanduser
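The fixed slice `[40:-5]` used in the code above depends on the exact length of the URL prefix and also cuts off the last character of the document name ("d10" instead of "d10k"). As a hedged alternative, the name could be derived by parsing the URL. This is only a sketch: the function `filing_name` and the constant `EDGAR_PREFIX` are hypothetical, and it assumes every URL in the list follows the layout of the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed prefix, taken from the example URL in the question.
EDGAR_PREFIX = "/Archives/edgar/data/"


def filing_name(url: str) -> str:
    """Build a flat, filesystem-safe name from the filing-specific part of the URL."""
    url_path = urlparse(url).path
    if not url_path.startswith(EDGAR_PREFIX):
        raise ValueError(f"unexpected URL layout: {url}")
    relative = PurePosixPath(url_path[len(EDGAR_PREFIX):])
    # Drop the extension (".htm") and join the remaining parts with underscores.
    return "_".join(relative.with_suffix("").parts)


print(filing_name("https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"))
# prints 18651_000119312509042636_d10k
```

The result can be passed to os.path.join together with the target folder and a ".txt" suffix, in place of the sliced string.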