How to make Python remember the contents of a file without saving it to local disk?
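I have a script that downloads a text file, extracts all the URLs from it, and then creates a new file to store those URLs. Rather than saving to local disk, I would prefer that Python keep the contents of the text file in memory, perhaps by assigning them to a variable, so that I can use them in the next step. That way I would not have to keep writing files to disk and then add commands later to delete them.
Is this possible? If so, how?
Here is the code: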
import urllib.request
import os
import re

# download text file to disk
urllib.request.urlretrieve("https://www.w3.org/TR/PNG/iso_8859-1.txt", "iso_input.txt")

# extract all URLs from input file then insert into new output file
with open("iso_input.txt", "r") as file:
    for line in file:
        urls = re.findall('https?://[^\s<>"]+[|www\.^\s<>"]+', line)
        print(*urls, file=open("iso_output.txt", "a"))
I think you're looking for io.StringIO, which the Python documentation describes as:
"A text stream using an in-memory text buffer."
import io
import re

# Open input file and output "file"
with open("iso_input.txt", "r") as file, io.StringIO() as output:
    for line in file:
        urls = re.findall(r'https?://[^\s<>"]+[|www\.^\s<>"]+', line)
        print(*urls, file=output)  # print to in-memory buffer

    # Save "output file content" as variable
    # (must happen before the with block closes the buffer)
    urls = output.getvalue()

# Do something with the retrieved urls
print(urls)
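One detail of io.StringIO matters here: the buffer is discarded when the stream is closed, so getvalue() has to be called while it is still open. The following is only a minimal sketch of that behaviour; the example.com URLs and the variable names are placeholders, not part of the original script.

```python
import io

# An in-memory text stream accepts both print(..., file=...) and write().
buffer = io.StringIO()
print("https://example.com/a", "https://example.com/b", file=buffer)
buffer.write("https://example.com/c\n")

# Take a copy of the text while the buffer is still open.
contents = buffer.getvalue()
buffer.close()

# After close() the buffer is gone; getvalue() raises ValueError.
try:
    buffer.getvalue()
    closed_error = False
except ValueError:
    closed_error = True
```

The string in contents stays usable after the buffer is closed, which is why the assignment belongs inside the with block.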
As a side note, https://www.w3.org/TR/PNG/iso_8859-1.txt does not contain any URLs, as @Corralien correctly pointed out.
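The question also asks about keeping the downloaded file itself off disk. One possible approach, sketched here under some assumptions, is to read the response body straight into a string with urllib.request.urlopen instead of calling urlretrieve. The helper names fetch_text and extract_urls are made up for this example, the regular expression is a simplified stand-in for the one in the question, and the UTF-8 fallback is a guess for servers that do not declare a charset.

```python
import re
import urllib.request

# Simplified pattern for illustration only; not the regex from the question.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+')


def fetch_text(url: str) -> str:
    """Download url and return its body as a string, without writing to disk."""
    with urllib.request.urlopen(url) as response:
        # Use the charset declared by the server, or fall back to UTF-8 (assumption).
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_urls(text: str) -> list[str]:
    """Return every URL-like substring found in text."""
    return URL_PATTERN.findall(text)


# Offline demonstration on a sample string.
sample = 'See https://example.com/a and <http://example.org/b?x=1> for details.'
urls = extract_urls(sample)
print(urls)
```

For the real file, the call would be extract_urls(fetch_text("https://www.w3.org/TR/PNG/iso_8859-1.txt")), which keeps both the downloaded text and the resulting list in memory. A plain list is often more convenient for a later step than the single string that getvalue() returns.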