Python: Trying to write to a .txt file only the lines that contain a specific word, instead of the whole text
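The Python code below reads list.txt, which contains website links, fetches the archived URLs for each of those websites from web.archive.org, and writes them to url.txt. What I want is to write only the lines that contain a specific "WORD". As far as I can see, my code writes out all of the lines whenever the specific "WORD" appears in any one line.
Can anyone explain why? Thanks in advance!
Code: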
urls = []
with open("list.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)
archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"
with open("url.txt", "w") as f_out:
    for url in urls:
        r = requests.get(archive_url.format(url))
        if 'WORD' in archive_url:
            print(r.text, file=f_out)
            print("\n", file=f_out)
I tried replacing "if 'WORD' in archive_url:" with "if 'WORD' in url:", but then nothing was written to the TXT file!
I don't know how to print only the LINES that contain "WORD".
with open("url.txt", "w") as f_out:
    for url in urls:
        if 'WORD' in url:
            r = requests.get(archive_url.format(url))
            f_out.write(r,'\n')
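archive_url is only the URL template and url is only a line from list.txt, so testing either of them for 'WORD' never looks at what the server returned. Loop over the lines of r.text and test each line instead. Try: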
import requests

# Read the non-empty lines of list.txt (one site per line).
urls = []
with open("list.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)

archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"

with open("url.txt", "w") as f_out:
    for url in urls:
        r = requests.get(archive_url.format(url))
        # Test each line of the response separately, not the whole text.
        for line in r.text.splitlines():
            if "your_word" in line:
                print(line, file=f_out)
        # Blank separator between the results for each site.
        print("\n", file=f_out)
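To see the per-line filtering on its own, without any network access, here is a minimal sketch. The helper name lines_containing and the example.com URLs are made up for illustration and are not part of the original code; the sample string simply stands in for r.text.

```python
def lines_containing(text, word):
    """Return the lines of `text` that contain `word` (case-sensitive substring match)."""
    return [line for line in text.splitlines() if word in line]


# Stand-in for r.text; these URLs are invented for the example.
sample = (
    "http://example.com/\n"
    "http://example.com/blog/WORD-post\n"
    "http://example.com/about\n"
    "http://example.com/WORD\n"
)

# Only the two lines that contain "WORD" are kept.
matches = lines_containing(sample, "WORD")
print("\n".join(matches))
```

The match is a plain substring test, so it is case-sensitive; if that matters, one option would be to compare word.lower() against line.lower().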