Python Requests: read every line from a TXT file, send one request per line, and save the responses to a new TXT file

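The code below fetches a bunch of URLs from web.archive.org and saves them to a new TXT file. Instead of typing a single URL at the prompt, I want to load a bunch of URLs from a TXT file. So x=input('URL:') has to be replaced with some code that loads the lines of the TXT file one at a time.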

I have been trying for a few days now and I'm stuck! Please help!

Code:

import requests

x = input('Enter your url:-')
r = requests.get('http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey'.format(x))
# append the response body to url.txt, padded with blank lines
with open('url.txt', 'a') as f:
    f.write('\n')
    f.write(r.text)
    f.write('\n')

First, put all the URLs in urls.txt, one per line, then open the file and call readlines(). It returns a list with one entry per line. Here is the complete code.

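import requests

with open('urls.txt') as file:
    # readlines() returns a list of lines, each still ending in '\n'
    urls_list = file.readlines()

for x in urls_list:
    x = x.strip()  # drop the trailing newline so it does not end up in the query
    if not x:
        continue  # skip blank lines
    r = requests.get('http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey'.format(x))
    print(r.status_code)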
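
One detail worth checking when using readlines(): each returned entry keeps its trailing newline, so it usually needs stripping before it goes into a URL. The following is only a small sketch of that behaviour; it uses io.StringIO as a stand-in for an open file, and the file contents are made up for illustration.

```python
import io

# Stand-in for an open urls.txt; the contents are illustrative only.
fake_file = io.StringIO("google.com\nyahoo.com\n\n")

# readlines() keeps the line terminator on every entry.
raw_lines = fake_file.readlines()
print(raw_lines)  # ['google.com\n', 'yahoo.com\n', '\n']

# Strip whitespace and drop lines that end up empty.
domains = [line.strip() for line in raw_lines if line.strip()]
print(domains)  # ['google.com', 'yahoo.com']
```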

To read the URLs from a file, you can use the following example:

import requests

urls = []
with open("something.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)

archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"

with open("output.txt", "w") as f_out:
    for url in urls:
        print(url)
        r = requests.get(archive_url.format(url))
        print(r.text, file=f_out)
        print("\n", file=f_out)

something.txt contains domains, for example:

google.com
yahoo.com

output.txt contains the responses returned by requests.
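
If some requests are slow or fail, it may help to add a timeout and basic error handling so that one bad domain does not stop the whole run. The following is a rough sketch under that assumption, not a definitive version: the helper names read_domains, build_query and fetch_all are hypothetical, and the 30-second timeout is an arbitrary choice.

```python
import requests

# Same query string as in the question, with a placeholder for the domain.
ARCHIVE_URL = ("http://web.archive.org/cdx/search/cdx"
               "?url=*.{}&output=text&fl=original&collapse=urlkey")


def read_domains(path):
    """Return the non-empty, stripped lines of the file at path."""
    with open(path, "r") as f_in:
        return [line.strip() for line in f_in if line.strip()]


def build_query(domain):
    """Fill the domain into the archive query URL."""
    return ARCHIVE_URL.format(domain)


def fetch_all(in_path, out_path, timeout=30):
    """Request each domain listed in in_path and write the responses to out_path."""
    with open(out_path, "w") as f_out:
        for domain in read_domains(in_path):
            try:
                r = requests.get(build_query(domain), timeout=timeout)
                r.raise_for_status()  # treat HTTP error codes as failures
            except requests.RequestException as exc:
                # Report the problem and move on to the next domain.
                print(f"{domain}: request failed ({exc})")
                continue
            # One blank line between the results of consecutive domains.
            f_out.write(r.text.rstrip("\n") + "\n\n")
```

With the files from the example above, calling fetch_all("something.txt", "output.txt") would write the results for google.com and yahoo.com into output.txt and print a message for any domain whose request failed.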