Unable to print results from a function while using concurrent.futures in some customized way

Question

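I've written a script using the concurrent.futures library to print the results of the fetch_links function. When I use a print statement inside the function, I get the expected results. What I want to do now is print the function's results while using a yield statement instead.

Is there any way to change what's under the main block so that the results of fetch_links are printed while keeping the function as it is, i.e. keeping the yield statement?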

import requests
from bs4 import BeautifulSoup
import concurrent.futures as cf

links = [
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=2&pagesize=50",
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=3&pagesize=50",
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=4&pagesize=50"
]

base = 'https://whosebug.com{}'

def fetch_links(s,link):
    r = s.get(link)
    soup = BeautifulSoup(r.text,"lxml")
    for item in soup.select(".summary .question-hyperlink"):
        # print(base.format(item.get("href")))
        yield base.format(item.get("href"))

if __name__ == '__main__':
    with requests.Session() as s:
        with cf.ThreadPoolExecutor(max_workers=5) as exe:
            future_to_url = {exe.submit(fetch_links,s,url): url for url in links}
            cf.as_completed(future_to_url)

Answer 1

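Your fetch_links is a generator function, so future.result() returns a generator object rather than the links themselves. You have to loop over it as well to get the results: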

import requests
from bs4 import BeautifulSoup
import concurrent.futures as cf

links = [
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=2&pagesize=50",
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=3&pagesize=50",
    "https://whosebug.com/questions/tagged/web-scraping?tab=newest&page=4&pagesize=50"
]

base = 'https://whosebug.com{}'


def fetch_links(s, link):
    r = s.get(link)
    soup = BeautifulSoup(r.text, "lxml")
    for item in soup.select(".summary .question-hyperlink"):
        yield base.format(item.get("href"))


if __name__ == '__main__':
    with requests.Session() as s:
        with cf.ThreadPoolExecutor(max_workers=5) as exe:
            future_to_url = {exe.submit(fetch_links, s, url): url for url in links}
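            for future in cf.as_completed(future_to_url):
                # future.result() is the generator returned by fetch_links;
                # iterate over it to get the individual links.
                for result in future.result():
                    print(result)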

Output: the question links, printed one per line.
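
One thing worth keeping in mind with this approach: because a generator body only runs when it is iterated, the loop over future.result() most likely does the actual work in the calling thread, not in the pool. The sketch below illustrates this without any network access. fake_fetch_links, collect and run are hypothetical names made up for the illustration, and fake_fetch_links is only a stand-in for the real fetch_links. The collect wrapper is one possible way to exhaust the generator inside the worker while leaving the generator function itself untouched.

```python
import concurrent.futures as cf
import threading


def fake_fetch_links(page):
    # Hypothetical stand-in for fetch_links: no network access, it just
    # yields a few strings tagged with the thread running the generator body.
    for i in range(3):
        yield (f"page{page}-link{i}", threading.current_thread().name)


def collect(gen_func, *args):
    # Exhaust the generator inside the worker thread and return a plain list.
    return list(gen_func(*args))


def run(pages):
    lazy, eager = [], []
    with cf.ThreadPoolExecutor(max_workers=5, thread_name_prefix="worker") as exe:
        # Submitting the generator function directly: each future resolves
        # to an unstarted generator object.
        futures = [exe.submit(fake_fetch_links, p) for p in pages]
        for future in cf.as_completed(futures):
            # The generator body runs here, in the thread that iterates it.
            lazy.extend(future.result())

        # Submitting the wrapper: the generator is consumed in the worker,
        # and the future resolves to a list of results.
        futures = [exe.submit(collect, fake_fetch_links, p) for p in pages]
        for future in cf.as_completed(futures):
            eager.extend(future.result())
    return lazy, eager


if __name__ == "__main__":
    lazy, eager = run([2, 3, 4])
    print("threads used when iterating the future's generator:",
          {name for _, name in lazy})
    print("threads used with the collect wrapper:",
          {name for _, name in eager})
```

If the goal is for the requests to actually run concurrently, submitting something like collect instead of the generator function directly may be the simpler route; the main block then iterates over a list instead of a generator.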
