How to access the elements in as_completed(futures) chronology after the threads have executed?

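I have a function that takes a subset of a Pandas DataFrame as input. I am using concurrent.futures and requests_futures to request each URL in one of the DataFrame's columns asynchronously/concurrently. It works fine, but after the concurrent requests have run, I can't match the responses back to the corresponding rows of the DataFrame.

The code is as follows: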

import threading
from concurrent.futures import as_completed
from requests_futures.sessions import FuturesSession

def img_helper_function(df_chunk):

    # multithreading lock
    lock = threading.Lock() # Actually not used

    with FuturesSession(max_workers=10) as session:
        futures = [session.get(j, stream=True) for j in df_chunk["billedurl"].tolist()]
        filenames = [filename for filename in df_chunk["image_filename"].tolist()] # New, slugify filenames

        for idx, future in enumerate(as_completed(futures)):
            response = future.result()

            # These two elements don't match
            response.url
            filenames[idx]

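You can do this by storing the futures in a dictionary instead of a list, with each future as the key and its arguments as the value. as_completed yields futures in the order they finish, not the order they were submitted, so the enumerate index does not correspond to the original row. With a dictionary, you can use each completed future to look up its corresponding arguments. In code, that looks like:
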
futures = {session.get(j, stream=True):j for j in df_chunk["billedurl"].tolist()}

for idx, future in enumerate(as_completed(futures)):
    col = futures[future]
...
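
The snippet above maps each future to its URL. If the filename is what you need, the same lookup can carry it instead. Below is a minimal, self-contained sketch of that idea. It assumes a plain ThreadPoolExecutor and a stand-in fake_fetch function in place of real HTTP requests; fake_fetch, fetch_all, the URLs, the filenames, and the delays are all hypothetical and only there to force out-of-order completion.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_fetch(url, delay):
    # Hypothetical stand-in for session.get(); the sleep makes
    # the tasks finish in a different order than they were submitted.
    time.sleep(delay)
    return f"response for {url}"


# Hypothetical rows: (url, filename, artificial delay in seconds)
rows = [
    ("http://example.com/a", "a.jpg", 0.3),
    ("http://example.com/b", "b.jpg", 0.1),
    ("http://example.com/c", "c.jpg", 0.2),
]


def fetch_all(rows):
    results = {}
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Map each future to the filename of the row that produced it.
        futures = {
            executor.submit(fake_fetch, url, delay): filename
            for url, filename, delay in rows
        }
        for future in as_completed(futures):
            # Look up by the future itself, not by completion index.
            filename = futures[future]
            results[filename] = future.result()
    return results


results = fetch_all(rows)
```

With FuturesSession, the same pattern would presumably be a dictionary built from zip(df_chunk["billedurl"], df_chunk["image_filename"]), using session.get(url, stream=True) as the key and the filename as the value.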