Multiprocess sub-function does not return any results

I am trying to use the concurrency features provided by the deco module. The code works without multithreading, as shown in the answer here:

Extract specific columns from a given webpage

But the code below does not return any elements in finallist (it is empty). The slow function does produce some results, as the print statement shows. So why is the outer list empty?

import urllib.request
from bs4 import BeautifulSoup
from deco import concurrent, synchronized

finallist=list()
urllist=list()
    
@concurrent
def slow(url):
    #print (url)
    try:
        page = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(page)
        mylist=list()
        for anchor in soup.find_all('div', {'class':'col-xs-8'})[:9]: 
            mylist.append(anchor.text)
            urllist.append(url)
        finallist.append(mylist)
        #print (mylist)
        print (finallist)
    except:
        pass

@synchronized
def run():
    finallist=list()
    urllist=list()
    for i in range(10):
        url='https://pythonexpress.in/workshop/'+str(i).zfill(3)
        print (url)
        slow(url)
    slow.wait()

I refactored your code to work with the module. I fixed two of the common pitfalls outlined on the deco wiki:

  1. Do not use global variables
  2. Do everything with square-bracket operations: obj[key] = value
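
The first pitfall can be illustrated without deco. The sketch below uses the standard multiprocessing module directly, on the assumption that deco's @concurrent runs each call in a separate worker process. The names results, worker, and demo are hypothetical, and the "fork" start method (POSIX only) is forced just to keep the example short.

```python
import multiprocessing as mp

results = []  # module-level list, playing the role of finallist in the question


def worker(item, queue):
    # Runs in a child process: this append changes the child's copy only.
    results.append(item)
    # Sending the value back explicitly is what actually reaches the parent.
    queue.put(item * 2)


def demo():
    # Assumption: "fork" is available (POSIX); other start methods behave
    # the same way with respect to globals but need more boilerplate.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(i, queue)) for i in range(3)]
    for p in procs:
        p.start()
    # Drain the queue before joining so the children can exit cleanly.
    returned = sorted(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return list(results), returned


if __name__ == "__main__":
    parent_view, returned = demo()
    print(parent_view)  # [] : the parent's list was never touched
    print(returned)     # [0, 2, 4]
```

The parent's results list stays empty because each child process appends to its own copy; only the values sent back through the queue reach the parent. That is the same reason finallist was empty in the question.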

Here is the refactored version:

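import urllib.request
from bs4 import BeautifulSoup
from deco import concurrent, synchronized

N = 10

@concurrent
def slow(url):
    try:
        # urllib.request.urlopen is the Python 3 call, matching the question's code
        page = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(page, "html.parser")
        mylist = list()
        for anchor in soup.find_all('div', {'class':'col-xs-8'})[:9]:
            mylist.append(anchor.text)
        # Return the data instead of appending to a global list
        return mylist
    except Exception:
        # On any failure the function returns None for this URL
        pass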

@synchronized
def run():
    finallist=[None] * N
    urllist = ['https://pythonexpress.in/workshop/'+str(i).zfill(3) for i in range(N)]
    for i, url in enumerate(urllist):
        print (url)
        finallist[i] = slow(url)
    return finallist

if __name__ == "__main__":
    finallist = run()
    print(finallist)
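
For comparison, the same pattern of a pre-sized list filled by index from return values can be sketched with the standard library alone. This is not deco: it swaps in concurrent.futures.ThreadPoolExecutor (threads rather than processes), and fetch_stub is a hypothetical stand-in for slow() that makes no network request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_stub(url):
    # Hypothetical stand-in for slow(): no network access, it just returns
    # the last path segment of the URL wrapped in a list.
    return [url.rsplit("/", 1)[-1]]


def run(n=3):
    urls = ["https://pythonexpress.in/workshop/" + str(i).zfill(3) for i in range(n)]
    results = [None] * n  # pre-sized, so each slot can be assigned by index
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Remember which index each submitted call belongs to.
        futures = {pool.submit(fetch_stub, url): i for i, url in enumerate(urls)}
        for future, i in futures.items():
            results[i] = future.result()  # value comes from the return, not a global
    return results


if __name__ == "__main__":
    print(run())  # [['000'], ['001'], ['002']]
```

As in the deco version, every worker hands its data back through its return value, and the caller decides where in the list it goes.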