How can I implement multiprocessing in this web scraping code? Should I use multithreading instead?
What I want to achieve is to reduce the time the scraping process takes and to store all the data in a dictionary (the dictionary is Untiters; the key is the username and the value is the number of times that user created a post with a specific title). I used this website as a tutorial, but I don't know how to apply what it explains to my code. Here is the code; sorry if I included larger parts of it than necessary.
from multiprocessing import Pool
import requests
from bs4 import BeautifulSoup

z = 0
Untitleds = ["Sin título", "Untitled", "Sans titre", "İsimsiz", "Ohne Titel", "بلا عنوان",
             "Без названия", "无标题", "タイトルなし"]
Untiters = {}
Untits = []
x = 138
for i in range(1, 20):
    y = x + 1
    x = y
    Id = y
    link = "https://folioscope.co/blank/" + str(Id)
    Url = (link)
    R = requests.get(Url)
    Soup = BeautifulSoup(R.text, "html5lib")
    Pretitle = (Soup.find("div", {"class": "container_padding"}))
    Title = Pretitle.div.text
    if Title in (Untitleds):
        Prename = Soup.find("div", {"class": "padding_bottom_normal"})
        Name = Prename.a.text
        Untitled = z + 1
        z = Untitled
        if Name not in Untiters:
            Untiters.update({Name: 1})
        else:
            c0 = Untiters[Name]
            c1 = c0 + 1
            Untiters[Name] = c1
        Untits.append(Title)
        print(Title, Name)
To fetch the data from the site with multiprocessing.Pool, you can use the following example:
from multiprocessing import Pool
import requests
from bs4 import BeautifulSoup


def get_data(id_):
    url = "https://folioscope.co/blank/" + str(id_)
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    # either element may be missing, so fall back to an empty string
    title = soup.select_one("#animation_container .title") or ""
    if title:
        title = title.text

    username = soup.select_one(".username") or ""
    if username:
        username = username.text

    return id_, title, username


if __name__ == "__main__":
    with Pool() as pool:
        # imap_unordered yields each result as soon as a worker finishes it
        for id_, title, username in pool.imap_unordered(
            get_data, range(138, 158)
        ):
            if title and username:
                print("{:<4} {:<40} {}".format(id_, title, username))
                # here you can add the result to list, filter duplicates etc.
Prints:
153 First attempt CyberAly
149 Minecraft Loop MisterD
142 An Idea! Pyro
148 Untitled szymun
152 Thunder dpknyk1993
139 Untitled WoopDeDoo
146 Untitled szymun
144 Loop pjrd
138 Blink fairyfina
140 Test sknob
154 Dragon Ball kameha piedicmolkok
157 Boom animation33
156 Tree in wind CyberAly
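As for whether to use multithreading instead: this workload is network-bound, not CPU-bound, so a thread pool (for example multiprocessing.dummy.Pool, which exposes the same Pool API, or concurrent.futures.ThreadPoolExecutor) would speed things up just as well as separate processes while avoiding the pickling overhead.

To end up with the Untiters dictionary described in the question (key = username, value = number of "Untitled" posts by that user), the results can be aggregated in the main process. Below is a minimal sketch, not part of the original answer, built on the same selectors as the example above and using collections.Counter for the counting step:

from collections import Counter
from multiprocessing import Pool
import requests
from bs4 import BeautifulSoup

# the list of "Untitled" spellings from the question
Untitleds = ["Sin título", "Untitled", "Sans titre", "İsimsiz", "Ohne Titel",
             "بلا عنوان", "Без названия", "无标题", "タイトルなし"]


def get_data(id_):
    url = "https://folioscope.co/blank/" + str(id_)
    soup = BeautifulSoup(requests.get(url).content, "html.parser")
    title = soup.select_one("#animation_container .title")
    username = soup.select_one(".username")
    # return plain strings so the result can be sent back to the main process cheaply
    return (title.text if title else "", username.text if username else "")


if __name__ == "__main__":
    Untiters = Counter()  # username -> number of "Untitled" posts
    with Pool() as pool:
        for title, username in pool.imap_unordered(get_data, range(138, 158)):
            if username and title in Untitleds:
                Untiters[username] += 1
    print(dict(Untiters))

A Counter behaves like a normal dict, so Untiters[username] += 1 works even the first time a username is seen. Switching this sketch to threads only requires changing the import to "from multiprocessing.dummy import Pool"; the rest of the code stays the same.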