OOP

Question

I keep running into a problem when designing my software.

My software consists of a few classes: Bot, Website, and Scraper.

Bot is the most abstract, executive class. It manages the program at a high level.

Website is a class that holds the data scraped from that particular website.

Scraper is a class that may have multiple instances per Website. Each instance is responsible for a different part of a single website.

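Scraper has a function scrape_data() that returns the JSON data associated with the Website. I want to pass this data up to the Website somehow, but I can't find a way to do it, since Scraper sits at a lower level of abstraction. These are the ideas I've tried: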

# In this idea, Website would have to poll scraper. Scraper is already polling Server, so this seems messy and inefficient
class Website:
    def __init__(self):
        self.scrapers = list()
        self.data = dict()

    def add_scraper(self, scraper):
        self.scrapers.append(scraper)
   
    def add_data(self, type, json):
        self.data[type] = json

    ...

# The problem here is scraper has no awareness of the dict of websites. It cannot pass the data returned by Scraper into the respective Website
class Bot:
    def __init__(self):
        self.scrapers = list()
        self.websites = dict()

How can I solve my problem? What more fundamental rules or design patterns apply to this problem, so that I can use them in the future?

Answer 1

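One way to approach this, taking inspiration from node structures, is to give the Scraper class an attribute that directly references its respective Website. This assumes I've understood correctly that you are describing a one-to-many relationship (one Website can have multiple Scrapers). Then, when a Scraper needs to pass its data up to its Website, it can go through that attribute directly: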

class Website:
    def __init__(self):
        # This list is optional: each scraper references its master website,
        # not the other way around.
        self.scrapers = list()
        # How the data is stored is up to you; it could be a list, a dict, etc.
        self.data = dict()

    def add_scraper(self, scraper):
        self.scrapers.append(scraper)

    def add_data(self, data_type, json_data):
        self.data[data_type] = json_data

class Scraper:
    def __init__(self, master_website, data_type):
        # Respective code
        self.master = master_website  # direct reference to a Website object
        self.data_type = data_type    # assumed: key for the part of the site this scraper covers
    ...
    def scrape_data(self):
        json_data = ...  # placeholder: the scraped data in JSON format
        self.master.add_data(self.data_type, json_data)

I'm not sure how efficient this would be, though, or whether you need to know at all times which scrapers are linked to which website.
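To make the back-reference idea concrete, here is a minimal runnable sketch. The data_type key, the payload argument, and the sample data are all assumptions made for illustration; a real Scraper would fetch its data from the site rather than receive it in the constructor.

```python
class Website:
    def __init__(self):
        self.data = {}

    def add_data(self, data_type, json_data):
        # Store each scraper's result under its own key.
        self.data[data_type] = json_data


class Scraper:
    def __init__(self, master_website, data_type, payload):
        self.master = master_website  # back-reference to the owning Website
        self.data_type = data_type    # hypothetical key for this scraper's part
        self._payload = payload       # stand-in for real scraping

    def scrape_data(self):
        json_data = self._payload
        # Push the result "up" through the back-reference.
        self.master.add_data(self.data_type, json_data)
        return json_data


site = Website()
scrapers = [
    Scraper(site, "prices", {"widget": 9.99}),
    Scraper(site, "reviews", [{"stars": 5}]),
]
for scraper in scrapers:
    scraper.scrape_data()

print(site.data)
```

The Website never polls anything here: each Scraper delivers its own result as soon as it has one.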

Answer 2

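As soon as you start talking about many-to-many parent/child relationships, you should be thinking about compositional patterns rather than traditional inheritance. Specifically, the Decorator pattern. Your add_scraper method is a hint that you are effectively looking to build a stack of handlers.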

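The classic example of this pattern is a set of classes responsible for producing the price of a coffee. You start with the base component, "coffee", and have one class per ingredient, each with its own price modifier: one class for whole milk, one for skim, one for sugar, one for hazelnut syrup, one for chocolate, and so on. All of the ingredients, as well as the base component, share an interface that guarantees a 'getPrice' method. When the user places an order, the base component is injected into the first ingredient/wrapper class. The wrapped object is injected into the next ingredient wrapper, and so on, until getPrice is finally called. Each implementation of getPrice should be written to pull first from the previously injected instance, so that the calculation reaches through all the layers.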

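The benefits are that new ingredients can be added without affecting the existing menu, the prices of existing ingredients can be changed in isolation, and ingredients can be added to multiple types of drink.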
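The coffee example might look roughly like this in Python. It is only a sketch: the class names, the get_price spelling of 'getPrice', and the prices (in cents) are made up for illustration.

```python
from abc import ABC, abstractmethod


class Beverage(ABC):
    """Shared interface: everything in the stack can report a price."""

    @abstractmethod
    def get_price(self) -> int:
        ...


class Coffee(Beverage):
    """Base component."""

    def get_price(self) -> int:
        return 200  # assumed base price, in cents


class Ingredient(Beverage):
    """Wrapper: holds the previously built drink and adds a surcharge."""

    surcharge = 0

    def __init__(self, inner: Beverage):
        self.inner = inner

    def get_price(self) -> int:
        # Pull from the wrapped instance first, then add this layer's part.
        return self.inner.get_price() + self.surcharge


class WholeMilk(Ingredient):
    surcharge = 50


class HazelnutSyrup(Ingredient):
    surcharge = 75


order = HazelnutSyrup(WholeMilk(Coffee()))
print(order.get_price())  # 325
```

Adding a new ingredient means adding one small subclass. Neither Coffee nor the existing ingredients need to change.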

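In your case, the data structure being decorated is the Website object. The ingredient classes would be your Scrapers, and the getPrice method would be scrape_data. The scrape_data method should expect to receive an instance of Website as an argument and return it after hydrating it. No Scraper needs to know how the other Scrapers work or which Scrapers are in use. All it needs to know is that a previous one exists and adheres to an interface guaranteeing that it, too, has a scrape_data method. All of them ultimately manipulate the same Website object, so what gets returned to your Bot has been hydrated by all of them.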

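This puts the responsibility for knowing which Scrapers to apply to which Website on the Bot class, which is now essentially a service class. Since it lives in the upper abstraction layer, it has the high-level perspective needed to know this.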
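Applied to the classes from the question, the idea might be sketched as follows. The fetch hook, the data_type keys, the PriceScraper and ReviewScraper subclasses, and Bot.run are hypothetical names invented for this example, and the returned data is hard-coded in place of real scraping.

```python
class Website:
    def __init__(self, name):
        self.name = name
        self.data = {}


class Scraper:
    """Wrapper: lets the previous scraper hydrate the Website, then adds its own part."""

    data_type = None  # hypothetical key, set by each subclass

    def __init__(self, previous=None):
        self.previous = previous

    def fetch(self, website):
        raise NotImplementedError

    def scrape_data(self, website):
        if self.previous is not None:
            website = self.previous.scrape_data(website)
        website.data[self.data_type] = self.fetch(website)
        return website


class PriceScraper(Scraper):
    data_type = "prices"

    def fetch(self, website):
        return {"widget": 9.99}  # stand-in for real scraping


class ReviewScraper(Scraper):
    data_type = "reviews"

    def fetch(self, website):
        return [{"stars": 5}]  # stand-in for real scraping


class Bot:
    """Service class: decides which Scrapers apply to which Website."""

    def __init__(self):
        self.websites = {}

    def run(self, name, scraper_classes):
        chain = None
        for cls in scraper_classes:
            chain = cls(previous=chain)  # wrap the stack built so far
        website = Website(name)
        if chain is not None:
            website = chain.scrape_data(website)
        self.websites[name] = website
        return website


bot = Bot()
site = bot.run("example", [PriceScraper, ReviewScraper])
print(site.data)
```

Only Bot knows the full list of Scrapers. Each Scraper sees just the Website it is handed and, at most, the one Scraper it wraps.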

OOP - How to pass data "up" in abstraction?

design-patterns

software-design