OOP python: Where to instantiate Cassandra and elasticsearch cluster?

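I have an object that interacts a lot with elasticsearch and cassandra, but I don't know where to instantiate my Cassandra and elasticsearch sessions. Should I create them in my "code" and then pass the sessions to my functions as arguments, like this: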

from cassandra.cluster import Cluster
from elasticsearch import Elasticsearch

cassandra_cluster = Cluster()
session = cassandra_cluster.connect()
es = Elasticsearch()

class Article:

    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler and fill in the fields
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and fill in the fields
        return obj

    def save_to_cassandra(self, session):
        # Save an object into cassandra
        session.execute(.....)

    def save_to_elasticsearch(self, index_name, es):
        # Save an object into elasticsearch
        es.index(index=index_name, ...)

    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra(session)
article.save_to_elasticsearch("cnn", es)

Or should I instantiate my cassandra and elasticsearch sessions as instance variables, like this:

from cassandra.cluster import Cluster
from elasticsearch import Elasticsearch

class Article:

    cassandra_cluster = Cluster()
    session = cassandra_cluster.connect()
    es = Elasticsearch()
    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler and fill in the fields
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and fill in the fields
        return obj

    def save_to_cassandra(self):
        # Save an object into cassandra
        # session is a class attribute, so it must be reached through self (or Article)
        self.session.execute(.....)

    def save_to_elasticsearch(self):
        # Save an object into elasticsearch
        # es is a class attribute, so it must be reached through self (or Article)
        self.es.index(....)

    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra()
article.save_to_elasticsearch()

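Based on their documentation and some of the examples here: http://www.datastax.com/dev/blog/datastax-python-driver-multiprocessing-example-for-improved-bulk-data-throughput

I would go with your second approach. They mention that the session is simply a context manager for closing connections, and their query manager exposes the cluster and session as class attributes.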

I think either would work, but if you ever want to use multiprocessing, the latter approach will probably be slightly easier.
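
To illustrate why class attributes fit well with multiprocessing, here is a minimal sketch, assuming you want one connection per worker process. `FakeSession`, `Article.setup`, `save_article` and `save_all` are hypothetical names made up for this example, and `FakeSession` is only a stand-in so the snippet runs without a database. In real code, `setup` would presumably assign `Cluster().connect()` and `Elasticsearch()` instead.

```python
import os
from multiprocessing import Pool


class FakeSession:
    """Hypothetical stand-in for a Cassandra session or an Elasticsearch client."""

    def __init__(self):
        # Remember which process opened this "connection".
        self.owner_pid = os.getpid()
        self.saved = []

    def execute(self, statement):
        # Record the statement and report which process handled it.
        self.saved.append(statement)
        return self.owner_pid


class Article:
    document_type = "cnn_article"
    session = None  # class attribute: shared by every Article in one process

    def __init__(self, url=""):
        self.url = url

    @classmethod
    def setup(cls):
        # Assumption: called once per process, so each worker gets its own session.
        cls.session = FakeSession()

    def save_to_cassandra(self):
        # self.session resolves to the class attribute set by setup().
        return self.session.execute(("save", self.url))


def save_article(url):
    # Module-level function so the pool can send it to worker processes.
    return Article(url).save_to_cassandra()


def save_all(urls, processes=2):
    # The initializer runs Article.setup() once in every worker process.
    with Pool(processes=processes, initializer=Article.setup) as pool:
        return pool.map(save_article, urls)
```

Calling `save_all([...])` from under an `if __name__ == "__main__":` guard returns the process id that handled each URL. With the first approach you would instead have to create the session inside each worker and pass it along yourself, since an open connection generally cannot be handed from the parent process to the workers.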