如何提高sqlalchemy中频繁查询a table的性能？

Question

我有一个 table 继承自另一个并且有超过 10,000 行和 10 列。我正在从输入中读取 10,000 多个项目。对于每个项目，我查询 table 以查看数据是否不存在。在本例中，我将其插入 table。一个简化的例子如下：

class Port(Interface):
    __tablename__ = 'ports'
    Int_Class_ID = Column(Integer, ForeignKey('interfaceitems.Int_Class_ID', ondelete='cascade'), primary_key=True)
    __mapper_args__ = {
        'polymorphic_identity': 'ports',
    }
    Name = Column(Text)
    Locked = Column(Boolean)
    UniqueIdentifier = Column(Text, index=True)

然后我做：

portList = Session.query(Port).filter(Port.UniqueIdentifier == ID).all()

例如将被调用 10,000 次。但这很慢，需要3分钟，这对于代码的应用和执行的频率来说并不令人满意。有什么办法可以提高性能吗？

Answer 1

您似乎正在读取 ID，然后分别为每个 ID 调用 Query。这将非常慢，并且您想一次阅读更大的项目列表。我通常这样做的方式是使用块；这样你肯定每个查询都有很多项目，但你不会在一个查询中加载太多 ids:

list_of_ids = list(input_ids)
while list_of_ids:
    chunk = list_of_ids[0:1000]
    list_of_ids = list_of_ids[1000:]
    for port_object in session.query(Port).filter(Port.uniqueidentifier.in_(chunk)):
        process_object(port_object)

使用上面的方法，不是针对每一行发出 10000 个查询，而是每个查询针对 1000 行只发出十个查询。它应该运行很快。

如果问题出在其他地方，比如读取所有 10K 行太慢，并且您无法解决发出许多行的查询，orm slowness will help. The upcoming baked query 功能中的提示也将有助于生成来自查询的 SQL 字符串，尽管这似乎不是这里的问题。

如何提高sqlalchemy中频繁查询a table的性能？

How to improve the performance of frequent querying of a table in sqlalchemy?

performance

sqlalchemy