Why is SQLAlchemy ORM's many-to-many JOIN this inefficient for me?
I am running the following query in one of my classes, which takes the ORM classes as parameters so that it can work with several similar tables.
(
    self.db.query(self.orm_contact_class)
    .options(
        load_only(
            self.orm_contact_class.id,
            self.orm_contact_class.name,
            self.orm_contact_class.email_attempts_dict,
        ),
        joinedload(
            self.orm_contact_class.__dict__[self.access_from_contact_to_company]
        ).load_only(self.orm_company_class.domain)
    )
    .where(
        self.orm_contact_class.email == None,
        self.orm_contact_class.name != None,
        self.orm_company_class.domain != None,
        catch_all_conditions
    )
)
This produces the following horrible query:
SELECT test.crunchbase_people.id AS test_crunchbase_people_id,
       test.crunchbase_people.name AS test_crunchbase_people_name,
       test.crunchbase_people.email_attempts_dict AS test_crunchbase_people_email_attempts_dict,
       crunchbase_companies_1.id AS crunchbase_companies_1_id,
       crunchbase_companies_1.domain AS crunchbase_companies_1_domain
FROM test.crunchbase_companies,
     test.crunchbase_people
     LEFT OUTER JOIN (
         test.crunchbase_people_crunchbase_companies AS crunchbase_people_crunchbase_companies_1
         JOIN test.crunchbase_companies AS crunchbase_companies_1
         ON crunchbase_companies_1.id = crunchbase_people_crunchbase_companies_1.crunchbase_companies_id)
     ON test.crunchbase_people.id = crunchbase_people_crunchbase_companies_1.crunchbase_people_id
WHERE test.crunchbase_people.email IS NULL
  AND test.crunchbase_people.name IS NOT NULL
  AND test.crunchbase_companies.domain IS NOT NULL
  AND (test.crunchbase_companies.is_domain_catch_all = false
       OR test.crunchbase_companies.is_domain_catch_all IS NULL)
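It never finishes, and if I run it in the Postgres console it just returns duplicates of exactly the same rows, over and over!
So the objects never get mapped, because the query goes on forever. There is a simple way to do this without the ORM, where the query runs in 0.5 seconds (and without duplicates like the ones above), but then my objects are not mapped, which would force me to refactor a lot of code.
Does anyone know what might be wrong with a query like this?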
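As a commenter on the original post suggested, I was missing an explicit join. joinedload does not replace it: it only tells the ORM which related objects to eager-load, and it does so through its own aliased join (crunchbase_companies_1 in the SQL above), which the where() conditions do not reference. Because the conditions on the company columns had nothing joining them to the contact table, the unaliased test.crunchbase_companies table was added to FROM as a cartesian product, which explains the repeated rows.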
(
    self.db.query(self.orm_contact_class)
    .join(self.orm_contact_class.__dict__[self.access_from_contact_to_company])  # added this line
    .options(
        load_only(
            self.orm_contact_class.id,
            self.orm_contact_class.name,
            self.orm_contact_class.email_attempts_dict,
        ),
        joinedload(
            self.orm_contact_class.__dict__[self.access_from_contact_to_company]
        ).load_only(self.orm_company_class.domain)
    )
    .where(
        self.orm_contact_class.email == None,
        self.orm_contact_class.name != None,
        self.orm_company_class.domain != None,
        catch_all_conditions
    )
)
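To illustrate why the missing join produced repeated rows, here is a minimal sketch using Python's built-in sqlite3 module rather than SQLAlchemy or Postgres. The tables people, companies, and people_companies and their sample rows are hypothetical stand-ins for the real schema, and the two queries only mimic the shape of the SQL before and after adding the explicit join.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
        CREATE TABLE companies (id INTEGER PRIMARY KEY, domain TEXT);
        CREATE TABLE people_companies (people_id INTEGER, companies_id INTEGER);
        INSERT INTO people VALUES (1, 'Ann', NULL), (2, 'Bob', NULL);
        INSERT INTO companies VALUES (1, 'a.com'), (2, 'b.com'), (3, 'c.com');
        INSERT INTO people_companies VALUES (1, 1), (2, 2);
        """
    )
    return conn


def cartesian_rows(conn):
    """Shape of the slow query: the filtered companies table (c) sits in FROM
    without any join condition, so every person is paired with every company."""
    return conn.execute(
        """
        SELECT p.id, c1.domain
        FROM companies AS c, people AS p
        LEFT OUTER JOIN people_companies AS pc ON p.id = pc.people_id
        LEFT OUTER JOIN companies AS c1 ON c1.id = pc.companies_id
        WHERE p.email IS NULL AND c.domain IS NOT NULL
        ORDER BY p.id
        """
    ).fetchall()


def joined_rows(conn):
    """Shape of the fixed query: the filtered companies table is reached
    through an explicit join on the association table."""
    return conn.execute(
        """
        SELECT p.id, c.domain
        FROM people AS p
        JOIN people_companies AS pc ON p.id = pc.people_id
        JOIN companies AS c ON c.id = pc.companies_id
        WHERE p.email IS NULL AND c.domain IS NOT NULL
        ORDER BY p.id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(len(cartesian_rows(conn)))  # 6: each of 2 people repeated once per company
    print(joined_rows(conn))          # [(1, 'a.com'), (2, 'b.com')]
```

With 2 people and 3 companies the first query returns 6 rows and the second returns 2. On real tables the product grows with the row counts of both tables, which is consistent with the query appearing never to finish. Note that combining .join() with joinedload() joins the company table twice, once for filtering and once for loading; SQLAlchemy's contains_eager() option is designed to populate the relationship from the explicit join instead, though whether it fits here depends on the rest of the code.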