Calculating Combinations using Django ORM (CROSS JOIN)
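I have three related models: Process, Factor, and Level. A Process has a many-to-many relationship with Factors, and a Factor has one or more Levels. I'm trying to calculate all combinations of the Levels involved with a Process. This is straightforward to implement as a model method using Python's itertools, but execution is somewhat slow, so I'm trying to figure out how to use the Django ORM to perform this calculation in SQL.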
Models:
    from django.db import models

    class Process(models.Model):
        # "Factor" is referenced by name because it is defined below
        factors = models.ManyToManyField("Factor", blank=True)

    class Factor(models.Model):
        ...

    class Level(models.Model):
        factor = models.ForeignKey(Factor, on_delete=models.CASCADE)
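Example: a Process 'Running' involves three Factors ('Distance', 'Climb', 'Surface'), each of which is made up of a number of Levels ('Long'/'Short', 'Flat'/'Hilly', 'Road'/'Mixed'/'Trail'). Calculating the combinations in SQL would involve building the query by first determining the number of Factors involved (3 in this example) and then performing a CROSS JOIN of all the levels that many times.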
In SQL, this could be done like so:
WITH foo AS
(SELECT * FROM Level
WHERE Level.factor_id IN
(SELECT ProcessFactors.factor_id FROM ProcessFactors WHERE process_id = 1)
)
SELECT a1.*, a2.*, a3.*
FROM foo a1
CROSS JOIN foo a2
CROSS JOIN foo a3
WHERE (a1.factor_id < a2.factor_id) AND (a2.factor_id < a3.factor_id)
Result:
a1.name | a2.name | a3.name
--------------------------
Long | Flat | Road
Long | Flat | Mixed
Long | Flat | Trail
Long | Hilly | Road
Long | Hilly | Mixed
Long | Hilly | Trail
Short | Flat | Road
Short | Flat | Mixed
Short | Flat | Trail
Short | Hilly | Road
Short | Hilly | Mixed
Short | Hilly | Trail
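The query above hard-codes three aliases, so generalizing it means generating one alias per Factor. The following is only a sketch of that idea, run against an in-memory SQLite database so it can be tried without Django. The table and column names (`Level`, `ProcessFactors`, `name`) simply follow the SQL in the question and are assumptions; the tables Django actually creates will be named differently. The helper `build_combination_sql` is hypothetical.

```python
import sqlite3


def build_combination_sql(n_factors):
    """Build SQL that cross-joins the process's levels with itself n_factors times."""
    if n_factors < 1:
        raise ValueError("need at least one factor")
    # Aliases are generated here, never taken from user input.
    aliases = [f"a{i}" for i in range(1, n_factors + 1)]
    select = ", ".join(f"{a}.name" for a in aliases)
    joins = " CROSS JOIN ".join(f"foo {a}" for a in aliases)
    # Strictly increasing factor ids keep one level per factor in each row.
    conditions = " AND ".join(
        f"{left}.factor_id < {right}.factor_id"
        for left, right in zip(aliases, aliases[1:])
    )
    where = f" WHERE {conditions}" if conditions else ""
    return (
        "WITH foo AS ("
        "SELECT * FROM Level WHERE factor_id IN "
        "(SELECT factor_id FROM ProcessFactors WHERE process_id = ?)"
        f") SELECT {select} FROM {joins}{where}"
    )


def demo():
    """Load the 'Running' example into SQLite and return its level combinations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Level (id INTEGER PRIMARY KEY, name TEXT, factor_id INTEGER);
        CREATE TABLE ProcessFactors (process_id INTEGER, factor_id INTEGER);
        INSERT INTO ProcessFactors VALUES (1, 1), (1, 2), (1, 3);
        INSERT INTO Level (name, factor_id) VALUES
            ('Long', 1), ('Short', 1),
            ('Flat', 2), ('Hilly', 2),
            ('Road', 3), ('Mixed', 3), ('Trail', 3);
    """)
    # First determine how many factors the process has, then build the query.
    n_factors = conn.execute(
        "SELECT COUNT(*) FROM ProcessFactors WHERE process_id = ?", (1,)
    ).fetchone()[0]
    rows = conn.execute(build_combination_sql(n_factors), (1,)).fetchall()
    conn.close()
    return rows
```

Calling `demo()` returns one tuple of level names per combination. The process id is passed as a bound parameter, while only the generated alias names are interpolated into the SQL text.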
Currently, I have this implemented as a method on the Process model:
    import itertools

    def level_combinations(self):
        # One queryset per factor; each is evaluated as a separate query
        levels_per_factor = []
        for factor in self.factors.all():
            levels_per_factor.append(Level.objects.filter(factor=factor))
        combinations = []
        # Renamed loop variable so it no longer shadows the list being unpacked
        for levels in itertools.product(*levels_per_factor):
            combinations.append({"levels": levels})
        return combinations
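Is this possible using the Django ORM, or is it complex enough that it should be implemented as a raw query to improve on the speed of the Python implementation?
There was a similar question about performing a CROSS JOIN in the Django ORM a few years ago (around Django v1.3, it appears), but it didn't attract much attention (the author resorted to just using Python itertools).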
If I understand correctly, you could try:
    for process in Process.objects.all():
        # get all levels for the current process
        levels = Level.objects.filter(factor__in=process.factors.all())
    from itertools import groupby, product

    def level_combinations(self):
        # We need to order by factor_id for proper grouping
        levels = Level.objects.filter(factor__process=self).order_by('factor_id')
        # [{'name': 'Long', 'factor_id': 1, ...},
        #  {'name': 'Short', 'factor_id': 1, ...},
        #  {'name': 'Flat', 'factor_id': 2, ...},
        #  {'name': 'Hilly', 'factor_id': 2, ...}]
        groups = [list(group) for _, group in groupby(levels, lambda l: l.factor_id)]
        # [[{'name': 'Long', 'factor_id': 1, ...},
        #   {'name': 'Short', 'factor_id': 1, ...}],
        #  [{'name': 'Flat', 'factor_id': 2, ...},
        #   {'name': 'Hilly', 'factor_id': 2, ...}]]
        # Note: product returns an iterator, not a list
        return product(*groups)
If the order doesn't matter, then:
    def level_combinations(self):
        levels = Level.objects.filter(factor__process=self)
        groups = {}
        for level in levels:
            groups.setdefault(level.factor_id, []).append(level)
        return product(*groups.values())
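The grouping step in both variants can be tried without a database. The sketch below uses a hypothetical `FakeLevel` namedtuple as a stand-in for the Level model (it is not part of the question's code) and shows why the `groupby` variant needs its input ordered by `factor_id`, while the dictionary variant does not.

```python
from collections import namedtuple
from itertools import groupby, product

# Hypothetical stand-in for Level instances; only the two fields used here.
FakeLevel = namedtuple("FakeLevel", ["name", "factor_id"])

# Deliberately not ordered by factor_id.
levels = [
    FakeLevel("Long", 1), FakeLevel("Flat", 2),
    FakeLevel("Short", 1), FakeLevel("Hilly", 2),
    FakeLevel("Road", 3), FakeLevel("Mixed", 3), FakeLevel("Trail", 3),
]


def group_with_groupby(items, sort_first=True):
    """Group by factor_id with itertools.groupby, optionally sorting first."""
    if sort_first:
        items = sorted(items, key=lambda l: l.factor_id)
    # groupby only merges *adjacent* items that share a key.
    return [list(g) for _, g in groupby(items, key=lambda l: l.factor_id)]


def group_with_dict(items):
    """Group by factor_id with a dict; input order does not matter."""
    groups = {}
    for level in items:
        groups.setdefault(level.factor_id, []).append(level)
    return list(groups.values())


combos_sorted = list(product(*group_with_groupby(levels)))
combos_dict = list(product(*group_with_dict(levels)))
# Without sorting, groupby splits each factor into several fragments.
fragments = group_with_groupby(levels, sort_first=False)
```

With sorted input, both approaches yield the same set of combinations. Skipping the sort leaves `groupby` with more groups than there are factors, which is why the first variant calls `order_by('factor_id')`.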
A few years late: this workaround does not actually use a CROSS JOIN, but it does produce the desired result in a single query.
Step 1: add a cross field to your Factor model
    class Factor(models.Model):
        cross = models.ForeignKey(
            to='self', on_delete=models.CASCADE, null=True, blank=True)
        ...
Step 2: link 'Climb' to 'Surface', and link 'Distance' to 'Climb', using the new Factor.cross field
Step 3: query as follows
    Level.objects.filter(factor__name='Distance').values_list(
        'name', 'factor__cross__level__name', 'factor__cross__cross__level__name')
Result:
('Long', 'Flat', 'Road')
('Long', 'Flat', 'Mixed')
('Long', 'Flat', 'Trail')
('Long', 'Hilly', 'Road')
('Long', 'Hilly', 'Mixed')
('Long', 'Hilly', 'Trail')
('Short', 'Flat', 'Road')
('Short', 'Flat', 'Mixed')
('Short', 'Flat', 'Trail')
('Short', 'Hilly', 'Road')
('Short', 'Hilly', 'Mixed')
('Short', 'Hilly', 'Trail')
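To see why chaining the cross links multiplies the rows, it can help to write the joins out by hand. The sketch below is an approximation in plain SQLite, not the SQL Django actually emits. The table names `factor` and `level`, the column `cross_id`, and the function `chained_join_demo` are all assumptions for illustration.

```python
import sqlite3


def chained_join_demo():
    """Follow Factor.cross links with joins, starting from the 'Distance' levels."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE factor (
            id INTEGER PRIMARY KEY,
            name TEXT,
            cross_id INTEGER REFERENCES factor(id)
        );
        CREATE TABLE level (
            id INTEGER PRIMARY KEY,
            name TEXT,
            factor_id INTEGER REFERENCES factor(id)
        );
        -- Distance -> Climb -> Surface, as in step 2
        INSERT INTO factor VALUES (3, 'Surface', NULL), (2, 'Climb', 3), (1, 'Distance', 2);
        INSERT INTO level (name, factor_id) VALUES
            ('Long', 1), ('Short', 1),
            ('Flat', 2), ('Hilly', 2),
            ('Road', 3), ('Mixed', 3), ('Trail', 3);
    """)
    rows = conn.execute("""
        SELECT l1.name, l2.name, l3.name
        FROM level l1
        JOIN factor f1 ON f1.id = l1.factor_id
        LEFT JOIN factor f2 ON f2.id = f1.cross_id   -- factor__cross
        LEFT JOIN level l2 ON l2.factor_id = f2.id   -- factor__cross__level
        LEFT JOIN factor f3 ON f3.id = f2.cross_id   -- factor__cross__cross
        LEFT JOIN level l3 ON l3.factor_id = f3.id   -- factor__cross__cross__level
        WHERE f1.name = 'Distance'
    """).fetchall()
    conn.close()
    return rows
```

Each join to `level` fans out one row per level of the linked factor, so the row count is the product of the level counts along the chain, which matches the result of a cross join restricted to one level per factor.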
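This is a simplified example. To make it more general, instead of adding the Factor.cross field, you could add a new CrossedFactors model with two foreign keys to Factor. That model could then be used to define various experimental designs.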