How to sort data recursively, parents first and then their children, using Python?
I have a model named Topics with the following data:
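id | has_sub_topic | parent_id | subject_module_level_id | order |
---|---|---|---|---|
27 | 1 | NULL | 25 | 1 |
31 | 1 | NULL | 25 | 2 |
34 | 0 | NULL | 25 | 3 |
28 | 0 | 27 | 25 | 1 |
29 | 0 | 27 | 25 | 2 |
40 | 1 | 27 | 25 | 3 |
32 | 0 | 31 | 25 | 1 |
33 | 0 | 31 | 25 | 2 |
41 | 1 | 40 | 25 | 1 |
43 | 0 | 40 | 25 | 2 |
44 | 1 | 40 | 25 | 3 |
42 | 0 | 41 | 25 | 1 |
45 | 0 | 44 | 25 | 1 |
47 | 1 | 44 | 25 | 2 |
48 | 0 | 47 | 25 | 1 |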
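I want to sort by parent first and then proceed to the children depth-first, keeping only the topics that do not have has_sub_topic set. The data would then be ordered as in https://upload.wikimedia.org/wikipedia/commons/7/7f/Depth-First-Search.gif and I would only get nodes 4, 7, 8, 10.
Previously I tried a sort function, but it did not work once there were several levels of children, so I have to use a recursive function. My recursive code looks like this: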
# Example data for topics
import pandas as pd
topics = pd.DataFrame({
    'id': [27, 31, 34, 28, 29, 40, 32, 33, 41, 43, 44, 42, 45, 47, 48],
    'has_sub_topic': [1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0],
    'parent_id': [None, None, None, 27, 27, 27, 31, 31, 40, 40, 40, 41, 44, 44, 47],
    'subject_module_level_id': [25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25],
    'order': [1, 2, 3, 1, 2, 3, 1, 2, 1, 2, 3, 1, 1, 2, 1]
})

def topic_child_order(topic, list_topics=None):
    if list_topics is None: list_topics = []
    if topic.has_sub_topic:
        topics = Topics.objects.filter(parent=topic).order_by('order')
        for child in topics:
            result = topic_child_order(child, list_topics)
    else:
        result = topic
    list_topics.append(result)
    return list_topics

topics = Topics.objects.filter(
    subject_module_level_id=25,
    parent=None
).order_by('order')
topics_order = []
for topic in topics:
    topics_order.append(topic_child_order(topic))
The result looks like this:
[
    [
        <Topics: Topicsobject(28)>,
        <Topics: Topicsobject(29)>,
        <Topics: Topicsobject(42)>,
        [
            ...
        ],
        <Topics: Topicsobject(43)>,
        <Topics: Topicsobject(45)>,
        <Topics: Topicsobject(48)>,
        [
            ...
        ],
        [
            ...
        ],
        [
            ...
        ],
        [
            ...
        ]
    ],
    [
        <Topics: Topicsobject(32)>,
        <Topics: Topicsobject(33)>,
        [
            ...
        ]
    ],
    [
        <Topics: Topicsobject(34)>
    ]
]
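The sort order is correct, but I don't know why the result contains empty lists. Does anyone know how to fix this? Or does anyone know a better way to do it, so that the result is returned as a single list rather than nested lists?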
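I built a tree explicitly, as a Python dict mapping each parent id to the list of its children's ids, using .iterrows to add the nodes to the tree. Children are sorted by the given order.
Then I perform a simple depth-first search of the tree, yielding the ids of the leaves along the way.
Finally I use .loc to select the corresponding rows in the dataframe.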
import pandas as pd

# Root topics use parent_id 0 here (instead of NULL/None) so that 0 can key the tree.
topics = pd.DataFrame({
    'id': [27, 31, 34, 28, 29, 40, 32, 33, 41, 43, 44, 42, 45, 47, 48],
    'has_sub_topic': [1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0],
    'parent_id': [0, 0, 0, 27, 27, 27, 31, 31, 40, 40, 40, 41, 44, 44, 47],
    'subject_module_level_id': [25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25],
    'order': [1, 2, 3, 1, 2, 3, 1, 2, 1, 2, 3, 1, 1, 2, 1]
}).set_index('id')

# Map each parent id to the list of its children's ids.
tree = {}
for i, row in topics.iterrows():
    tree.setdefault(row['parent_id'], []).append(i)

# Sort every group of siblings by their 'order' column.
for brotherhood in tree.values():
    brotherhood.sort(key=lambda sibling: topics.at[sibling, 'order'])

# print( tree )
# {0: [27, 31, 34], 27: [28, 29, 40], 31: [32, 33], 40: [41, 43, 44], 41: [42], 44: [45, 47], 47: [48]}

def gen_leaves(tree, i=0):
    # Depth-first traversal: a node that is not a key of the tree has no children, so it is a leaf.
    if i in tree:
        for child in tree[i]:
            yield from gen_leaves(tree, child)
    else:
        yield i

# print( list(gen_leaves(tree)) )
# [28, 29, 42, 43, 45, 48, 32, 33, 34]

leaf_ids = list(gen_leaves(tree))
topics_leaves = topics.loc[leaf_ids]
print(topics_leaves)
#     has_sub_topic  parent_id  subject_module_level_id  order
# id
# 28              0         27                       25      1
# 29              0         27                       25      2
# 42              0         41                       25      1
# 43              0         40                       25      2
# 45              0         44                       25      1
# 48              0         47                       25      1
# 32              0         31                       25      1
# 33              0         31                       25      2
# 34              0          0                       25      3
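As for the `[...]` entries in the question's output: they are most likely not empty lists. Python prints `[...]` when a list contains itself. In `topic_child_order`, when a topic has children, `result` ends up bound to the return value of the last recursive call, which is `list_topics` itself, and `list_topics.append(result)` then appends the list to itself. The sketch below shows one way the recursion could be kept while returning a single flat list. The `Topic` dataclass, the `rows` tuples and the `leaves_in_order` name are hypothetical stand-ins for the Django model, not part of the original code; with the real model, the sorted children would presumably come from `Topics.objects.filter(parent=topic).order_by('order')` as in the question.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Hypothetical stand-in for the Django Topics model.
    id: int
    order: int
    children: list = field(default_factory=list)

    @property
    def has_sub_topic(self):
        return bool(self.children)


def leaves_in_order(topic, out=None):
    """Append the leaf topics below `topic` to one flat list, depth-first."""
    if out is None:
        out = []
    if topic.has_sub_topic:
        for child in sorted(topic.children, key=lambda t: t.order):
            leaves_in_order(child, out)  # the call fills `out`; its return value is ignored
    else:
        out.append(topic)  # only leaves are appended
    return out


# A list that contains itself is displayed with [...]
demo = [1]
demo.append(demo)
print(demo)  # [1, [...]]

# (id, parent_id, order) taken from the table in the question
rows = [(27, None, 1), (31, None, 2), (34, None, 3), (28, 27, 1), (29, 27, 2),
        (40, 27, 3), (32, 31, 1), (33, 31, 2), (41, 40, 1), (43, 40, 2),
        (44, 40, 3), (42, 41, 1), (45, 44, 1), (47, 44, 2), (48, 47, 1)]

nodes = {id_: Topic(id_, order) for id_, _, order in rows}
roots = []
for id_, parent_id, _ in rows:
    (roots if parent_id is None else nodes[parent_id].children).append(nodes[id_])

# Share one list across all root topics instead of collecting one list per root.
flat = []
for root in sorted(roots, key=lambda t: t.order):
    leaves_in_order(root, flat)

leaf_ids = [t.id for t in flat]
print(leaf_ids)  # [28, 29, 42, 43, 45, 48, 32, 33, 34]
```

Passing the same `flat` list to every top-level call is what avoids the outer nesting; appending each call's return value to `topics_order`, as in the question, would give one sub-list per root topic.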