Recursive function to convert a flat list of dicts into a nested list of dicts of comments and replies

I'm implementing an API for threaded comments like Reddit's, but I've hit a problem I don't know how to solve. Here is the necessary context:

I have a list of dicts, where each dict holds the data for one comment pulled from the database. It looks like this:

from typing import Dict, List
from uuid import UUID

comments: List[Dict] = [
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
        'level': 0,
        'parent_id': None,
        'reference_id': UUID('e480ae80-89e2-4d38-ba92-570e863fec87')
    },
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('b76db7ac-fdef-11eb-90ee-0242ac150003'),
        'level': 1,
        'parent_id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
        'reference_id': UUID('b621b10b-47c8-4492-881d-577e5a7a98af')
    },
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('9a3cdb12-fe09-11eb-a65d-0242ac150003'),
        'level': 1,
        'parent_id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
        'reference_id': UUID('09280706-4ab6-459a-86cc-dc953479c356')
    },
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('a53c7522-fe09-11eb-a65d-0242ac150003'),
        'level': 2,
        'parent_id': UUID('9a3cdb12-fe09-11eb-a65d-0242ac150003'),
        'reference_id': UUID('124ece3a-a8ff-415b-8e98-66852d4d7e16')
    },
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('a349cf9a-fe12-11eb-8cc0-0242ac150003'),
        'level': 3,
        'parent_id': UUID('a53c7522-fe09-11eb-a65d-0242ac150003'),
        'reference_id': UUID('e6dbbf9c-ff0b-4337-a9c5-aa9eedfebc07')
    }
]

My objective is to convert it into a nested list of dicts, inserting the replies/sub-comments under a 'replies' key inside each dict. I'd like it to look like this:

comments:List[Dict] = [
    {
        'body': 'holaaaaaaaaaaaaaaaaaaa',
        'id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
        'level': 0,
        'parent_id': None,
        'reference_id': UUID('e480ae80-89e2-4d38-ba92-570e863fec87'),
        'replies': [
            {
                'body': 'holaaaaaaaaaaaaaaaaaaa',
                'id': UUID('b76db7ac-fdef-11eb-90ee-0242ac150003'),
                'level': 1,
                'parent_id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
                'reference_id': UUID('b621b10b-47c8-4492-881d-577e5a7a98af')
            },
            {
                'body': 'holaaaaaaaaaaaaaaaaaaa',
                'id': UUID('9a3cdb12-fe09-11eb-a65d-0242ac150003'),
                'level': 1,
                'parent_id': UUID('b1865484-fdef-11eb-90ee-0242ac150003'),
                'reference_id': UUID('09280706-4ab6-459a-86cc-dc953479c356'),
                'replies': [
                    {
                        'body': 'holaaaaaaaaaaaaaaaaaaa',
                        'id': UUID('a53c7522-fe09-11eb-a65d-0242ac150003'),
                        'level': 2,
                        'parent_id': UUID('9a3cdb12-fe09-11eb-a65d-0242ac150003'),
                        'reference_id': UUID('124ece3a-a8ff-415b-8e98-66852d4d7e16'),
                        'replies': [
                            {
                                'body': 'holaaaaaaaaaaaaaaaaaaa',
                                'id': UUID('a349cf9a-fe12-11eb-8cc0-0242ac150003'),
                                'level': 3,
                                'parent_id': UUID('a53c7522-fe09-11eb-a65d-0242ac150003'),
                                'reference_id': UUID('e6dbbf9c-ff0b-4337-a9c5-aa9eedfebc07')
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

Here the children are nested inside their parents. I tried a recursive function:

def parse(comments:List[Dict]) -> List[Dict]:
    output_nested = []
    while len(comments) > 0:
        comment = comments.pop(0)

        replies = [reply for reply in comments if reply['parent_id'] == comment['id']]
        replies = parse(replies)

        comment['replies'] = replies
        comments = [c for c in comments if c['parent_id'] != comment['id']]

        output_nested.append(comment)

    return output_nested

But it doesn't work as expected, and I've honestly run out of ideas on how to do this.

You can use recursion:

#an object to make your basic input compatible with the function below
class UUID:
   def __init__(self, _id):
      self.id = _id
   def __eq__(self, uuid):
      return self.id == getattr(uuid, 'id', None)
   def __repr__(self):
      return f'{self.__class__.__name__}("{self.id}")'
   
comments = [{'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'level': 0, 'parent_id': None, 'reference_id': UUID("e480ae80-89e2-4d38-ba92-570e863fec87")}, {'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("b76db7ac-fdef-11eb-90ee-0242ac150003"), 'level': 1, 'parent_id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'reference_id': UUID("b621b10b-47c8-4492-881d-577e5a7a98af")}, {'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("9a3cdb12-fe09-11eb-a65d-0242ac150003"), 'level': 1, 'parent_id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'reference_id': UUID("09280706-4ab6-459a-86cc-dc953479c356")}, {'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("a53c7522-fe09-11eb-a65d-0242ac150003"), 'level': 2, 'parent_id': UUID("9a3cdb12-fe09-11eb-a65d-0242ac150003"), 'reference_id': UUID("124ece3a-a8ff-415b-8e98-66852d4d7e16")}, {'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("a349cf9a-fe12-11eb-8cc0-0242ac150003"), 'level': 3, 'parent_id': UUID("a53c7522-fe09-11eb-a65d-0242ac150003"), 'reference_id': UUID("e6dbbf9c-ff0b-4337-a9c5-aa9eedfebc07")}]
def to_tree(p = None):
   return [{**i, **({'replies':k} if (k:=to_tree(i['id'])) else {})} 
            for i in comments if i['parent_id'] == p]

full_comments = to_tree()

Output:

[{'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'level': 0, 'parent_id': None, 'reference_id': UUID("e480ae80-89e2-4d38-ba92-570e863fec87"), 'replies': [{'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("b76db7ac-fdef-11eb-90ee-0242ac150003"), 'level': 1, 'parent_id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'reference_id': UUID("b621b10b-47c8-4492-881d-577e5a7a98af")}, {'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("9a3cdb12-fe09-11eb-a65d-0242ac150003"), 'level': 1, 'parent_id': UUID("b1865484-fdef-11eb-90ee-0242ac150003"), 'reference_id': UUID("09280706-4ab6-459a-86cc-dc953479c356"), 'replies': [{'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("a53c7522-fe09-11eb-a65d-0242ac150003"), 'level': 2, 'parent_id': UUID("9a3cdb12-fe09-11eb-a65d-0242ac150003"), 'reference_id': UUID("124ece3a-a8ff-415b-8e98-66852d4d7e16"), 'replies': [{'body': 'holaaaaaaaaaaaaaaaaaaa', 'id': UUID("a349cf9a-fe12-11eb-8cc0-0242ac150003"), 'level': 3, 'parent_id': UUID("a53c7522-fe09-11eb-a65d-0242ac150003"), 'reference_id': UUID("e6dbbf9c-ff0b-4337-a9c5-aa9eedfebc07")}]}]}]}]

Don't mistake unnecessarily complex, unreadable one-liners for good code. It should really look like this:

def to_tree(comments, parent=None):
    tree = []
    for comment in comments:
        if comment['parent_id'] != parent:
            continue
        subdict = comment.copy()
        child_tree = to_tree(comments, comment['id'])
        if child_tree:
            subdict['replies'] = child_tree
        tree.append(subdict)
    return tree

full_comments = to_tree(comments)

It still lacks a good docstring explaining what it does, but spelling out the logic makes the code maintainable and debuggable.
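For what it's worth, here is a rough sketch of what the same function might look like with the docstring and type hints filled in. The logic is unchanged from the version above; the `Comment` alias, the variable names, and the three-comment sample thread (trimmed to the keys the function reads) are my own, added only for illustration.

```python
from typing import Any, Dict, List, Optional
from uuid import UUID

Comment = Dict[str, Any]  # illustrative alias, not part of the original code


def to_tree(comments: List[Comment], parent: Optional[UUID] = None) -> List[Comment]:
    """Nest a flat list of comments under their parents.

    Returns copies of the comments whose 'parent_id' equals `parent`
    (None selects the top-level comments). A comment that has replies
    gets a 'replies' key holding its children, nested the same way;
    a comment without replies gets no 'replies' key. The input dicts
    are not modified.
    """
    tree = []
    for comment in comments:
        if comment['parent_id'] != parent:
            continue
        node = comment.copy()  # shallow copy, so the flat input stays untouched
        children = to_tree(comments, comment['id'])
        if children:
            node['replies'] = children
        tree.append(node)
    return tree


# Small made-up thread: one root, one reply, one reply to that reply.
root_id = UUID('b1865484-fdef-11eb-90ee-0242ac150003')
reply_id = UUID('9a3cdb12-fe09-11eb-a65d-0242ac150003')
leaf_id = UUID('a53c7522-fe09-11eb-a65d-0242ac150003')
flat = [
    {'id': root_id, 'parent_id': None, 'level': 0},
    {'id': reply_id, 'parent_id': root_id, 'level': 1},
    {'id': leaf_id, 'parent_id': reply_id, 'level': 2},
]

nested = to_tree(flat)
```

Note that each call rescans the whole list, so the work grows roughly quadratically with the number of comments. That is probably fine for a single thread; for very large lists, grouping the comments by 'parent_id' once up front would likely be the first thing to try.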