How to map a recursive structure?

I'm trying to figure out how to map a recursive structure containing dictionaries and lists. So far I've got this:

import collections.abc


def rec_walk(l):
    for v in l:
        if isinstance(v, list):
            yield from rec_walk(v)
        else:
            yield v


def rec_map(l, f):
    for v in l:
        if isinstance(v, collections.abc.Iterable):
            if isinstance(v, list):
                yield list(rec_map(v, f))
            elif isinstance(v, dict):
                yield dict(rec_map(v, f))
        else:
            yield f(v)


a = ["0", ["1", "2", ["3", "4"]], [[[[["5"]]]]]]
print(list(rec_map(a, lambda x: x + "_tweaked")))
b = {
    'a': ["0", "1"],
    'b': [[[[[["2"]]]]]],
    'c': {
        'd': [{
            'e': [[[[[[["3"]]]]]]]
        }]
    }
}
print(dict(rec_map(b, lambda x: x + "_tweaked")))

Output:

[[[]], [[[[[]]]]]]
{}

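As you can see, the problem with the example above is that rec_map doesn't return a correctly mapped structure. What I want is either the same structure mapped in place or a new, cloned structure with the mapping applied. For example, this: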

a = ["0", ["1", "2", ["3", "4"]], [[[[["5"]]]]]]
rec_map(a, lambda x: x + "_tweaked")

should transform a into:

["0_tweaked", ["1_tweaked", "2_tweaked", ["3_tweaked", "4_tweaked"]], [[[[["5_tweaked"]]]]]]

and this:

b = {
    'a': ["0", "1"],
    'b': [[[[[["2"]]]]]],
    'c': {
        'd': [{
            'e': [[[[[[["3"]]]]]]]
        }]
    }
}
print(dict(rec_map(b, lambda x: x + "_tweaked")))

should transform b into:

b = {
    'a': ["0_tweaked", "1_tweaked"],
    'b': [[[[[["2_tweaked"]]]]]],
    'c': {
        'd': [{
            'e': [[[[[[["3_tweaked"]]]]]]]
        }]
    }
}

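This is due to yield from. You should use yield list() instead.

yield from yields the elements of the nested generator one at a time, but what you want here is to yield the whole list as a single element rather than each of its items.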

This question explains the difference.
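To make the difference concrete, here is a minimal sketch. The function names flatten and keep_shape are made up for this illustration and are not part of the original code:

```python
def flatten(items):
    # `yield from` re-yields every element of the nested generator,
    # so the nesting disappears.
    for v in items:
        if isinstance(v, list):
            yield from flatten(v)
        else:
            yield v


def keep_shape(items):
    # `yield list(...)` yields the nested result as one element,
    # so the nesting is preserved.
    for v in items:
        if isinstance(v, list):
            yield list(keep_shape(v))
        else:
            yield v


data = ["0", ["1", ["2"]]]
print(list(flatten(data)))     # ['0', '1', '2']
print(list(keep_shape(data)))  # ['0', ['1', ['2']]]
```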

The following modified version of your code produces the behavior you want:

def rec_walk(l):
    for v in l:
        if isinstance(v, list):
            yield list(rec_walk(v))
        else:
            yield v


def rec_map(l, f):
    for v in l:
        if isinstance(v, list):
            yield list(rec_map(v, f))
        else:
            yield f(v)


a = ["0", ["1", "2", ["3", "4"]], [[[[["5"]]]]]]
print('-' * 80)
print(list(rec_walk(a)))
print('-' * 80)
print(list(rec_map(a, lambda x: x + "_tweaked")))

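You are creating a generator and then using yield from, which effectively flattens it. Instead, you need to materialize the nested generator rather than yield from it: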

In [1]: def rec_map(l, f):
   ...:     for v in l:
   ...:         if isinstance(v, list):
   ...:             yield list(rec_map(v, f))
   ...:         else:
   ...:             yield f(v)
   ...:

In [2]: a = ["0", ["1", "2", ["3", "4"]], [[[[["5"]]]]]]
   ...:

In [3]: list(rec_map(a, lambda x: x + "_tweaked"))
Out[3]:
['0_tweaked',
 ['1_tweaked', '2_tweaked', ['3_tweaked', '4_tweaked']],
 [[[[['5_tweaked']]]]]]
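The generator above only handles lists. One possible way to stretch it to dicts, as in the question's second example, is sketched below. The names rec_map_gen and _map_value are hypothetical, and the sketch assumes that dict keys are left untouched and only values are mapped:

```python
def _map_value(v, f):
    # Materialize nested containers; apply f to everything else.
    if isinstance(v, dict):
        return dict(rec_map_gen(v, f))
    if isinstance(v, list):
        return list(rec_map_gen(v, f))
    return f(v)


def rec_map_gen(obj, f):
    if isinstance(obj, dict):
        # Yield (key, value) pairs so the caller can rebuild with dict().
        for k, v in obj.items():
            yield k, _map_value(v, f)
    else:
        for v in obj:
            yield _map_value(v, f)


b = {'a': ["0", "1"], 'c': {'d': [{'e': [["3"]]}]}}
print(dict(rec_map_gen(b, lambda x: x + "_tweaked")))
# {'a': ['0_tweaked', '1_tweaked'], 'c': {'d': [{'e': [['3_tweaked']]}]}}
```

Note that the caller still has to pick list() or dict() to wrap the top-level call, which is part of what makes the generator approach awkward.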

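The problem you're running into is that doing this with generators is much harder, because you have to manage carefully what gets yielded. Honestly, it doesn't look like you even need a generator. Just use: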

In [16]: def rec_map(l, f):
    ...:     if isinstance(l, list):
    ...:         return [rec_map(v, f) for v in l]
    ...:     elif isinstance(l, dict):
    ...:         return {k:rec_map(v, f) for k,v in l.items()}
    ...:     else:
    ...:         return f(l)
    ...:

In [17]: rec_map(b, lambda x: x + '_tweaked')
Out[17]:
{'a': ['0_tweaked', '1_tweaked'],
 'b': [[[[[['2_tweaked']]]]]],
 'c': {'d': [{'e': [[[[[[['3_tweaked']]]]]]]}]}}
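The question asked for either the same structure mapped in place or a new cloned one. As far as I can tell, the plain recursive version builds a new structure and leaves the input alone. A small self-contained check, using a shortened sample dict (original and mapped are names chosen for this sketch):

```python
def rec_map(obj, f):
    # Rebuild lists and dicts recursively; apply f to every leaf value.
    if isinstance(obj, list):
        return [rec_map(v, f) for v in obj]
    elif isinstance(obj, dict):
        return {k: rec_map(v, f) for k, v in obj.items()}
    else:
        return f(obj)


original = {'a': ["0", "1"], 'c': {'d': [{'e': [["3"]]}]}}
mapped = rec_map(original, lambda x: x + "_tweaked")

print(mapped)
print(original)                      # the input is unchanged
print(mapped['a'] is original['a'])  # False: the lists are new objects
```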

Also, don't use collections.Iterable (collections.abc.Iterable in current Python); check explicitly for the types you are handling. Note:

In [18]: isinstance('I am a string but I am iterable!', collections.abc.Iterable)
Out[18]: True
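This is likely why the original rec_map printed empty lists: every string leaf passed the Iterable check, matched neither the list nor the dict branch, and so nothing was yielded for it. A minimal sketch that mirrors that branching (leaves_seen is a hypothetical helper written for this illustration):

```python
from collections.abc import Iterable


def leaves_seen(items):
    # Mirrors the branching in the question's rec_map:
    # a leaf is only reported when it is NOT iterable.
    for v in items:
        if isinstance(v, Iterable):
            if isinstance(v, list):
                yield from leaves_seen(v)
            # A str is Iterable but not a list, so it is silently dropped.
        else:
            yield v


print(isinstance("0", Iterable))           # True
print(list(leaves_seen(["0", ["1", 2]])))  # [2]
```

Checking isinstance(v, list) and isinstance(v, dict) directly avoids this, because strings then fall through to the leaf branch.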