Replace identical elements in a list without a loop

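I'm trying to replace every occurrence of a particular element in a list with a new string, and I'm also trying to get away from using loops for everything.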

# My aim is to turn:
list = ["A", "", "", "D"]
# into:
list = ["A", "???", "???", "D"]
# but without using a for-loop

I started with variations on list comprehensions:

# e.g. 1
['' = "???"(i) for i in list]
# e.g. 2
list = [list[i] .replace '???' if ''(i) for i in range(len(lst))]

Then I tried Python's map function:

list[:] = map(lambda i: "???", list)
# I couldn't work out where to add the '""' to be replaced.

Finally, I hashed out a third attempt:

list[:] = ["???" if ''(i) else i for i in list]

I feel like I'm drifting further from a sensible line of attack, and I just want a tidy way to accomplish a simple task.

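You can use a list comprehension, but what you need to do is compare each element: if it matches, replace it with the new string, otherwise keep the original element.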

>>> data = ["A", "", "", "D"]
>>> ['???' if i == '' else i for i in data]
['A', '???', '???', 'D']
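If the result has to end up in the same list object (as the slice assignment in the question suggests), the comprehension can be wrapped in a small helper. This is only a sketch: the name `replace_in_place` and its parameters are made up for illustration.

```python
def replace_in_place(items, old, new):
    """Replace every element equal to `old` with `new`, mutating `items`."""
    # Slice assignment keeps the same list object, so other references see the change.
    items[:] = [new if item == old else item for item in items]


data = ["A", "", "", "D"]
alias = data  # second reference to the same list
replace_in_place(data, "", "???")
print(alias)  # ['A', '???', '???', 'D']
```

Plain `data = [...]` would instead rebind the name to a new list and leave `alias` pointing at the old one.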

You can try this:

list1 = ["A", "", "", "D"]

list2 = list(map(lambda x: "???" if not x else x, list1))

print(list2)

Here is a longer version of the one above:

list1 = ["A", "", "", "D"]
def check_string(string):
    if not string:
        return "???"
    return string

list2 = list(map(check_string, list1))
print(list2)

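This relies on the fact that the empty string "" is falsy, so you can test its implicit boolean value and return the replacement or the original accordingly. Output: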

['A', '???', '???', 'D']
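One thing to keep in mind with the truthiness test: it treats every falsy value as a match, not just the empty string. The short comparison below (the `mixed` list is an invented example) shows where `not x` and `x == ""` diverge.

```python
mixed = ["A", "", 0, None, "D"]  # invented sample with several falsy values

# Truthiness test: "", 0 and None are all falsy, so all three get replaced.
by_truthiness = ["???" if not x else x for x in mixed]

# Explicit comparison: only the empty string is replaced.
by_equality = ["???" if x == "" else x for x in mixed]

print(by_truthiness)  # ['A', '???', '???', '???', 'D']
print(by_equality)    # ['A', '???', 0, None, 'D']
```

For a list that only ever holds strings the two forms behave the same.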

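For brevity (if we allow a list comprehension, which is itself a form of loop). Also, as @ComteHerappait correctly pointed out, this replaces empty strings with '???', which matches the example in the question.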

>>> l = ["A", "", "", "D"]
>>> [e or '???' for e in l]
['A', '???', '???', 'D']

If instead the goal is to replace repeated elements (every occurrence after the first), then:

>>> seen = set()
>>> newl = ['???' if e in seen or seen.add(e) else e for e in l]
>>> newl
['A', '', '???', 'D']

Finally, the following replaces every element that occurs more than once in the list:

>>> from collections import Counter
>>> c = Counter(l)
>>> newl = [e if c[e] < 2 else '???' for e in l]
>>> newl
['A', '???', '???', 'D']
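On the question's four-element list the two duplicate-handling approaches look similar, so here is a sketch on a list with more repetition to make the difference visible. The function names and the `sample` list are invented for this illustration.

```python
from collections import Counter


def mask_later_occurrences(items, placeholder="???"):
    """Keep the first occurrence of each value, replace later repeats."""
    seen = set()
    # seen.add() returns None (falsy), so the `or` only records the item
    # and the expression still falls through to the `else` branch.
    return [placeholder if item in seen or seen.add(item) else item
            for item in items]


def mask_all_duplicates(items, placeholder="???"):
    """Replace every value that appears more than once, including the first."""
    counts = Counter(items)
    return [item if counts[item] < 2 else placeholder for item in items]


sample = ["A", "B", "A", "C", "B"]
print(mask_later_occurrences(sample))  # ['A', 'B', '???', 'C', '???']
print(mask_all_duplicates(sample))     # ['???', '???', '???', 'C', '???']
```

Both require hashable elements, since one uses a set and the other a Counter.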

How about this:

myList = ['A', '', '', 'D']
myMap = map(lambda i: '???' if i == '' else i, myList)
print(list(myMap))

...which will result in:

['A', '???', '???', 'D']
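A lambda-free variation on the same map idea, offered as a sketch rather than a recommendation: put the replacements in a dict and let `dict.get` supply the fallback. The `replacements` table is a hypothetical name, and this only works when the list elements are hashable.

```python
data = ["A", "", "", "D"]
replacements = {"": "???"}  # hypothetical lookup table: old value -> new value

# map() with two iterables calls replacements.get(x, x) for each x,
# so values missing from the table fall through unchanged.
result = list(map(replacements.get, data, data))
print(result)  # ['A', '???', '???', 'D']
```

Adding more entries to the table replaces several different values in one pass.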

If you want to avoid loops as the title suggests, you can use np.where instead of a list comprehension, and it is also faster for large arrays:

import numpy as np

data = np.array(["A", "", "", "D"], dtype='object')
index = np.where(data == '')[0]
data[index] = "???"
data.tolist()

Result:

['A', '???', '???', 'D']
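As a possible alternative to the index-then-assign steps above, `np.where` also has a three-argument form that picks between two values element by element. The sketch below assumes NumPy is installed and leaves the original array untouched.

```python
import numpy as np

data = np.array(["A", "", "", "D"], dtype="object")

# np.where(condition, x, y): take x where the condition holds, y elsewhere.
replaced = np.where(data == "", "???", data)

print(replaced.tolist())  # ['A', '???', '???', 'D']
print(data.tolist())      # ['A', '', '', 'D']  (original is not modified)
```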

Speed test

# Run in IPython/Jupyter: %timeit is an IPython magic, not plain Python.
for rep in [1, 10, 100, 1000, 10000]:
    data = ["A", "", "", "D"] * rep
    print(f'array of length {4 * rep}')
    print('np.where:')
    %timeit data2 = np.array(data, dtype='object'); index = np.where(data2 == '')[0]; data2[index] = "???"; data2.tolist()
    print('list-comprehension:')
    %timeit ['???' if i == '' else i for i in data]

Results:

array of length 4
np.where:
The slowest run took 11.79 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 10.7 µs per loop
list-comprehension:
The slowest run took 5.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 487 ns per loop
array of length 40
np.where:
The slowest run took 7.08 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 13 µs per loop
list-comprehension:
100000 loops, best of 5: 2.99 µs per loop
array of length 400
np.where:
The slowest run took 4.83 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 31 µs per loop
list-comprehension:
10000 loops, best of 5: 26 µs per loop
array of length 4000
np.where:
1000 loops, best of 5: 225 µs per loop
list-comprehension:
1000 loops, best of 5: 244 µs per loop
array of length 40000
np.where:
100 loops, best of 5: 2.27 ms per loop
list-comprehension:
100 loops, best of 5: 2.63 ms per loop

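For arrays of roughly 4000 elements or more, np.where is faster in this test (225 µs vs 244 µs at length 4000, 2.27 ms vs 2.63 ms at length 40000), while the list comprehension is faster on shorter lists.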
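The `%timeit` magic used in the speed test only exists in IPython/Jupyter. For anyone who wants to repeat a comparison in a plain Python script, here is a minimal sketch using the standard library `timeit` module. The function names are invented, it compares the comprehension against the map/lambda version instead of NumPy, and the numbers will vary by machine.

```python
import timeit


def with_comprehension(data):
    return ["???" if item == "" else item for item in data]


def with_map(data):
    return list(map(lambda item: "???" if item == "" else item, data))


data = ["A", "", "", "D"] * 1000
calls = 100  # calls per timing run

for func in (with_comprehension, with_map):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(lambda: func(data), number=calls, repeat=5))
    print(f"{func.__name__}: {best / calls * 1e6:.1f} µs per call")
```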