Shuffling part of a list in-place

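I have a list and I want to shuffle a portion of it in-place.

I know about random.shuffle() and that it works in-place, but if I slice the list, it shuffles a sliced copy of the original input and leaves the original list untouched: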

import random

l = list(range(20))
print(l)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

random.shuffle(l[:10])  # I wish it was shuffling the first half
print(l)  # but does nothing to `l`
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
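
The behaviour above follows from how list slicing works: a slice of a list is a new list object, not a view into the original. A minimal sketch to illustrate this (the name `part` is only for illustration):

```python
l = list(range(20))

part = l[:10]          # slicing builds a new list object
print(part is l)       # False: `part` is a separate list, not a view into `l`

part[0] = 99           # mutating the copy...
print(l[0])            # 0: ...leaves the original untouched
```

So random.shuffle(l[:10]) shuffles a temporary list that is discarded right after the call.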

What are my options?


EDIT: This is not a true duplicate of this question, since there was no requirement for it to be in-place. Eventually, it seems that it would be possible to shuffle a portion of a list in-place only manually (which is exactly what I was trying to avoid), as suggested by one of the answers posted there.

Not quite in-place, but it gets the desired result:

import random

l = list(range(20))

lpart = l[:10]
random.shuffle(lpart)

l[:10] = lpart

print(l)
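
The same copy, shuffle, and assign-back steps can be wrapped in a small helper. This is only a sketch: the name `shuffle_slice` and its `start`/`stop` parameters are hypothetical, and it still creates a temporary copy of the sub-range. The slice assignment does mutate the original list object, though, so other references to it see the change:

```python
import random


def shuffle_slice(lst, start, stop):
    """Shuffle lst[start:stop] and write it back into the same list object."""
    part = lst[start:stop]      # temporary copy of the sub-range
    random.shuffle(part)        # shuffle the copy in place
    lst[start:stop] = part      # slice assignment mutates `lst` itself


l = list(range(20))
alias = l                       # second reference to the same list
shuffle_slice(l, 0, 10)

print(alias is l)               # True: no new list was bound to `l`
print(l[10:])                   # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```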

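Modifying the list in-place won't work on only a part of the list. You can use random.sample instead, which samples without replacement, and then slice-assign the result back: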

import random

l = list(range(20))
k = 10
l[:k] = random.sample(l[:k], k=k)

print(l)
# [1, 7, 6, 0, 2, 3, 4, 9, 8, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
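
A short sketch of why random.sample() fits slice assignment while random.shuffle() does not: shuffle() returns None, whereas sample() returns a new list and leaves its input alone. The names `result`, `part` and `sampled` are only for illustration:

```python
import random

l = list(range(20))
k = 10

# random.shuffle() works in place and returns None,
# so its return value cannot be assigned back to the slice.
result = random.shuffle(l[:k])
print(result)                      # None

# random.sample() leaves its input alone and returns a new list.
part = l[:k]
sampled = random.sample(part, k=k)
print(part == list(range(k)))      # True: the input is unchanged
print(sorted(sampled) == part)     # True: same items, possibly in a new order

l[:k] = sampled                    # write the permutation back into `l`
```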

One can use the Fisher-Yates shuffle to fundamentally re-implement random.shuffle() so that it accepts the first and last indices as parameters, e.g.:

import random


def valid_index(i, n):
    # normalize a possibly negative index into the range [0, n)
    assert -n <= i < n
    return i % n


def shuffle(seq, first=0, last=-1, rand_int_gen=None):
    # shuffle seq[first..last] (both ends inclusive) in place
    n = len(seq)
    first = valid_index(first, n)
    last = valid_index(last, n)
    # use Fisher-Yates shuffle (Durstenfeld method)
    if callable(rand_int_gen):
        # rand_int_gen(a, b) must return an integer in [a, b], like random.randint()
        for i in range(first, last):
            j = rand_int_gen(i, last)
            seq[i], seq[j] = seq[j], seq[i]
    else:
        getrandbits = random.getrandbits
        for i in range(first, last + 1):
            size = last - i + 1  # number of candidate positions i..last
            # pick j in [i, last] directly from raw random bits
            j = getrandbits(size.bit_length()) % size + i
            seq[i], seq[j] = seq[j], seq[i]
    return seq

Use it like this:

l = list(range(20))
print(l)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

random.seed(0)  # just to show reproducible results
shuffle(l, 0, 9)
print(l)
# [6, 7, 2, 5, 8, 4, 9, 3, 0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

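Time-wise, this is actually faster than random.shuffle() for shuffling the whole sequence: in the timings below it takes roughly 30 to 50% less time.

It is faster essentially because it gets its random values directly from random.getrandbits(), which is the closest thing to a raw random integer generator that random exposes; the other methods, e.g. randint() and randrange(), eventually reduce to it. Those two end up using _randbelow() internally, which may call getrandbits() more than once per value because it rejects out-of-range draws.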

# run in IPython / Jupyter (%timeit is an IPython magic)
for k in range(1, 7):
    n = 10 ** k
    print(n)
    %timeit l = list(range(n)); random.shuffle(l)
    %timeit l = list(range(n)); shuffle(l)
    print()

10
100000 loops, best of 3: 6.16 µs per loop
100000 loops, best of 3: 3.85 µs per loop

100
10000 loops, best of 3: 54.3 µs per loop
10000 loops, best of 3: 28 µs per loop

1000
1000 loops, best of 3: 585 µs per loop
1000 loops, best of 3: 341 µs per loop

10000
100 loops, best of 3: 6.01 ms per loop
100 loops, best of 3: 3.56 ms per loop

100000
10 loops, best of 3: 71.7 ms per loop
10 loops, best of 3: 44.1 ms per loop

1000000
1 loop, best of 3: 815 ms per loop
1 loop, best of 3: 582 ms per loop
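
One caveat about these timings: the `getrandbits(size.bit_length()) % size` reduction used in shuffle() above is slightly biased whenever `size` is not a power of two, because the number of possible bit patterns is not a multiple of `size`. The sketch below first tallies the outcomes exhaustively for `size = 3`, then shows a variant that draws indices with random.randint() instead. The name `shuffle_range` is hypothetical, it assumes non-negative indices with `first <= last < len(seq)`, and it was not part of the timings above, so it likely gives up some of the speed advantage:

```python
import random
from collections import Counter

# Exhaustive tally for size = 3: getrandbits(2) yields 0..3 with equal
# probability, so reducing modulo 3 maps two of the four values to 0.
tally = Counter(r % 3 for r in range(4))
print(tally)                           # Counter({0: 2, 1: 1, 2: 1})


def shuffle_range(seq, first, last):
    """Fisher-Yates shuffle of seq[first..last] (both ends inclusive), in place."""
    for i in range(last, first, -1):
        j = random.randint(first, i)   # uniform over first..i, inclusive
        seq[i], seq[j] = seq[j], seq[i]
    return seq


l = list(range(20))
shuffle_range(l, 0, 9)
print(l[10:])                          # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```

Whether the small bias matters depends on the application; for anything statistical, the randint() based variant is probably the safer choice.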

As @usr2564301 pointed out, this approach has also been suggested here. Unfortunately, I don't think there is a better way to do this in-place.