线性同余生成器——如何选择种子和统计检验

Question

我需要做一个线性同余生成器来成功通过选定的统计测试。

我的问题是：如何正确选择生成器的数字和我应该选择哪些统计检验？

我考虑过：

均匀性卡方频率检验
- 每个生成方法收集 10,000 个数字
- 将[0.1) 分成10等份
Kolmogorov-Smirnov 均匀性检验
- 由于 K-S 检验对较小的数字集效果更好，您可以使用为卡方频率检验生成的 10,000 个中的前 100 个

代码示例如下：

def seedLCG(initVal):
    global rand
    rand = initVal

def lcg():
    a = 1664525
    c = 1013904223
    m = 2**32
    global rand
    rand = (a*rand + c) % m
    return rand

seedLCG(1)

for i in range(1000):
    print (lcg())

在选择种子时，我考虑的是纳秒级，但我不知道如何实现它，它是否有意义？这个想法是为了表明所选择的种子是随机选择的，而不是从上限中选择的

Answer 1

关于如何正确选择生成器的数字，在 Wiki 页面中有 Hull–Dobell 定理的描述，它告诉您如何选择 a 和 c 以获得全周期生成器。您从 Numerical Recipes 获得了数字，据我所知，您将获得完整周期 [0...2³²) 生成器。或者您可以查看 this paper 中的品质因数，有 (a,c) 对可用于任何所需的周期大小。

关于测试，看看@pjs 提供的论文。

when it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it and will it make sense at all? The idea is to show that the selected seeds were chosen randomly and not so much from the cap。我认为这不是一个好主意，因为您不能保证从 time/ceil/... 中挑选的种子不会重叠。 LCG基本上是双射[0...2³²)<->[0...2³²)映射，比较容易重叠随机数流，以便您的结果相关。

相反，我建议使用另一个不错的属性 LCG - 向前（和向后）对数跳跃。因此，为了在 N 核上进行模拟，您可以在第一个代码上选择单个种子和运行，相同的种子，但跳过 (N/2³²) 第二个核心，种子和跳过（N/2³² * 2）等等等等。

具有显式状态和跳过的 LCG 代码如下，Win10 x64，Python 3.7 Anaconda

import numpy as np

class LCG(object):

    UZERO: np.uint32 = np.uint32(0)
    UONE : np.uint32 = np.uint32(1)

    def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
        self._seed: np.uint32 = np.uint32(seed)
        self._a   : np.uint32 = np.uint32(a)
        self._c   : np.uint32 = np.uint32(c)

    def next(self) -> np.uint32:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint32:
        return self._seed

    def set_seed(self, seed: np.uint32) -> np.uint32:
        self._seed = seed

    def skip(self, ns: np.int32) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint32 = np.uint32(ns)

        a: np.uint32 = self._a
        c: np.uint32 = self._c

        a_next: np.uint32 = LCG.UONE
        c_next: np.uint32 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)

print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
rng.skip(2) # forward by 2
print(rng.next())

更新

正在生成 10k 个数字

np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)
q = [rng.next() for _ in range(0, 10000)]

线性同余生成器——如何选择种子和统计检验

Linear congruential generator - how to choose seeds and statistical tests

python

random

lcg