Similar random number generation in Python and C++ but getting different output

Question

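I have two functions, one in C++ and one in Python, that determine how many times an event with a certain probability will occur over a number of rolls.

Python version: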

import random

def get_loot(rolls):
    drops = 0

    for i in range(rolls):
        # getting a random float with 2 decimal places
        roll = random.randint(0, 10000) / 100
        if roll < 0.04:
            drops += 1

    return drops

for i in range(0, 10):
    print(get_loot(1000000))

Python output:

C++ version:

#include <cstdlib>   // rand, srand
#include <ctime>     // time
#include <iostream>

using namespace std;

int get_drops(int rolls){
    int drops = 0;
    for(int i = 0; i < rolls; i++){
        // getting a random float with 2 decimal places
        float roll = (rand() % 10000)/100.0f;
        if (roll < 0.04){
            drops++;
        }
    }
    return drops;
}

int main()
{
    srand(time(NULL));
    for (int i = 0; i <= 10; i++){
        cout << get_drops(1000000) << "\n";
    }
}

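C++ output:

The code looks identical (at least to me). Both functions simulate an event with a probability of 0.04 occurring over 1,000,000 rolls. However, the output of the Python version is roughly 30% lower than that of the C++ version. How do the two versions differ, and why do they produce different output?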

Answer 1

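In C++, rand() "Returns a pseudo-random integral number in the range between 0 and RAND_MAX."

RAND_MAX is "library-dependent, but is guaranteed to be at least 32767 on any standard library implementation."

Let's take RAND_MAX to be 32,767.

The random number generation is skewed when computing a value in [0, 32767] % 10000.

Each of the values 0-2,767 is produced by 4 different inputs to (% 10000):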

Value	Calculation	Result
1	1 % 10000	1
10001	10001 % 10000	1
20001	20001 % 10000	1
30001	30001 % 10000	1

whereas each of the values 2,768-9,999 is produced by only 3 different inputs to (% 10000):

Value	Calculation	Result
2768	2768 % 10000	2768
12768	12768 % 10000	2768
22768	22768 % 10000	2768

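This makes each of the values 0-2,767 about 33% more likely to occur than each of the values 2,768-9,999 (4 chances versus 3), assuming rand() actually produces a uniform distribution between 0 and RAND_MAX.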
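
The counts above can be checked by brute force. The following is a minimal Python sketch, assuming RAND_MAX is 32,767 and that every raw value is equally likely; the names RAND_MAX and MODULUS are chosen here for illustration only.

```python
from collections import Counter

RAND_MAX = 32767   # assumed: the minimum value the standard guarantees
MODULUS = 10000

# Count how many raw values in [0, RAND_MAX] map to each residue.
hits = Counter(r % MODULUS for r in range(RAND_MAX + 1))

low = {hits[v] for v in range(0, 2768)}          # residues 0..2767
high = {hits[v] for v in range(2768, MODULUS)}   # residues 2768..9999

print(low, high)  # {4} {3}
```

On platforms where RAND_MAX is much larger, the same kind of imbalance still exists but is far smaller.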

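On the other hand,

Python's randint produces a uniform distribution between the start and end values, since randint is an "Alias for randrange(a, b+1)"

and randrange (in Python 3.2 and newer) produces equally distributed values: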

Changed in version 3.2: randrange() is more sophisticated about producing equally distributed values. Formerly it used a style like int(random()*n) which could produce slightly uneven distributions.
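
One common way to get equally distributed values out of a raw generator is rejection sampling: discard the raw values that fall in the incomplete last block, then take the remainder. The sketch below illustrates that idea only; it is not claimed to be how CPython implements randrange, and raw_rand and unbiased_below are hypothetical names, with raw_rand standing in for a C-style rand() whose RAND_MAX is assumed to be 32,767.

```python
import random

RAND_MAX = 32767  # assumed range of the raw generator


def raw_rand():
    """Stand-in for C's rand(): a uniform integer in [0, RAND_MAX]."""
    return random.randint(0, RAND_MAX)


def unbiased_below(n):
    """Return a uniform integer in [0, n), for 1 <= n <= RAND_MAX + 1."""
    # Largest multiple of n that fits in the raw range (30000 for n = 10000).
    limit = (RAND_MAX + 1) - (RAND_MAX + 1) % n
    while True:
        r = raw_rand()
        if r < limit:      # accept only values from complete blocks of n
            return r % n
```

With n = 10000, raw values 30,000-32,767 are thrown away, so every result in 0-9,999 is reachable from exactly 3 raw values.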

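There are several ways to generate random numbers in C++. The closest thing to Python is probably the Mersenne Twister engine (the same one Python uses, albeit with some differences).

Using uniform_int_distribution with mt19937: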

#include <iostream>
#include <random>
#include <chrono>


int get_drops(int rolls) {
    std::mt19937 e{
            static_cast<unsigned int> (
                    std::chrono::steady_clock::now().time_since_epoch().count()
            )
    };
    std::uniform_int_distribution<int> d{0, 9999};
    int drops = 0;
    for (int i = 0; i < rolls; i++) {
        float roll = d(e) / 100.0f;
        if (roll < 0.04) {
            drops++;
        }
    }
    return drops;
}

int main() {
    for (int i = 0; i <= 10; i++) {
        std::cout << get_drops(1000000) << "\n";
    }
}

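It is worth noting that the underlying implementations of the two engines, as well as their seeding and distributions, differ slightly; even so, this will be much closer to Python.

Alternatively, as suggested, scale rand() up and divide by RAND_MAX: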

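#include <cstdlib>  // rand, RAND_MAX

int get_drops(int rolls) {
    int drops = 0;
    for (int i = 0; i < rolls; i++) {
        // Widen to long long first: 10000 * rand() can overflow int when RAND_MAX is large.
        float roll = (10000LL * rand() / RAND_MAX) / 100.0f;
        if (roll < 0.04) {
            drops++;
        }
    }
    return drops;
}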

This is also closer to the Python output (again, with some differences in how the underlying implementations generate the random numbers).
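
To see how much the scaling approach helps, the raw values that count as a drop can be enumerated directly. This is a rough Python sketch, assuming RAND_MAX is 32,767, a perfectly uniform rand(), and that only the integer results 0 to 3 satisfy roll < 0.04 (as in the Python version); it ignores any float rounding at the 0.04 boundary in the C++ code.

```python
RAND_MAX = 32767  # assumed; real implementations may use a larger value
N = RAND_MAX + 1  # number of equally likely raw values

# Raw values whose integer result (before dividing by 100) is 0..3.
modulo_hits = sum(1 for r in range(N) if r % 10000 < 4)
scaled_hits = sum(1 for r in range(N) if 10000 * r // RAND_MAX < 4)

print(modulo_hits, scaled_hits)  # 16 14
print(modulo_hits / N)           # drop probability with rand() % 10000
print(scaled_hits / N)           # drop probability with scaling
print(4 / 10001)                 # Python's randint(0, 10000): 4 of 10001 outcomes
```

Under these assumptions the modulo version has 16 qualifying raw values and the scaled version has 14, so scaling moves the drop rate toward the Python figure without matching it exactly.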

Answer 2

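The two languages use different pseudo-random generators. If you want identical behavior in both, you may want to generate your own pseudo-random values deterministically.

Here is how that would look in Python: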

SEED = 101
TOP = 9999  # must match the constant used in the C++ version
class my_random(object):
    def __init__(self, a=SEED):
        self.seed(a)
    def seed(self, a=SEED):
        """Seeds a deterministic value that should behave the same irrespective of the coding language"""
        self.seedval = a
    def random(self):
        """generates and returns the random number based on the seed"""
        self.seedval = (self.seedval * SEED) % TOP
        return self.seedval

instance = my_random(SEED)
read_seed = instance.seedval
read_random = instance.random()

In C++, however, it would become:

const int SEED = 101;
const int TOP = 9999;

class myRandom {
    int seedval;
public:
    myRandom(int a = SEED) : seedval(a) {}
    // Advances the generator and returns the next value.
    int random() {
        seedval = (seedval * SEED) % TOP;
        return seedval;
    }
    // Returns the current state.
    int seed() const {
        return seedval;
    }
};

int main() {
    myRandom instance(SEED);
    int readSeed = instance.seed();
    int readRandom = instance.random();
}
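
A quick way to confirm that two ports of such a generator agree is to print the first few values in each language and compare them. The sketch below does this on the Python side, assuming both versions use SEED = 101 and TOP = 9999 (the values in the C++ listing); next_value is a hypothetical helper name. It also measures how many steps pass before the starting state returns.

```python
SEED = 101
TOP = 9999  # assumed to match the C++ constant


def next_value(state):
    """One step of the multiplicative generator from this answer."""
    return (state * SEED) % TOP


# First few outputs when starting from SEED.
state = SEED
first = []
for _ in range(5):
    state = next_value(state)
    first.append(state)
print(first)  # [202, 404, 808, 1616, 3232]

# Number of steps until the starting state comes back.
state, period = next_value(SEED), 1
while state != SEED:
    state = next_value(state)
    period += 1
print(period)  # 30
```

With these particular constants the sequence repeats after 30 values, so it is only suitable as a demonstration of cross-language determinism, not as a source of randomness for a simulation.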

Answer 3

The results are biased because rand() % 10000 is not the correct way to achieve a uniform distribution. (See also rand() Considered Harmful by Stephan T. Lavavej.) In modern C++, prefer the pseudo-random number generation library provided in the header <random>. For example:

#include <iostream>
#include <random>

int get_drops(int rolls)
{
    std::random_device rd;
    std::mt19937 gen{ rd() };
    std::uniform_real_distribution<> dis{ 0.0, 100.0 };
    int drops{ 0 };
    for(int roll{ 0 }; roll < rolls; ++roll)
    {
        if (dis(gen) < 0.04)
        {
            ++drops;
        }
    }

    return drops;
}

int main()
{
    for (int i{ 0 }; i <= 10; ++i)
    {
        std::cout << get_drops(1000000) << '\n';
    }
}
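
For comparison, a Python counterpart that also draws a real number between 0 and 100 could look like the sketch below. It is an illustration rather than part of the original answer: the fixed seed 12345 and the reduced roll count are arbitrary choices made here for repeatability and speed.

```python
import random


def get_drops(rolls, rng):
    """Count rolls, drawn uniformly between 0 and 100, that fall below 0.04."""
    drops = 0
    for _ in range(rolls):
        if rng.uniform(0.0, 100.0) < 0.04:
            drops += 1
    return drops


rng = random.Random(12345)  # fixed seed, chosen arbitrarily for repeatability
result = get_drops(200_000, rng)
print(result)  # expected to be in the vicinity of 200000 * 0.0004 = 80
```

Because both versions then compare a continuous uniform value against the same threshold, their drop counts should agree up to ordinary sampling noise.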

c++

python

random

random-seed