How to do a number of random trials efficiently?

Question

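Suppose an event succeeds with probability P (0 < P < 1). I have to run N trials to see whether it happens, and I want the total number of successes.

I could do this: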

int countSuccesses = 0;
while(N-- > 0)
{
   if(Random.NextDouble()<P) countSuccesses++; // NextDouble is from 0.0 to 1.0
}

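But is there a more efficient way to do this? I want a single formula so that I can use just one draw ~~random number~~ to determine the total number of successes. (EDIT: the idea behind using only one draw was to get below O(n).)

I want to be able to call a method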

GetSuccesses( n, P)

that is O(1).

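Update
I am going to try MathNet.Numerics.Distributions.Binomial.Sample(P, n). Even though it may use more than one random number, I am guessing it will be faster than O(n), even if it is not O(1). I will benchmark it. Many thanks to David and Rici.

Update
The binomial sample above turned out to be O(n), so it does not help me. But thanks to Fred's comment, I switched to
MathNet.Numerics.Distributions.Normal.Sample(mean, stddev) where
mean = n * P
stddev = Math.Sqrt(n * P * (1 - P)); which is now O(1)!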

Answer 1

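As your question is currently phrased, this is impossible.

You are essentially asking how to ensure that a single coin flip (i.e. one random outcome) comes out exactly 50% heads and 50% tails, which is impossible.

Even if you used two random numbers and expected one head and one tail, that test would fail 50% of the time (because you might get two heads or two tails).

Probability is built on the law of large numbers, which explicitly states that a small sample cannot be expected to reflect the expected outcome accurately.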

The LLN is important because it guarantees stable long-term results for the averages of some random events. For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. It is important to remember that the law only applies (as the name indicates) when a large number of observations is considered. There is no principle that a small number of observations will coincide with the expected value or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).

When I asked about this in a comment, you replied:

@Flater No, I am making N actual draws but with only one random number.

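But that does not make sense. If you use only one random value and keep reusing it, then every draw will obviously give you exactly the same result (the same number).

The closest I can come to interpreting your question in a way that is not impossible is that you are mistakenly referring to a single random seed as a single random number.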

A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator.

For a seed to be used in a pseudorandom number generator, it does not need to be random. Because of the nature of number generating algorithms, so long as the original seed is ignored, the rest of the values that the algorithm generates will follow probability distribution in a pseudorandom manner.
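To make the distinction between a seed and a draw concrete, here is a minimal sketch, assuming the generator is System.Random. The class and method names (SeedDemo, FirstDraws, SameSeedRepeats) are made up for illustration. One seed initializes the generator once; every value it then produces is a separate draw, and reusing the seed simply replays the same draws.

```csharp
using System;

static class SeedDemo
{
    // Returns the first `count` values produced by a generator created from `seed`.
    public static double[] FirstDraws(int seed, int count)
    {
        var rng = new Random(seed);          // the seed is consumed once, here
        var draws = new double[count];
        for (int i = 0; i < count; i++)
            draws[i] = rng.NextDouble();     // each call is a new draw
        return draws;
    }

    // True when two generators built from the same seed produce identical draws.
    public static bool SameSeedRepeats(int seed, int count)
    {
        double[] a = FirstDraws(seed, count);
        double[] b = FirstDraws(seed, count);
        for (int i = 0; i < count; i++)
            if (a[i] != b[i])
                return false;
        return true;
    }
}
```

Calling `SeedDemo.SameSeedRepeats(12345, 10)` should return true: ten draws were made in each run, but only one seed was involved.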

However, the expectation you explicitly state seems to contradict that assumption. You are looking to do something like this:

GetSuccesses( n, P, Random.NextDouble())

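And you also expect an O(1) operation, which flies in the face of the law of large numbers.

If you are actually talking about having a single random seed, then your expectations are incorrect.

If you make N draws, the operation is still O(N). Whether or not you reseed after every draw is irrelevant; it is always O(N).
GetSuccesses( n, P, Random.NextDouble()) gives you one draw, not one seed. Regardless of the terminology used, what you expect from the code has nothing to do with using the same seed for several draws.

As the question is currently phrased, what you want is impossible. Several commenters have repeatedly asked for clarification, but that has not yet produced a clearer picture.

As a side note, I find it very odd that you have replied to every comment except the ones that directly ask whether you are talking about a seed rather than a number (twice now).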

Answer 2

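I am not going to write the formula here, because it is already on the wiki and I do not really know how well the formatting here handles such things.

The probability of each outcome can be determined by the Bernoulli formula https://en.wikipedia.org/wiki/Bernoulli_trial

What you need to do is calculate the binomial coefficients; the probability calculation then becomes quite simple: multiply the binomial coefficient by the appropriate powers of p and q. Fill in the array P[0..n], which holds the probability of each outcome, i.e. of exactly i successes.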
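For reference, the formula this answer refers to can be written out as follows, taking q = 1 - p (the usual convention, assumed here because the answer does not define q). The second line is a standard recurrence that fills the array without computing large factorials.

```latex
P[i] = \Pr(X = i) = \binom{n}{i}\, p^{i}\, q^{\,n-i}, \qquad q = 1 - p, \quad i = 0, 1, \dots, n

P[0] = q^{n}, \qquad P[i+1] = P[i] \cdot \frac{n - i}{i + 1} \cdot \frac{p}{q}
```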

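Once that is set up, compute a rolling sum of the probabilities from 0 to n. Check the lower/upper bounds against the random value, and as soon as it falls inside the current interval, return the result.

So the deciding part looks like this: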

sum=0;
for (int i = 0; i <= n; i++)
  if (sum-eps < R && sum+P[i]+eps > R)
    return i;
  else
    sum+=P[i];

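Here eps is a small floating-point value used to overcome floating-point rounding issues, R is the saved random value, and P is the array of probabilities mentioned earlier.

Unfortunately, this method is not practical for large N (20 or 100+):

you will suffer quite heavily from rounding errors
the random number generator may not be precise enough to cover every possible outcome with the proper probability distribution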
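The array fill and the interval check described in this answer can be sketched as below. This is an illustration under stated assumptions, not the answerer's code: the names BinomialTable, BuildPmf and Lookup are hypothetical, q is taken to be 1 - p, and a simple "first running sum above r" test is swapped in for the eps comparison.

```csharp
using System;

static class BinomialTable
{
    // Fills pmf[i] with the probability of exactly i successes in n trials.
    // Uses pmf[i + 1] = pmf[i] * (n - i) / (i + 1) * p / q instead of factorials.
    // Assumes 0 < p < 1, as in the question.
    public static double[] BuildPmf(int n, double p)
    {
        double q = 1.0 - p;
        var pmf = new double[n + 1];
        pmf[0] = Math.Pow(q, n);
        for (int i = 0; i < n; i++)
            pmf[i + 1] = pmf[i] * (n - i) / (i + 1) * (p / q);
        return pmf;
    }

    // Maps one uniform value r in [0, 1) to a success count
    // by walking the running sum of the probabilities.
    public static int Lookup(double[] pmf, double r)
    {
        double sum = 0.0;
        for (int i = 0; i < pmf.Length; i++)
        {
            sum += pmf[i];
            if (r < sum)
                return i;
        }
        // Rounding can leave the total slightly below 1; fall back to the last outcome.
        return pmf.Length - 1;
    }
}
```

A call such as `BinomialTable.Lookup(BinomialTable.BuildPmf(10, 0.25), rng.NextDouble())`, with rng a System.Random, uses a single draw per result. The caveats above still apply: for large n, Math.Pow(q, n) underflows towards zero and the table loses its meaning.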

Answer 3

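As per @rici, for small N you can use the CDF or PMF of the binomial distribution and simply compare the random input against the probabilities of 0, 1, 2..N successes.

Something like: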

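    using System;
    using MathNet.Numerics.Distributions;

    static class Program
    {
        static void Main(string[] args)
        {
            var trials = 10;
            var trialProbability = 0.25;
            // Sweep the "random" input from 0 to 1 to show which count each value maps to.
            for (double p = 0; p <= 1; p += 0.01)
            {
                var i = GetSuccesses(trials, trialProbability, p);
                Console.WriteLine($"{i} Successes out of {trials} with P={trialProbability} at {p}");
            }
            Console.ReadKey();
        }

        // Walks the binomial PMF, subtracting each probability from rand,
        // until the remaining value fits inside the current outcome's probability.
        static int GetSuccesses(int N, double P, double rand)
        {
            for (int i = 0; i <= N; i++)
            {
                var p_of_i_successes = Binomial.PMF(P, N, i);
                if (p_of_i_successes >= rand)
                    return i;

                rand -= p_of_i_successes;
            }
            return N;
        }
    }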

Answer 4

I was pointed to
MathNet.Numerics.Distributions.Binomial.Sample(P, n)
A benchmark told me that it is also O(n) and performs on par with my original loop:

int countSuccesses = 0;
while(N-- > 0)
{
   if(Random.NextDouble()<P) countSuccesses++; // NextDouble is from 0.0 to 1.0
}

But thanks to Fred's comment:

You could turn that random number into a gaussian sample with mean N*P, which would have the same distribution as your initial function

I just switched to
MathNet.Numerics.Distributions.Normal.Sample(mean, stddev) where
mean = n * P
stddev = Math.Sqrt(n * P * (1 - P)); which is now O(1)!

The function I wanted is:

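    private int GetSuccesses(double p, int n)
    {
        double mean = n * p;
        double stddev = Math.Sqrt(n * p * (1 - p));
        // Random is assumed to be a System.Random field of the enclosing class.
        double hits = MathNet.Numerics.Distributions.Normal.Sample(Random, mean, stddev);
        // Clamp: a Gaussian sample can round to a value below 0 or above n.
        return (int)Math.Max(0, Math.Min(n, Math.Round(hits)));
    }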

As Paul pointed out, this is an approximation, but I am happy to accept that.
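For readers without MathNet, the same idea can be sketched with System.Random alone. This is an illustration, not the library's implementation: it swaps in the Box-Muller transform to produce the Gaussian sample (two uniform draws per call), and the class name NormalApproximation is made up. The normal approximation is generally considered reasonable only when n * p and n * (1 - p) are both comfortably large; for small n, the exact table lookup from the other answers is the safer choice.

```csharp
using System;

static class NormalApproximation
{
    // Approximates the number of successes in n trials with success probability p
    // by rounding one Gaussian sample with mean n*p and variance n*p*(1-p).
    public static int GetSuccesses(Random rng, double p, int n)
    {
        double mean = n * p;
        double stddev = Math.Sqrt(n * p * (1 - p));

        // Box-Muller transform: two uniform values give one standard normal value.
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], so Math.Log never sees 0
        double u2 = rng.NextDouble();
        double z = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);

        double hits = Math.Round(mean + stddev * z);

        // Clamp, because a Gaussian tail can fall outside the valid range [0, n].
        return (int)Math.Max(0.0, Math.Min(n, hits));
    }
}
```

A call such as `NormalApproximation.GetSuccesses(new Random(), 0.25, 1000000)` costs the same regardless of n, which is the O(1) behaviour the question asked for, at the price of being approximate.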

c#

algorithm

probability

formula