如何得到多次执行伯努利实验的详细结果(概率树)

How to get the detailed results (probability tree) of executing a bernoulli experiment a large number of times

假设如下实验:执行相同的伯努利试验(成功概率为P)N次

我需要以下信息:success/failure 的所有可能序列及其发生概率。

示例: 成功概率 P = 40% 的伯努利实验执行 3 次将产生以下结果(S 为成功,F 为失败):

FFF 0.216

旧金山 0.144

FSF 0.144

SSF 0.096

FFS 0.144

SFS 0.096

FSS 0.096

SSS 0.064

我试图暴力破解它来获得结果,但它在只有 N = 25 的情况下很快就卡住了,我得到一个 OutOfMemoryException...

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace ConsoleApplication
{
    class Program
    {
        static Dictionary<string, double> finalResultProbabilities = new Dictionary<string, double>();

        static void Main(string[] args)
        {
            // OutOfMemoryException if I set it to 25 :(
            //var nbGames = 25;
            var nbGames = 3;
            var probabilityToWin = 0.4d;

            CalculateAverageWinningStreak(string.Empty, 1d, nbGames, probabilityToWin);

            // Do something with the finalResultProbabilities data...
        }

        static void CalculateAverageWinningStreak(string currentResult, double currentProbability, int nbGamesRemaining, double probabilityToWin)
        {
            if (nbGamesRemaining == 0)
            {
                finalResultProbabilities.Add(currentResult, currentProbability);
                return;
            }

            CalculateAverageWinningStreak(currentResult + "S", currentProbability * probabilityToWin, nbGamesRemaining - 1, probabilityToWin);
            CalculateAverageWinningStreak(currentResult + "F", currentProbability * (1 - probabilityToWin), nbGamesRemaining - 1, probabilityToWin);
        }
    }
}

我需要能够及时支持到N=3000(任意P不到3秒出结果)

有没有一种数学方法可以最佳地做到这一点?

由于您对最长连胜的长度感兴趣,因此在试验的任何时候,只有两个有关历史的相关事实:(1) 到目前为止最长连胜的时长 (m) (2) 当前连胜多长时间 (k)。初始状态为 (0, 0)。转换是 (m, k) -> (max(m, k + 1), k + 1) 获胜, (m, k) -> (m, 0) 失败。一旦我们知道了所有最终状态的概率,我们就可以平均了。

编辑:此代码的修订版进行了优化,以减少极不可能发生的事件所需的计算。具体来说,我们忽略所有长度大于或等于某些 cutoff 的状态。给定绝对误差预算 abserr,我们确定我们可以从最终预期值计算中排除高达 abserr / n 的概率质量,因为可能的最长连胜是 n。开始连胜的位置少于 n 个,在每个位置,连胜长度 cutoff 的概率为 pwin**cutoff。使用粗联合边界,我们需要

n * pwin**cutoff <= abserr / n
pwin**cutoff <= abserr / n**2
log(pwin) * cutoff <= log(abserr / n**2)
cutoff >= log(abserr / n**2, pwin),

因为 log(pwin) < 0.

最后一个不等式反转了方向

尽管评估器质量很差(即 2014 年硬件上的 Python 3 解释器),但修改后的代码运行时间不到三秒。

import math


def avglongwinstreak(n, pwin, abserr=0):
    if n > 0 and pwin < 1 and abserr > 0:
        cutoff = math.log(abserr / n**2, pwin)
    else:
        cutoff = n + 1
    dist = {(0, 0): 1}
    for i in range(n):
        nextdist = {}
        for (m, k), pmk in dist.items():
            winkey = (max(m, k + 1), k + 1)
            if winkey[0] < cutoff:
                nextdist[winkey] = nextdist.get(winkey, 0) + pmk * pwin
            losskey = (m, 0)
            nextdist[losskey] = nextdist.get(losskey, 0) + pmk * (1 - pwin)
        dist = nextdist
    return sum(m * pmk for ((m, k), pmk) in dist.items())


print(avglongwinstreak(3000, 0.4, 1e-6))

这是一种不同的方法,足够精确和快速只是二次方。最长连胜的期望值等于

 n
sum Pr(there exists a win streak of length at least k).
k=1

我们对概率的推理如下。记录要么以长度 - k 连胜(概率 pwin**k)开始,要么以 j 次获胜开始,部分 j in 0..k-1 随后输掉(概率 pwin**j * (1 - pwin)),在这种情况下,概率等于在 n - (j + 1) 次尝试中连续 k 长度的概率。我们使用 memoization 来评估此逻辑在 pwinstreak 中暗示的重复性; fastpwinstreak 中较快的版本使用代数来避免重复求和。

def avglongwinstreak(n, pwin):
    return sum(fastpwinstreak(n, pwin, k) for k in range(1, n + 1))


def pwinstreak(n, pwin, k):
    memo = [0] * (n + 1)
    for m in range(k, n + 1):
        memo[m] = pwin**k + sum(pwin**j * (1 - pwin) * memo[m - (j + 1)]
                                for j in range(k))
    return memo[n]


def fastpwinstreak(n, pwin, k):
    pwink = pwin**k
    memo = [0] * (n + 1)
    windowsum = 0
    for m in range(k, n + 1):
        memo[m] = pwink + windowsum
        windowsum = pwin * windowsum + (1 - pwin) * (memo[m] - pwink *
                                                     memo[m - k])
    return memo[n]


print(avglongwinstreak(3000, 0.4))

允许错误的版本:

def avglongwinstreak(n, pwin, abserr=0):
    avg = 0
    for k in range(1, n + 1):
        p = fastpwinstreak(n, pwin, k)
        avg += p
        if (n - k) * p < abserr:
            break
    return avg


def fastpwinstreak(n, pwin, k):
    pwink = pwin**k
    memo = [0] * (n + 1)
    windowsum = 0
    for m in range(k, n + 1):
        memo[m] = pwink + windowsum
        windowsum = pwin * windowsum + (1 - pwin) * (memo[m] - pwink *
                                                     memo[m - k])
    return memo[n]


print(avglongwinstreak(3000, 0.4, 1e-6))