Solving a dice-throwing and coin-tossing problem using TensorFlow Probability: the variance is wrong

I am not very proficient in statistics and am trying to learn, so please bear with me. I saw this question on Quora, which basically states the following:

A fair die is rolled. If the result is an odd number, a fair coin is tossed 3 times. Otherwise, if the result is an even number, a fair coin is tossed 2 times. In both cases, the number of heads is counted. What is the variance of the number of heads obtained?

I want to solve it using Python and tf-probability. Here is what I did:

import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np
# Eager mode is the default in TF 2.x; the compat call also works on late TF 1.x
tf.compat.v1.enable_eager_execution()
probs = [1/6.] * 6

dices = tfp.distributions.Multinomial(total_count=1000, probs=probs)

n = dices.sample()

HEAD = 1
TAIL = 0
l = list(n.numpy())
heads_even = []
heads_odd = []
for i, nums in enumerate(l):
    mul_by = 3 if (i + 1) % 2 != 0 else 2
    tosses = tfp.distributions.Bernoulli(probs=0.5)
    coin_flip_data = tosses.sample(nums * mul_by)
    l2 = coin_flip_data.numpy()
    unique, counts = np.unique(l2, return_counts=True)
    head_tails = dict(zip(unique, counts))
    if (i + 1) % 2 != 0:
        heads_odd.append(head_tails[HEAD])
    else:
        heads_even.append(head_tails[HEAD])

total_heads = heads_odd + heads_even
final_nd_arr = np.array(total_heads)
print(final_nd_arr.var())

However, final_nd_arr.var() is of course nowhere near the actual answer of 0.68 (as people mentioned in the Quora answers); it comes out as 2089.805555555556.

I cannot figure out what I am doing wrong. How can I correct my mistake?

Any pointers would be helpful. Thanks a lot.

------------ EDIT

To provide more data:

dices.sample() => array([169., 173., 149., 171., 175., 163.], dtype=float32)
heads_odd => [266, 210, 259]
heads_even => [176, 167, 145]
total_heads => [266, 210, 259, 176, 167, 145]

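You are calculating the variance of the wrong distribution. The variance we are looking for applies to the experiment in which you roll the die over and over, count the number of heads after each roll, and then take the variance of those per-roll head counts. Your code does roll the die and flip the coins, but it sums the heads over all rolls that share the same die outcome and then takes the variance of those six sums.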
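As a sanity check on the target value, the exact variance can be worked out by hand with the law of total variance. The notation is introduced here and is not part of the original question: $D$ is the die outcome and $H$ is the number of heads, so that $H \mid D \text{ odd} \sim \mathrm{Binomial}(3, \tfrac12)$ and $H \mid D \text{ even} \sim \mathrm{Binomial}(2, \tfrac12)$, each case having probability $\tfrac12$.

```latex
\begin{align*}
\operatorname{Var}(H)
  &= \mathbb{E}\bigl[\operatorname{Var}(H \mid D)\bigr]
   + \operatorname{Var}\bigl(\mathbb{E}[H \mid D]\bigr) \\
\mathbb{E}\bigl[\operatorname{Var}(H \mid D)\bigr]
  &= \tfrac12 \cdot \tfrac34 + \tfrac12 \cdot \tfrac12 = \tfrac58 \\
\operatorname{Var}\bigl(\mathbb{E}[H \mid D]\bigr)
  &= \tfrac12 \left(\tfrac32\right)^2 + \tfrac12 \cdot 1^2
   - \left(\tfrac54\right)^2 = \tfrac1{16} \\
\operatorname{Var}(H)
  &= \tfrac58 + \tfrac1{16} = \tfrac{11}{16} = 0.6875
\end{align*}
```

This agrees with the value of roughly 0.68 quoted from the Quora answers, so a correct simulation should land close to 0.6875.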

The code below gives the correct result. I added some comments that hopefully clarify it:

import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np

# Eager mode is the default in TF 2.x; the compat call also works on late TF 1.x
tf.compat.v1.enable_eager_execution()

# Simulate the outcome of 1000 dice rolls
probs = [1/6.] * 6
dices = tfp.distributions.Multinomial(total_count=1000, probs=probs)
n = dices.sample()
l = list(n.numpy().astype(int))

L = []
# Loop over 6 possible dice outcomes
for i in range(len(l)):
    # Loop over the rolls for this dice outcome
    for _ in range(l[i]):
        # For each of the dice rolls,
        # flip a coin 3 times (odd outcome) or 2 times (even outcome)
        num_tosses = 3 if (i + 1) % 2 != 0 else 2
        tosses = tfp.distributions.Bernoulli(probs=0.5)
        coin_flip_data = tosses.sample(num_tosses)

        # And count the number of heads
        num_heads = np.sum(coin_flip_data.numpy())
        L.append(num_heads)

print(np.var(L))
# Example output (varies from run to run): 0.668999
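For comparison, the same experiment can also be simulated without the nested Python loops. The following is a minimal sketch that uses plain NumPy instead of tensorflow_probability, so it is a swapped-in technique rather than the method used above; the function name `simulate_heads`, the seed, and the sample size are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_heads(num_rolls, seed=0):
    """Return the number of heads observed after each simulated die roll."""
    rng = np.random.default_rng(seed)
    # One fair die roll per experiment, with values 1..6
    rolls = rng.integers(1, 7, size=num_rolls)
    # Odd outcome -> 3 coin tosses, even outcome -> 2 coin tosses
    num_tosses = np.where(rolls % 2 == 1, 3, 2)
    # Heads count for each roll: Binomial(num_tosses, 0.5), drawn element-wise
    return rng.binomial(num_tosses, 0.5)


heads = simulate_heads(200_000)
# Variance of the per-roll head counts; the exact value is 11/16 = 0.6875
print(heads.var())
```

Each entry of `heads` corresponds to one die roll, which is the distribution the question asks about. Taking the variance over these per-roll counts, rather than over six aggregated totals, is what brings the result close to 0.68.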