How to sample from multiple categorical distributions using Python

Question

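Let P be an array whose rows each sum to 1. How can I generate a matrix A such that:

A has the same dimensions as P, and A_{ij} equals 1 with probability P_{ij}
each row of A has exactly one entry equal to 1, and all other entries are zero

How can I do this in NumPy or SciPy?

I can do it with a for loop, but that is obviously slow. Is there a way to do it efficiently with NumPy? Or with Numba?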
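
For reference, the loop-based baseline mentioned above might look like the following sketch. The function name `sample_one_hot_loop` and the optional `rng` argument are hypothetical, chosen only for illustration; the question does not show its own loop.

```python
import numpy as np

def sample_one_hot_loop(P, rng=None):
    """Row-by-row baseline: draw one category per row of P and one-hot encode it."""
    rng = np.random.default_rng() if rng is None else rng
    P = np.asarray(P, dtype=float)
    A = np.zeros(P.shape, dtype=np.int64)
    for i, row in enumerate(P):
        # Draw a single column index j with probability row[j].
        j = rng.choice(row.shape[0], p=row)
        A[i, j] = 1
    return A

P = np.array([[1/4, 1/4, 1/4, 1/4],
              [1/3, 1/3, 1/6, 1/6]])
A = sample_one_hot_loop(P, rng=np.random.default_rng(0))
print(A)
```

Each row costs one Python-level call to `choice`, which is the overhead the question wants to avoid.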

Answer 1

This follows the Gumbel-max trick described on Wikipedia for sampling from a categorical distribution.

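import numpy as np
import numpy.random as rnd

# Gumbel-max trick: add independent Gumbel(0, 1) noise to log(P) and take the
# row-wise argmax; this draws one category per row with probabilities P[i, :].
A_as_numbers = np.argmax(np.log(P) + rnd.gumbel(size=P.shape), axis=1)
# Turn the sampled column indices into one-hot rows with the same shape as P.
A_one_hot = np.eye(P.shape[1])[A_as_numbers].reshape(P.shape)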

Tested with:

P = np.array([[1/4, 1/4, 1/4, 1/4], [1/3, 1/3, 1/6, 1/6]])

One possible result (the draw is random):

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.]])
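
A single draw cannot show that the sampling probabilities are right. The sketch below repeats the same Gumbel-max draw many times and compares the empirical frequencies with P. The number of repetitions, the seed, and the use of `np.random.default_rng` are assumptions made for this check, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[1/4, 1/4, 1/4, 1/4],
              [1/3, 1/3, 1/6, 1/6]])

n_draws = 20000
# Gumbel noise of shape (n_draws, rows, cols): independent noise for every
# entry of P in every repetition.
noise = rng.gumbel(size=(n_draws,) + P.shape)
# Row-wise argmax per repetition; samples has shape (n_draws, rows).
samples = np.argmax(np.log(P) + noise, axis=-1)

# Empirical frequency of each category, computed separately for each row of P.
n_rows, n_cols = P.shape
freq = np.stack([np.bincount(samples[:, i], minlength=n_cols) / n_draws
                 for i in range(n_rows)])
print(np.round(freq, 3))
```

The printed frequencies should be close to the entries of P, up to sampling noise.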

Answer 2

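OK, here is a version that uses np.random.choice, extended to 2D with np.apply_along_axis. Note that apply_along_axis still calls the function once per row in Python, so this is a convenience rather than a speed-up over an explicit loop.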

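import numpy as np

def f(p):
    # p is a single row of P: a 1-D probability vector that sums to 1.
    n = p.shape[0]
    a = np.zeros(n, dtype=np.int64)
    # Draw one category index according to p and set that entry to 1.
    q = np.random.choice(n, p=p)
    a[q] = 1
    return a

P = np.array([[1/4, 1/4, 1/4, 1/4],
              [1/3, 1/3, 1/6, 1/6]])

# Apply f to every row of P (axis=1).
r = np.apply_along_axis(f, 1, P)
print(r)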

Output from two runs:

[[0 0 0 1]
 [0 0 1 0]]

[[1 0 0 0]
 [0 1 0 0]]
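
Since apply_along_axis still loops over rows in Python, a fully vectorized alternative may be closer to what the question asks for. The sketch below is not taken from either answer: it uses inverse-CDF sampling (row-wise cumulative sums compared against one uniform draw per row). The function name `sample_one_hot_cdf` and the `rng` argument are hypothetical.

```python
import numpy as np

def sample_one_hot_cdf(P, rng=None):
    """Vectorized inverse-CDF sampling: one categorical draw per row of P."""
    rng = np.random.default_rng() if rng is None else rng
    P = np.asarray(P, dtype=float)
    n_rows, n_cols = P.shape
    cdf = np.cumsum(P, axis=1)
    # One uniform draw per row, scaled by the row total so that small rounding
    # errors in the row sums cannot push the draw past the last CDF entry.
    u = rng.random((n_rows, 1)) * cdf[:, -1:]
    # Number of CDF entries <= u, i.e. the index of the first entry > u.
    idx = (u >= cdf).sum(axis=1)
    # Guard against the edge case where rounding makes u equal the row total.
    idx = np.minimum(idx, n_cols - 1)
    A = np.zeros(P.shape, dtype=np.int64)
    A[np.arange(n_rows), idx] = 1
    return A

P = np.array([[1/4, 1/4, 1/4, 1/4],
              [1/3, 1/3, 1/6, 1/6]])
A = sample_one_hot_cdf(P, rng=np.random.default_rng(0))
print(A)
```

This needs only one uniform number per row, whereas the Gumbel-max approach draws one noise value per entry; which is faster in practice would have to be measured on the actual data.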
