Sampling from a discrete distribution in Julia

I want to sample repeatedly from a discrete distribution, getting a single number each time.

Here is some code that does what I am looking for:

const probabilities = [0.3, 0.3, 0.2, 0.15, 0.05]
const cummulative_probabilities = cumsum(probabilities)

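function pickone(cummulative_probabilities)
    n = length(cummulative_probabilities)
    i = 1
    r = rand()  # uniform draw in [0, 1)
    # advance until r falls below the running total, or the last index is reached
    while r >= cummulative_probabilities[i] && i < n
        i += 1
    end
    return i
end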

for i in 1:20
    println(pickone(cummulative_probabilities))
end
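
The loop in pickone is a linear scan over the cumulative probabilities. As a minimal sketch of the same idea, the scan could be replaced by a binary search with Base's searchsortedfirst. The name pickone_sorted and the variable cumulative are hypothetical, introduced only for this illustration. The result can differ from pickone only when r exactly equals one of the cumulative values, which is a negligible case.

```julia
# Hypothetical variant of pickone: binary search instead of a linear scan.
function pickone_sorted(cumulative)
    r = rand()                             # uniform draw in [0, 1)
    i = searchsortedfirst(cumulative, r)   # first index with cumulative[i] >= r
    return min(i, length(cumulative))      # clamp in case rounding leaves the last entry below r
end

cumulative = cumsum([0.3, 0.3, 0.2, 0.15, 0.05])
draws = [pickone_sorted(cumulative) for _ in 1:10^5]
```

For five outcomes the difference in speed is unlikely to matter; the binary search only starts to pay off when the distribution has many outcomes.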

The suggested alternative using Distributions doesn't cut it, because the closest I can get is the following code:

using Random
using Distributions

const probabilities = [0.3, 0.3, 0.2, 0.15, 0.05]
mnd = Multinomial(1, probabilities)

for i in 1:20
    println(rand(mnd))
end

Alas, in this case rand returns a whole vector containing a single 1, with zeros everywhere else.
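
For reference, here is a small sketch of two ways to get a single index while staying within Distributions. It assumes the same probability vector as above; the names p, onehot, idx_a, catdist and idx_b are hypothetical. The first option converts the one-hot vector from Multinomial into a position, and the second uses the package's Categorical distribution, whose samples are integers.

```julia
using Distributions

p = [0.3, 0.3, 0.2, 0.15, 0.05]

# Option A: keep Multinomial(1, p) and convert the one-hot vector to an index.
mnd = Multinomial(1, p)
onehot = rand(mnd)        # vector with a single 1 and zeros elsewhere
idx_a = argmax(onehot)    # position of the 1

# Option B: Categorical draws the index directly.
catdist = Categorical(p)
idx_b = rand(catdist)     # an integer in 1:length(p)

println(idx_a, " ", idx_b)
```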

It looks like you are after weighted sampling, which is implemented by sample in StatsBase.jl:

julia> using StatsBase

julia> using FreqTables

julia> proptable([pickone(cummulative_probabilities) for _ in 1:10^7])
5-element Named Array{Float64,1}
Dim1  │ 
──────┼──────────
1     │  0.300094
2     │       0.3
3     │  0.199871
4     │  0.150075
5     │ 0.0499595

julia> proptable([sample(Weights(probabilities)) for _ in 1:10^7])
5-element Named Array{Float64,1}
Dim1  │ 
──────┼──────────
1     │   0.29987
2     │  0.300086
3     │  0.199956
4     │  0.150184
5     │ 0.0499035
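
The comparison above draws one value per call inside a comprehension. As a rough sketch, sample can also be asked for many draws in one call by passing a count, and the frequencies can be checked with StatsBase's proportionmap instead of FreqTables. The names p, n, draws and freqs are hypothetical, and the sample size is chosen arbitrarily.

```julia
using StatsBase

p = [0.3, 0.3, 0.2, 0.15, 0.05]
n = 10^6

# One call returns a vector of n indices drawn with replacement.
draws = sample(1:length(p), Weights(p), n)

# proportionmap maps each observed value to its relative frequency.
freqs = proportionmap(draws)
for k in sort(collect(keys(freqs)))
    println(k, " => ", round(freqs[k], digits=4))
end
```

The printed proportions should land close to the entries of p, in the same way as the two tables above.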

The solution is to use weighted probability sampling.

Add the StatsBase package if you have not already done so:

using Pkg
Pkg.add("StatsBase")

Sampling:

using StatsBase
const probabilities = [0.3, 0.3, 0.2, 0.15, 0.05]
items = 1:length(probabilities)
weights = Weights(probabilities)

for i in 1:20
    println(sample(items, weights))
end
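
The items passed to sample do not have to be integers. A short sketch, assuming a made-up set of labels and weights (labels, w and picks are hypothetical names):

```julia
using StatsBase

labels = ["red", "green", "blue"]   # hypothetical items
w = Weights([0.5, 0.3, 0.2])        # one weight per item

# Each call returns one element of labels, chosen with the given weights.
picks = [sample(labels, w) for _ in 1:10]
println(picks)
```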