Simulating finding a randomly chosen number by asking binary questions
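As one of the problems in an assignment, I was asked to write an Octave function that simulates 1000 experiments of finding a random variable X with alphabet {0, 1, 2, 3} and pmf: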
Px(0) = 1/8
Px(1) = 1/4
Px(2) = 1/2
Px(3) = 1/8
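by asking a sequence of binary "yes" or "no" questions.
I have determined that the optimal sequence of binary questions for finding the value of X is simply to ask "Is X = p?", where p runs over the possible values in order of decreasing probability.
So the optimal order is:
Is X = 2?
If not:
Is X = 1?
If not:
Is X = 0?
If not, then X = 3.
Here is the function I wrote: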
function x = guessing_experiment(probabilities, n)
  % generates n simulations of finding a random number in an alphabet by asking binary questions,
  % where 'probabilities' is a list of the probabilities per number in the order the questions will be asked
  num_Qs = zeros(1,n); % allocate array of size n for number of questions asked per experiment
  [num_col, alphabet_size] = size(probabilities); % get size of alphabet
  for i = 1:n % generate n experiments
    Qs = 0; % number of questions asked in this experiment
    for j = 1:alphabet_size - 1 % iterate through questions
      question = rand; % generate random number in range (0, 1)
      Qs++; % increment number of questions asked
      if (question <= probabilities(j)) % if question produces a "yes" answer
        break;
      endif
    endfor
    num_Qs(i) = Qs; % store number of questions asked for this experiment
  endfor
  x = mean(num_Qs); % calculate mean number of questions asked over the n experiments
end
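It is called as guessing_experiment([1/2, 1/4, 1/8, 1/8], 1000),
where the array holds the probability of a "yes" answer for each question, in the order the questions are asked, and n is the number of experiments.
Asking these questions should take 1.75 questions on average, but my program always produces an average of ~1.87. Where is my script going wrong?
I assume it has something to do with generating a new random number to simulate each of the 3 questions asked.
I deleted my previous, incorrect answer, which claimed that your script was right and your calculation was wrong. On second thought, your calculation is right. I checked it myself with the following MATLAB script: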
% probabilities for each number
p = [1/8,1/4,1/2,1/8];
% sort them from higher to lower
p = sort(p,'descend');
% number of questions per probability
nq = 1:length(p)-1;
% the last question can distinguish between two variables
nq(end+1) = nq(end);
% number of trials
n = 100000;
% random sample number of questions
q = randsample(nq,n,true,p);
% mean number of questions
avgQ = mean(q)
The average I obtained is 1.75, just as you calculated.
I will take another look at your code to see what is wrong.
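For reference, the 1.75 figure can also be reproduced by hand. With the questions asked in the order 2, 1, 0, the value 2 takes one question, the value 1 takes two, and the values 0 and 3 both take three. Writing Q for the number of questions (a symbol introduced for this note, not used in the post):

$$
E[Q] = 1\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{4} + 3\cdot\left(\tfrac{1}{8}+\tfrac{1}{8}\right) = \tfrac{7}{4} = 1.75
$$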
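Edit
The problem with your script is that it ignores conditional probability: when asking a question about the variable, it disregards what you already know about it. For example, by the time you ask the third question, the probability that the value is 0 is not p = 1/8 but p = 1/2, because you already know that it is not 1 or 2.
The fix is to divide each probability by the total probability of the outcomes that are still possible, probabilities(j)/sum(probabilities(j:end)):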
n = 10000;
p = [1/8,1/4,1/2,1/8];
% sort them from higher to lower
probabilities = sort(p,'descend');
% merge the two least likely values: the last question settles both
probabilities(end-1) = probabilities(end-1) + probabilities(end);
probabilities(end) = [];
alphabet_size = numel(probabilities);
num_Qs = zeros(1,n); % allocate array of size n for number of questions asked per experiment
for i = 1:n % generate n experiments
  Qs = 0; % number of questions asked in this experiment
  for j = 1:alphabet_size % iterate through questions
    question = rand; % generate random number in range (0, 1)
    Qs = Qs + 1; % increment number of questions asked
    if question < probabilities(j)/sum(probabilities(j:end)) % if question produces a "yes" answer
      break;
    end
  end
  num_Qs(i) = Qs; % store number of questions asked for this experiment
end
x = mean(num_Qs)
x ~ 1.75
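The ~1.87 produced by the original function can be accounted for by hand as well. Assuming each question there is answered by an independent draw compared against the unconditional probabilities 1/2, 1/4, 1/8, and writing Q for the number of questions (notation introduced here), the loop stops after one question with probability 1/2, after two with probability (1/2)(1/4), and otherwise uses all three:

$$
E[Q_{\text{orig}}] = 1\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{2}\cdot\tfrac{1}{4} + 3\cdot\tfrac{1}{2}\cdot\tfrac{3}{4} = \tfrac{15}{8} = 1.875
$$

With the conditional probabilities 1/2, 1/2 and 1, the same sum becomes 1(1/2) + 2(1/4) + 3(1/4) = 1.75.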
The vector of conditional probabilities in this case is:
p = [1/8,1/4,1/2,1/8];
p = sort(p,'descend');
cond_p = p./cumsum(p,'reverse')
cond_p =
0.5000 0.5000 0.5000 1.0000
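Another way to sidestep the conditional-probability bookkeeping, and one that speaks to the suspicion about drawing a fresh random number per question, is to draw X once per experiment and then count how many questions that value needs. The sketch below is only an illustration under that assumption; the function name guessing_experiment_draw_once and the variable names are hypothetical and do not come from the post.

```octave
% Hypothetical helper (not from the original post): draw X once per
% experiment, then count the questions needed to identify it.
function avg = guessing_experiment_draw_once(probabilities, n)
  % probabilities: row vector, in the order the questions are asked
  k = numel(probabilities);
  edges = cumsum(probabilities);   % upper edge of each outcome's slice of [0, 1]
  edges(end) = 1;                  % guard against round-off in the last edge
  num_Qs = zeros(1, n);
  for i = 1:n
    idx = find(rand <= edges, 1);  % position of the drawn value in question order
    num_Qs(i) = min(idx, k - 1);   % a final "no" already identifies the last value
  end
  avg = mean(num_Qs);
end
```

Calling guessing_experiment_draw_once([1/2, 1/4, 1/8, 1/8], 100000) should give a value close to 1.75. Only one random number is used per experiment, so no conditioning is needed.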