How to randomly divide an integer into a fixed number of integers, such that the obtained tuples are uniformly distributed?
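Based on this reply: Random numbers that add to 100: Matlab
I tried applying the suggested method to randomly divide an integer into a fixed number of integers whose sum equals that integer. While the method appears to produce a uniformly distributed set of points when the values are not integers, in the integer case the resulting tuples are not obtained with equal probability.
The following R implementation shows this for a simple case, with 3 divisors and a dividend equal to 5: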
# Randomly divide an integer into a defined number of integers
# Goal: obtain with equal probability any combination of variable values, with the condition that sum(variables) = dividend.
# install.packages("rgl") # Install the rgl package if not yet installed. It provides the plot3d function used to create a 3D scatterplot.
library(rgl)
n_draws = 10000
n_variables = 3 # Number of divisors. These need to be randomly calculated. Their value must be in the interval [0:dividend] and their sum must be equal to the dividend. Two variables can have the same value.
dividend = 5 # Number that needs to be divided.
rand_variables = matrix(nrow = n_draws, ncol = n_variables) # This matrix contains the final values for each variable (one column per variable).
rand_samples = matrix(nrow = n_draws, ncol = n_variables-1) # This matrix contains the intermediate values that are used to randomly divide the dividend.
for (k in 1:n_draws){
  rand_samples[k,] = sample(x = c(0:dividend), size = n_variables-1, replace = TRUE) # Randomly select (n_variables - 1) values within the range 0:dividend. The values in rand_samples are uniformly distributed.
  midpoints = sort(rand_samples[k,])
  rand_variables[k,] = sample(diff(c(0, midpoints, dividend)), n_variables) # Calculate the values of each variable such that their sum is equal to the dividend.
}
plot3d(rand_variables) # Create a 3D scatterplot showing the values of rand_variables. This plot does not show how frequently each combination of values of the n_variables is obtained, only which combinations of values are possible.
table(data.frame(rand_variables)) # This prints out the count of each combination of values of n_variables. It shows that the combinations of values in the corners (e.g. (5,0,0)) are obtained less frequently than other combinations (e.g. (1,2,2)).
The last line gives the following output, showing how many times each combination of (X1, X2, X3) values satisfying X1 + X2 + X3 = 5 was obtained:
, , X3 = 0

   X2
X1    0   1   2   3   4   5
  0   0   0   0   0   0 397
  1   0   0   0   0 471   0
  2   0   0   0 469   0   0
  3   0   0 446   0   0   0
  4   0 456   0   0   0   0
  5 358   0   0   0   0   0

, , X3 = 1

   X2
X1    0   1   2   3   4   5
  0   0   0   0   0 450   0
  1   0   0   0 539   0   0
  2   0   0 560   0   0   0
  3   0 588   0   0   0   0
  4 426   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 2

   X2
X1    0   1   2   3   4   5
  0   0   0   0 428   0   0
  1   0   0 603   0   0   0
  2   0 549   0   0   0   0
  3 461   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 3

   X2
X1    0   1   2   3   4   5
  0   0   0 500   0   0   0
  1   0 549   0   0   0   0
  2 455   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 4

   X2
X1    0   1   2   3   4   5
  0   0 465   0   0   0   0
  1 458   0   0   0   0   0
  2   0   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 5

   X2
X1    0   1   2   3   4   5
  0 372   0   0   0   0   0
  1   0   0   0   0   0   0
  2   0   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0
As the output shows, combinations of values at the corners of the plane (e.g. (5,0,0)) are obtained less frequently than other tuples.
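The bias can also be checked exactly rather than by simulation. The following is a minimal sketch that assumes the same scheme as the loop above for this specific case (two cut points drawn with replacement from 0:5, sorted, differenced, then shuffled); the names cuts, parts, multiset_key and n_orderings are introduced here only for illustration.

```r
dividend <- 5

# All 36 ordered pairs of cut points drawn with replacement from 0:dividend;
# each pair has probability 1/36.
cuts <- expand.grid(a = 0:dividend, b = 0:dividend)

# Sort each pair and difference it, as in the loop above (one tuple per row).
parts <- t(mapply(function(a, b) diff(c(0, sort(c(a, b)), dividend)),
                  cuts$a, cuts$b))

# The final sample() call shuffles each tuple uniformly, so only the multiset
# of values matters: label every row by its sorted values.
multiset_key <- apply(parts, 1, function(p) paste(sort(p), collapse = ","))
multiset_prob <- vapply(split(rep(1 / nrow(cuts), nrow(cuts)), multiset_key),
                        sum, numeric(1))

# Number of distinct orderings of a multiset, e.g. 3 for "0,0,5", 6 for "0,1,4".
n_orderings <- function(key) {
  p <- strsplit(key, ",")[[1]]
  factorial(length(p)) / prod(factorial(as.vector(table(p))))
}

# Probability of any single ordered tuple belonging to each multiset.
tuple_prob <- multiset_prob /
  vapply(names(multiset_prob), n_orderings, numeric(1))

round(tuple_prob, 4)
1 / choose(dividend + 2, 2)  # uniform target: 1/21
```

Under these assumptions each corner tuple such as (5,0,0) has probability 1/27 (about 0.037) and each tuple such as (1,2,2) has probability 1/18 (about 0.056), against a uniform target of 1/21 (about 0.048), which is consistent with the counts in the table.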
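How can I obtain every integer tuple with equal probability?
I am looking for a solution that works for any positive integer and any number of divisors.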
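I think trying to hand-craft these combinations/permutations is reinventing the wheel. Efficient algorithms for doing this are implemented in the partitions package. For example,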
library(partitions) # compositions, parts, restrictedparts may be of interest
sample_size <- 1000
pool <- compositions(5, 3) # pool of possible tuples
samp <- sample(ncol(pool), sample_size, TRUE) # sample uniformly
## These are your sampled tuples, one per column
z <- matrix(pool[,samp], 3)
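The pool above holds every possible tuple, choose(n + k - 1, k - 1) of them for a dividend n and k parts (21 for the 5, 3 case), which can become large. If enumerating the pool is not practical, one alternative is to sample directly with the "stars and bars" construction. This is a base-R sketch, not part of the partitions package; sample_compositions is a hypothetical helper name. It assumes parts may be zero, as in the question.

```r
# Draw n_draws tuples of n_variables non-negative integers summing to dividend,
# each possible tuple with equal probability (stars and bars).
# Hypothetical helper, for illustration only.
sample_compositions <- function(n_draws, dividend, n_variables) {
  n_slots <- dividend + n_variables - 1  # stars plus bars
  draws <- vapply(seq_len(n_draws), function(i) {
    # Choose the positions of the (n_variables - 1) bars WITHOUT replacement.
    bars <- sort(sample.int(n_slots, n_variables - 1))
    # The number of stars between consecutive bars gives each part.
    diff(c(0, bars, n_slots + 1)) - 1
  }, numeric(n_variables))
  # One tuple per row.
  matrix(draws, ncol = n_variables, byrow = TRUE)
}

set.seed(42)
z2 <- sample_compositions(10000, dividend = 5, n_variables = 3)
table(apply(z2, 1, paste, collapse = ","))  # counts per tuple
```

Each set of bar positions corresponds to exactly one tuple, and every set is equally likely, so the tuples are uniform. The difference from the code in the question is that the bars are drawn without replacement from n + k - 1 slots rather than with replacement from 0:n.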
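Side note: don't use a data.frame; use a matrix to store a set of integers. A data.frame will be copied in full every time you modify something ([<-.data.frame is not a primitive), whereas a matrix will be modified in place.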