How can I randomly divide an integer into a fixed number of integers so that the resulting tuples are uniformly distributed?

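Based on this answer: Random numbers that add to 100: Matlab, I tried to apply the suggested method to randomly divide an integer into a fixed number of integers whose sum equals that integer. When the values are not integers, the method appears to produce uniformly distributed points, but in the integer case the resulting tuples are not obtained with equal probability. The following R implementation shows this for a simple case with 3 divisors and a dividend of 5: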

# Randomly divide an integer into a defined number of integers 
# Goal: obtain with equal probability any combination of variable values, with the condition that sum(variables) = dividend.

# install.packages("rgl")  # Install the rgl package if not yet installed. It provides the plot3d function used to create a 3D scatterplot.
library(rgl)

n_draws = 10000
n_variables = 3  # Number of divisors. These need to be randomly calculated. Each value must be an integer in the interval [0, dividend] and their sum must be equal to the dividend. Two variables can have the same value.
dividend = 5  # Number that needs to be divided.
rand_variables = matrix(nrow = n_draws, ncol = n_variables)  # This matrix contains the final values for each variable (one column per variable).
rand_samples = matrix(nrow = n_draws, ncol = n_variables-1)  # This matrix contains the intermediate values that are used to randomly divide the dividend.

for (k in 1:n_draws){
    rand_samples[k,] = sample(x = c(0:dividend), size = n_variables-1, replace = TRUE)  # Randomly select (n_variables - 1) values within the range 0:dividend. The values in rand_samples are uniformly distributed.
    midpoints = sort(rand_samples[k,])
    rand_variables[k,] = sample(diff(c(0, midpoints, dividend)), n_variables)  # Calculate the values of each variable such that their sum is equal to the dividend.
}

plot3d(rand_variables)  # Create a 3D scatterplot showing the values of rand_variables. This plot does not show how frequently each combination of values of the n_variables is obtained, only which combinations of values are possible.

table(data.frame(rand_variables))  # This prints out the count of each combination of values of n_variables. It shows that the combinations of values in the corners (e.g. (5,0,0)) are obtained less frequently than other combinations (e.g. (1,2,2)).

The last line gives the following output, which shows how many times each combination of (X1, X2, X3) values satisfying X1 + X2 + X3 = 5 was obtained:

, , X3 = 0

   X2
X1    0   1   2   3   4   5
  0   0   0   0   0   0 397
  1   0   0   0   0 471   0
  2   0   0   0 469   0   0
  3   0   0 446   0   0   0
  4   0 456   0   0   0   0
  5 358   0   0   0   0   0

, , X3 = 1

   X2
X1    0   1   2   3   4   5
  0   0   0   0   0 450   0
  1   0   0   0 539   0   0
  2   0   0 560   0   0   0
  3   0 588   0   0   0   0
  4 426   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 2

   X2
X1    0   1   2   3   4   5
  0   0   0   0 428   0   0
  1   0   0 603   0   0   0
  2   0 549   0   0   0   0
  3 461   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 3

   X2
X1    0   1   2   3   4   5
  0   0   0 500   0   0   0
  1   0 549   0   0   0   0
  2 455   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 4

   X2
X1    0   1   2   3   4   5
  0   0 465   0   0   0   0
  1 458   0   0   0   0   0
  2   0   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

, , X3 = 5

   X2
X1    0   1   2   3   4   5
  0 372   0   0   0   0   0
  1   0   0   0   0   0   0
  2   0   0   0   0   0   0
  3   0   0   0   0   0   0
  4   0   0   0   0   0   0
  5   0   0   0   0   0   0

As the output shows, the combinations at the corners of the plane (e.g. (5,0,0)) are obtained less frequently than the other tuples.
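
To check that this is not just sampling noise, the exact probabilities can be enumerated instead of simulated. The sketch below assumes the same procedure as the loop above (two cut points drawn with replacement from 0:5, then a random ordering of the three differences); the names cuts, perms, keys and exact_prob are introduced here only for illustration.

```r
# Exact distribution of the cut-point method for dividend = 5 and 3 parts.
dividend <- 5

# All 36 equally likely ordered pairs of cut points (drawn with replacement).
cuts <- expand.grid(a = 0:dividend, b = 0:dividend)

# The 6 equally likely orderings applied by the final sample() call.
perms <- rbind(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
               c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))

keys <- character(0)
for (i in seq_len(nrow(cuts))) {
  # Differences between sorted cut points, as in the simulation loop.
  parts <- diff(c(0, sort(c(cuts$a[i], cuts$b[i])), dividend))
  for (j in seq_len(nrow(perms))) {
    keys <- c(keys, paste(parts[perms[j, ]], collapse = ","))
  }
}

# Each of the 36 * 6 = 216 (cut pair, ordering) outcomes is equally likely.
exact_prob <- table(keys) / length(keys)
round(exact_prob[c("5,0,0", "1,2,2")], 4)
```

Under this enumeration (5,0,0) has probability 1/27 and (1,2,2) has probability 1/18, whereas a uniform draw over the 21 possible tuples would give 1/21 to each. That is consistent with the counts in the table above.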

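How can I obtain every integer tuple with the same probability?

I am looking for a solution that works for any positive integer and any number of divisors.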

I think trying to build these combinations/permutations by hand is reinventing the wheel. Efficient algorithms for this are implemented in the partitions package. For example,

library(partitions)                            # compositions, parts, restrictedparts may be of interest
sample_size <- 1000
pool <- compositions(5, 3)                     # pool of possible tuples
samp <- sample(ncol(pool), sample_size, TRUE)  # sample uniformly

## These are your sampled tuples, one per column
z <- matrix(pool[,samp], 3)
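
If the dividend or the number of divisors is large, the pool of all tuples may be too big to enumerate, since it has choose(dividend + n_variables - 1, n_variables - 1) columns. A possible alternative, not taken from the partitions package, is the classic "stars and bars" construction: pick the bar positions without replacement and read off the gaps. The following is a minimal sketch under that assumption; random_composition is a hypothetical helper name.

```r
# Draw one tuple of n_variables non-negative integers summing to dividend,
# uniformly over all such tuples (stars-and-bars sketch, base R only).
random_composition <- function(dividend, n_variables) {
  # Lay out dividend stars and (n_variables - 1) bars in n_slots positions.
  n_slots <- dividend + n_variables - 1
  # Choose the bar positions uniformly, WITHOUT replacement.
  bars <- sort(sample.int(n_slots, n_variables - 1))
  # The number of stars between consecutive bars gives each part.
  diff(c(0, bars, n_slots + 1)) - 1
}

set.seed(1)  # only to make the illustration reproducible
draws <- t(replicate(10000, random_composition(5, 3)))  # one tuple per row
counts <- table(apply(draws, 1, paste, collapse = ","))
```

Every set of bar positions corresponds to exactly one tuple, so each tuple should be equally likely. The key difference from the approach in the question is that the positions are drawn without replacement from an enlarged range, rather than with replacement from 0:dividend.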

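Side note: don't use a data.frame; use a matrix to store a set of integers. A data.frame is copied in full every time you modify something ([<-.data.frame is not a primitive), whereas a matrix is modified in place.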