Is there a function to determine a dataset (discrete scale, 1 to 5) if the number of data points, mean, and standard deviation are known?

Question

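My question is very similar to this one: "Reverse" statistics. However, that asker wanted to generate a normal (and random) distribution of arbitrary size that fits a specific mean and standard deviation. Suppose instead that we know more than the mean and standard deviation: we also know the number of data points and the discrete scale the values fall on.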

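So I really have two questions. First, given that we know

the mean
the standard deviation
n (the number of data points)
a discrete scale of 1 to 5 (i.e. values can only be 1, 2, 3, 4, or 5)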

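...is it possible to determine the exact dataset? For example, if we know there are 5 data points on a 1-5 Likert scale with a mean of 4.40 and a standard deviation of 1.20, can we work out that the dataset is {5, 5, 5, 5, 2} (the order of the values doesn't matter)?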
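As a quick sanity check of the example, the stated figures can be reproduced with Python's statistics module. This is only a small sketch, and it assumes the reported standard deviation is the population SD (pstdev); the sample SD of the same data would round to 1.34 instead.

```python
from statistics import mean, pstdev, stdev

data = [5, 5, 5, 5, 2]

# Mean: 22 / 5 = 4.4
print(round(mean(data), 2))    # 4.4
# Population SD (divide by n): sqrt(7.2 / 5) = 1.2
print(round(pstdev(data), 2))  # 1.2
# Sample SD (divide by n - 1): sqrt(7.2 / 4), about 1.34
print(round(stdev(data), 2))   # 1.34
```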

Second, is there already a function that solves this automatically?

Answer 1

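Since the datasets I want to identify are fairly small (n < 30), a friend suggested using itertools' combinations_with_replacement() to generate every possible dataset, and then writing a function that keeps the ones matching my parameters.

Here is the final code.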

from itertools import combinations_with_replacement
from statistics import pstdev

# Find the exact data set(s), given that we know:
#   N             int, number of data points
#   MU            float, mean (rounded to 2 decimals)
#   SIGMA         float, population standard deviation (rounded to 2 decimals)
#   discreteScale list, all possible values of a data point
# Returns a list of tuples: every multiset of N values that matches the criteria.

def find_DataSet(N, MU, SIGMA, discreteScale):
    possibleCombs = combinations_with_replacement(discreteScale, N)
    
    container = []
    for dataSet in possibleCombs:
        mu = sum(dataSet)/len(dataSet)
        roundMu = round(mu, 2)
        
        sigma = pstdev(dataSet)
        roundSigma = round(sigma, 2)
        
        if ((roundMu == MU) and (roundSigma == SIGMA)):
            container.append(dataSet)
        
    return container
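
Brute force is workable here because order doesn't matter: the candidates are multisets, and there are C(N + k - 1, k - 1) of them for k scale values, far fewer than the k^N ordered sequences. A small sketch of that count follows; the helper name count_candidates is made up for illustration, and math.comb needs Python 3.8 or later.

```python
from math import comb

def count_candidates(n, k):
    """Number of multisets of size n drawn from k distinct values."""
    return comb(n + k - 1, k - 1)

print(count_candidates(20, 5))  # 10626 unordered candidates
print(count_candidates(29, 5))  # 40920, the largest case for n < 30
print(5 ** 20)                  # ordered sequences, about 9.5e13
```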

Example output:

result = find_DataSet(20, 4.50, 0.81, [1, 2, 3, 4, 5])
print(result)

[(2, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), (3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5)]
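
The same brute-force idea can be applied to the 5-point example from the question. The sketch below is a compact restatement of the approach above, not a library function: the name find_datasets and the decimals parameter are made up for illustration, and it assumes the reported mean and SD were rounded to two decimals and that the SD is the population SD.

```python
from itertools import combinations_with_replacement
from statistics import pstdev

def find_datasets(n, mu, sigma, scale, decimals=2):
    """Return every multiset of n values from scale whose rounded
    mean and population SD equal mu and sigma (hypothetical helper)."""
    matches = []
    for candidate in combinations_with_replacement(scale, n):
        if (round(sum(candidate) / n, decimals) == mu
                and round(pstdev(candidate), decimals) == sigma):
            matches.append(candidate)
    return matches

print(find_datasets(5, 4.40, 1.20, [1, 2, 3, 4, 5]))
# [(2, 5, 5, 5, 5)]
```

Only one multiset matches in this case, so that particular dataset can be recovered. The 20-point example has two matches, so the summary statistics do not pin down the dataset in general.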

language-agnostic

statistics

function