Formula to recalculate population variance after removing a value
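Suppose I have the data set {10, 20, 30}. Its mean and variance are mean = 20 and variance = 66.667. If I remove 10 from the data set, turning it into {20, 30}, is there a formula that lets me calculate the new variance?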
This is similar to https://math.stackexchange.com/questions/3112650/formula-to-recalculate-variance-after-removing-a-value-and-adding-another-one-gi which deals with the case when there is replacement. https://math.stackexchange.com/questions/775391/can-i-calculate-the-new-standard-deviation-when-adding-a-value-without-knowing-t is also a similar question, except that it deals with adding a value instead of removing one. There is also a similar question that deals with removing a value from a sample, but I don't know how to modify it to handle a population.
To calculate Mean and Variance we need 3 parameters:
N - number of items
Sx - sum of items
Sxx - sum of squared items
Given all of these values, we can find the mean and variance as
Mean = Sx / N
Variance = Sxx / N - Sx * Sx / N / N
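For reference, the variance line above is the usual population-variance identity. Written out, with x_i as an assumed name for the individual items and \mu for Mean:

```latex
\sigma^2
  = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
  = \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \mu^2
  = \frac{S_{xx}}{N} - \left(\frac{S_x}{N}\right)^2,
\qquad \mu = \frac{S_x}{N}
```

The division is by N, not N - 1, which is what makes this the population variance rather than the sample variance.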
In your case
items = {10, 20, 30}
N = 3
Sx = 60 = 10 + 20 + 30
Sxx = 1400 = 100 + 400 + 900 = 10 * 10 + 20 * 20 + 30 * 30
Mean = 60 / 3 = 20
Variance = 1400 / 3 - 60 * 60 / 3 / 3 = 66.666667
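The same computation can be sketched in Python. This is only an illustration of the formulas above; the function name population_stats is made up for this example.

```python
def population_stats(items):
    """Return (N, Sx, Sxx, mean, variance) for a non-empty list of numbers."""
    n = len(items)                      # N   - number of items
    sx = sum(items)                     # Sx  - sum of items
    sxx = sum(x * x for x in items)     # Sxx - sum of squared items
    mean = sx / n
    variance = sxx / n - (sx / n) ** 2  # population variance (divides by N)
    return n, sx, sxx, mean, variance


n, sx, sxx, mean, variance = population_stats([10, 20, 30])
print(n, sx, sxx, mean, round(variance, 6))  # 3 60 1400 20.0 66.666667
```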
If you want to remove an item, just update the N, Sx, Sxx values and compute the new variance:
item = 10
N' = N - 1 = 3 - 1 = 2
Sx' = Sx - item = 60 - 10 = 50
Sxx' = Sxx - item * item = 1400 - 10 * 10 = 1300
Mean' = Sx' / N' = 50 / 2 = 25
Variance' = Sxx' / N' - Sx' * Sx' / N' / N' = 1300 / 2 - 50 * 50 / 2 / 2 = 25
So, if you remove item = 10, the new mean and variance will be
Mean' = 25
Variance' = 25
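A minimal Python sketch of the removal step, assuming you keep N, Sx and Sxx around. The function names are hypothetical. The second function covers the situation in the question, where only N, the mean and the variance are known: rearranging the formulas above gives Sx = N * Mean and Sxx = N * (Variance + Mean * Mean), so the sums can be recovered first.

```python
def remove_item(n, sx, sxx, item):
    """Update the running sums after removing one item.

    Returns (N', Sx', Sxx', Mean', Variance').
    """
    if n < 2:
        raise ValueError("need at least two items to remove one")
    n2 = n - 1
    sx2 = sx - item
    sxx2 = sxx - item * item
    mean2 = sx2 / n2
    var2 = sxx2 / n2 - mean2 * mean2
    return n2, sx2, sxx2, mean2, var2


def remove_item_from_moments(n, mean, variance, item):
    """Same update when only N, the mean and the population variance are known."""
    sx = n * mean                       # recover Sx from Mean = Sx / N
    sxx = n * (variance + mean * mean)  # recover Sxx from the variance formula
    _, _, _, mean2, var2 = remove_item(n, sx, sxx, item)
    return mean2, var2


print(remove_item(3, 60, 1400, 10))  # (2, 50, 1300, 25.0, 25.0)
```

One caveat, as far as I know: with floating-point data whose mean is very large compared to its spread, subtracting two nearly equal numbers in the variance line can lose precision, so this approach works best for moderately sized values.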