R: z-score normalization
I want to z-score normalize each row of a matrix in R. I am using the normalize function from the som package, which works well for this purpose:
library(som)
training <- matrix(seq(1:20), ncol = 10)
training
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 3 5 7 9 11 13 15 17 19
[2,] 2 4 6 8 10 12 14 16 18 20
training_zscore <- normalize(training, byrow=TRUE)
training_zscore
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
[2,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
Suppose I now have another matrix, for example:
validation <- matrix(seq(1:20)*2, ncol = 10)
validation
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 2 6 10 14 18 22 26 30 34 38
[2,] 4 8 12 16 20 24 28 32 36 40
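I also want to z-score transform this new matrix. However, the scaling should be the same as the one used for the training z-score matrix. How can I do that?
If I simply run a separate z-score normalization, I get the following output: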
validation_zscore <- normalize(validation, byrow=TRUE)
validation_zscore
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
[2,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
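But this is not what I want. For example, in the training matrix the value "10" is transformed to a z-score of "-0.1651446". The same should hold in the validation matrix (here, however, 10 is transformed to a z-score of "-0.8257228").
Thanks for your help!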
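The question is not entirely clear, but I assume you want each row of validation to be normalized using the corresponding row of training as the "reference". If so, you can use base::scale and pass the training means and standard deviations explicitly. As an aside, is there a particular reason to use som::normalize?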
training <- matrix(seq(1:20), ncol = 10)
training_zscore <- t(scale(t(training)))
training_zscore
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
# [2,] -1.486301 -1.156012 -0.8257228 -0.4954337 -0.1651446 0.1651446 0.4954337 0.8257228 1.156012 1.486301
# attr(,"scaled:center")
# [1] 10 11
# attr(,"scaled:scale")
# [1] 6.055301 6.055301
validation <- matrix(seq(1:20)*2, ncol = 10)
# Reuse the row means and row standard deviations of training
validation_zscore <- t(scale(t(validation), center = rowMeans(training),
                             scale = apply(training, 1, sd)))
validation_zscore
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] -1.321157 -0.6605783 0.0000000 0.6605783 1.321157 1.981735 2.642313 3.302891 3.963470 4.624048
# [2,] -1.156012 -0.4954337 0.1651446 0.8257228 1.486301 2.146879 2.807458 3.468036 4.128614 4.789192
# attr(,"scaled:center")
# [1] 10 11
# attr(,"scaled:scale")
# [1] 6.055301 6.055301
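If you need to apply the same scaling to several new matrices, it may be convenient to store the training parameters once and reuse them. The following is only a minimal sketch of that idea: fit_row_zscore and apply_row_zscore are hypothetical helper names (they are not part of base R or som), and it assumes every new matrix has the same number of rows, in the same order, as training.

```r
# Hypothetical helpers: learn row-wise parameters on the training matrix,
# then reuse them on any matrix with the same rows.
fit_row_zscore <- function(x) {
  list(center = rowMeans(x), scale = apply(x, 1, sd))
}

apply_row_zscore <- function(x, params) {
  # Assumption: row i of x corresponds to row i of the training matrix
  stopifnot(nrow(x) == length(params$center))
  # sweep() with MARGIN = 1 applies the i-th stored value to row i
  centered <- sweep(x, 1, params$center, "-")
  sweep(centered, 1, params$scale, "/")
}

training   <- matrix(1:20, ncol = 10)
validation <- matrix((1:20) * 2, ncol = 10)

params       <- fit_row_zscore(training)
training_z   <- apply_row_zscore(training, params)
validation_z <- apply_row_zscore(validation, params)

# The value 12 sits in row 2 of both matrices (column 6 in training,
# column 3 in validation), so both entries get the same z-score.
all.equal(training_z[2, 6], validation_z[2, 3])
```

Note that under this row-wise interpretation a given value only maps to the same z-score when it appears in the same row of both matrices, because each row has its own mean and standard deviation.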