为什么 "Varimax" 和 "varimax" 之间的 psych::principal 有差异?
Why are there differences in psych::principal between "Varimax" and "varimax"?
在related question中,我曾问过为什么stats::varimax
和GPArotation::Varimax
之间存在差异,两者都是psych::principal
调用,具体取决于为[设置的选项=17=].
这两者之间的差异(参见其他问题)解释了与 psych::principal
的部分差异,但不是全部差异。
psych::principal
似乎 加剧了 这些差异。
(我有一个简单的理论为什么,我想得到证实)。
library(GPArotation)
library(psych)
data("Thurstone")
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none") # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)
# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings) # very small differences, see this question:
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05
principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ)) # principal has MUCH bigger differences
#> [1] -0.02115263
这对于浮点人工制品或其他东西来说似乎太大了。
我的假设是(额外的)差异源于 GPArotation::Varimax
默认为 (Kaiser) normalize == FALSE
,而 **stats::varimax
默认到 (Kaiser) normalize == TRUE
,不能 与 `principal::psych`` 中的设置不同。
stats::varimax
手册:
## varimax with normalize = TRUE is the default
GPArotation::Varimax
/ GPArotation::GPForth
手册:
The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)
此外,他们 psych::Kaiser
手册警告:
The GPArotation package does not (by default) normalize, nor does the fa function. Then, to make it more confusing, varimax in stats does,Varimax in GPArotation does not.
谁能确认差异实际上是由归一化选项造成的?
这似乎证实了应用 psych::kaiser
(我认为它是为此目的而构建)将差异缩小回 stats::varimax
和 GPArotation::Varimax
之间的原始差异:
principal.Varimax.GPA.kaiser <- kaiser(f = principal.unrotated, rotate = "Varimax")
diff.statsvari.gpavar.bothkaiser <- unclass(principal.Varimax.GPA.kaiser$loadings - principal.varimax.stats$loadings)
mean(abs(diff.statsvari.gpavar.bothkaiser))
#> [1] 8.036863e-05
这几乎是相同的结果,所以我认为假设得到证实。
来自 psych::principal
的更大差异是因为 normalize
的默认值不同。
更新
对于各自的旋转矩阵(或 Th
是什么),差异也(再次)小得多:
principal.Varimax.GPA.kaiser$Th - principal.varimax.stats$rot.mat # those differences are very small now, too
#> [,1] [,2] [,3] [,4]
#> [1,] 1.380279e-04 1.380042e-05 -2.214319e-04 -2.279170e-06
#> [2,] 9.631517e-05 2.391296e-05 1.531723e-04 -3.371868e-05
#> [3,] 1.758299e-04 7.917460e-05 6.788867e-05 1.099072e-04
#> [4,] -9.548010e-05 6.500162e-05 -1.679753e-05 -5.213475e-05
由于程序的精度不同而出现加载不同的问题(当然,由于在psych::principal中没有评估normalize
选项,所有其他程序必须与该选项一起使用切换为 TRUE)。虽然可以配置 stats::varimax 和 GPArotation::Varimax 中的精度(参数 eps
),但在 psych::principal 中忽略了这一点,并且似乎隐含地等于 stats::varimax with eps =1e-5。
如果我们将 stats::varimax 和 GPArotations::Varimax 的准确度提高到 eps=1e-15
,那么我们会得到与我的 MatMate 中同时实现的结果相同的结果(至少最多 8 位数字)-我测试过的程序也非常准确地等于 SPSS 计算。
psych::principal 中缺少对显式选项 eps
的处理似乎是一个错误,其糟糕的隐式默认值肯定不能令人满意。
有趣的是,GPArotation::Varimax 需要与 eps=1e-15
进行多次旋转,请参阅最后的输出;因此,要么实施了另一个内部程序,要么在决定何时可以停止迭代时对 eps 参数进行不同的评估。一个示例(见本答案末尾处)可能暗示了这样一种效果。
低于比较协议;仅显示载荷的前两行。
The first three computations with low accuracy (eps=1e-5)
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings *1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
The second three computations with (attempted) high accuracy (eps=1e-15)
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000
尝试 little/default 具有大 eps
或没有
的准确结果
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871717.3 216618.7 198176.3 175204.47
Vocabulary 855663.1 294146.3 152930.7 180517.21
# =============================================================================
现在尝试用更小的eps
获得更准确的结果
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings *1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.69 198174.31 175207.86
Vocabulary 855662.58 294147.47 152928.77 180519.37
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.7 198174.31 175207.86
Vocabulary 855662.6 294147.5 152928.77 180519.37
Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize, :
convergence not obtained in GPForth. 1000 iterations used.
# Result by MatMate: --------------------------------------------------------
lad = cholesky(Thurstone)
pc = rot(lad,"pca")
pc4 = pc[*,1..4] // arrive at the first four pc's
t = gettrans( normzl(pc4),"varimax") // get rotation-matrix for row-normalized pc's
vmx = pc4 * t // rotate pc4 by rotation-matrix
display = vmx * 1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.68 198174.31 175207.87
Vocabulary 855662.58 294147.46 152928.77 180519.37
# ===============================================================================>
通过在 stat:: 中将 eps 设置为 1e-12 并在 GPArotation 中设置为 1e-6 可以实现 stats::varimax 和 GPArotation::Varimax 结果的更好匹配,其中一个值为对方的广场。然后我们得到
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.8 198174.49 175207.63
Vocabulary 855662.5 294147.6 152928.94 180519.21
> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.80 216619.74 198174.40 175207.85
Vocabulary 855662.55 294147.52 152928.86 180519.36
在related question中,我曾问过为什么stats::varimax
和GPArotation::Varimax
之间存在差异,两者都是psych::principal
调用,具体取决于为[设置的选项=17=].
这两者之间的差异(参见其他问题)解释了与 psych::principal
的部分差异,但不是全部差异。
psych::principal
似乎 加剧了 这些差异。
(我有一个简单的理论为什么,我想得到证实)。
library(GPArotation)
library(psych)
data("Thurstone")
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none") # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)
# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings) # very small differences, see this question:
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05
principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ)) # principal has MUCH bigger differences
#> [1] -0.02115263
这对于浮点人工制品或其他东西来说似乎太大了。
我的假设是(额外的)差异源于 GPArotation::Varimax
默认为 (Kaiser) normalize == FALSE
,而 **stats::varimax
默认到 (Kaiser) normalize == TRUE
,不能 与 `principal::psych`` 中的设置不同。
stats::varimax
手册:
## varimax with normalize = TRUE is the default
GPArotation::Varimax
/ GPArotation::GPForth
手册:
The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)
此外,他们 psych::Kaiser
手册警告:
The GPArotation package does not (by default) normalize, nor does the fa function. Then, to make it more confusing, varimax in stats does,Varimax in GPArotation does not.
谁能确认差异实际上是由归一化选项造成的?
这似乎证实了应用 psych::kaiser
(我认为它是为此目的而构建)将差异缩小回 stats::varimax
和 GPArotation::Varimax
之间的原始差异:
principal.Varimax.GPA.kaiser <- kaiser(f = principal.unrotated, rotate = "Varimax")
diff.statsvari.gpavar.bothkaiser <- unclass(principal.Varimax.GPA.kaiser$loadings - principal.varimax.stats$loadings)
mean(abs(diff.statsvari.gpavar.bothkaiser))
#> [1] 8.036863e-05
这几乎是相同的结果,所以我认为假设得到证实。
来自 psych::principal
的更大差异是因为 normalize
的默认值不同。
更新
对于各自的旋转矩阵(或 Th
是什么),差异也(再次)小得多:
principal.Varimax.GPA.kaiser$Th - principal.varimax.stats$rot.mat # those differences are very small now, too
#> [,1] [,2] [,3] [,4]
#> [1,] 1.380279e-04 1.380042e-05 -2.214319e-04 -2.279170e-06
#> [2,] 9.631517e-05 2.391296e-05 1.531723e-04 -3.371868e-05
#> [3,] 1.758299e-04 7.917460e-05 6.788867e-05 1.099072e-04
#> [4,] -9.548010e-05 6.500162e-05 -1.679753e-05 -5.213475e-05
由于程序的精度不同而出现加载不同的问题(当然,由于在psych::principal中没有评估normalize
选项,所有其他程序必须与该选项一起使用切换为 TRUE)。虽然可以配置 stats::varimax 和 GPArotation::Varimax 中的精度(参数 eps
),但在 psych::principal 中忽略了这一点,并且似乎隐含地等于 stats::varimax with eps =1e-5。
如果我们将 stats::varimax 和 GPArotations::Varimax 的准确度提高到 eps=1e-15
,那么我们会得到与我的 MatMate 中同时实现的结果相同的结果(至少最多 8 位数字)-我测试过的程序也非常准确地等于 SPSS 计算。
psych::principal 中缺少对显式选项 eps
的处理似乎是一个错误,其糟糕的隐式默认值肯定不能令人满意。
有趣的是,GPArotation::Varimax 需要与 eps=1e-15
进行多次旋转,请参阅最后的输出;因此,要么实施了另一个内部程序,要么在决定何时可以停止迭代时对 eps 参数进行不同的评估。一个示例(见本答案末尾处)可能暗示了这样一种效果。
低于比较协议;仅显示载荷的前两行。
The first three computations with low accuracy (eps=1e-5)
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings *1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
The second three computations with (attempted) high accuracy (eps=1e-15)
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000
尝试 little/default 具有大 eps
或没有
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871717.3 216618.7 198176.3 175204.47
Vocabulary 855663.1 294146.3 152930.7 180517.21
# =============================================================================
现在尝试用更小的eps
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings *1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.69 198174.31 175207.86
Vocabulary 855662.58 294147.47 152928.77 180519.37
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.7 198174.31 175207.86
Vocabulary 855662.6 294147.5 152928.77 180519.37
Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize, :
convergence not obtained in GPForth. 1000 iterations used.
# Result by MatMate: --------------------------------------------------------
lad = cholesky(Thurstone)
pc = rot(lad,"pca")
pc4 = pc[*,1..4] // arrive at the first four pc's
t = gettrans( normzl(pc4),"varimax") // get rotation-matrix for row-normalized pc's
vmx = pc4 * t // rotate pc4 by rotation-matrix
display = vmx * 1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.68 198174.31 175207.87
Vocabulary 855662.58 294147.46 152928.77 180519.37
# ===============================================================================>
通过在 stat:: 中将 eps 设置为 1e-12 并在 GPArotation 中设置为 1e-6 可以实现 stats::varimax 和 GPArotation::Varimax 结果的更好匹配,其中一个值为对方的广场。然后我们得到
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.8 198174.49 175207.63
Vocabulary 855662.5 294147.6 152928.94 180519.21
> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.80 216619.74 198174.40 175207.85
Vocabulary 855662.55 294147.52 152928.86 180519.36