envfit 结果是如何创建的？

Question

我有一个关于如何从 vegan 包中的 envfit() 函数重新创建结果的问题。

这里是 envfit() 与排序和环境向量一起使用的示例。

data(varespec)
data(varechem)
ord <- metaMDS(varespec)
chem.envfit <- envfit(ord, varechem, choices = c(1,2), permutations = 999)
chem.scores.envfit <- as.data.frame(scores(chem.envfit, display = "vectors"))
chem.scores.envfit

"The values that you see in the table are the standardised coefficients from the linear regression used to project the vectors into the ordination. These are directions for arrows of unit length." - 来自 Plotted envfit vectors not matching NMDS scores

的评论

此外，来自 ?envfit：

The printed output of continuous variables (vectors) gives the direction cosines which are the coordinates of the heads of unit length vectors. In plot these are scaled by their correlation (square root of the column r2) so that weak predictors have shorter arrows than strong predictors. You can see the scaled relative lengths using command scores.

有人可以明确地告诉我什么是线性模型运行，正在使用什么标准化系数，以及在何处应用余弦来创建这些值？

Answer 1

我可能不应该在那个答案中说 "standardised"。

对于varechem中的每一列（变量）和排序的前两个轴（choices = 1:2），线性模型是：

\hat(env_j) = \beta_1 * scr1 + \beta_2 * scr2

其中 env_j 是 varechem 中的第 j$ 个变量，scr1 和 scr2 是所考虑的第一和第二轴上的轴分数（即choices = 1:2 定义的平面，但这扩展到更高的维度），\beta 是轴分数对的回归系数。

在这个模型中没有截距，因为我们（加权）将所有变量集中在 varechem 和轴分数中，权重实际上只与 CCA、capscale() 和 DCA 方法有关加权模型本身。

轴分数跨越 space 中的箭头头是该模型的系数——我们实际上进行了归一化（我在另一个回复中错误地表示为 "standardised"）以便箭头有单位长度。这些值（envfit 输出中的 NMDS1 和 NMDS2 列）在 https://en.wikipedia.org/wiki/Direction_cosine 的意义上是 方向余弦 。

这是我们在不涉及权重且 env 中的所有变量都是数字的情况下所做操作的简化演练，如您的示例所示。（请注意，出于效率原因，我们实际上并没有这样做：如果您确实需要详细信息，请参阅 vectorfit() 后面的代码以了解所使用的 QR 分解。）

## extract the axis scores for the axes we want, 1 and 2
scrs <- scores(ord, choices = c(1,2))

## centre the scores (note not standardising them)
scrs <- as.data.frame(scale(scrs, scale = FALSE, center = TRUE))

## centre the environmental variables - keep as matrix
env <- scale(varechem, scale = FALSE, center = TRUE)

## fit the linear models with no intercept
mod <- lm(env ~ NMDS1 + NMDS2 - 1, data = scrs)

## extract the coefficients from the models
betas <- coef(mod)

## normalize coefs to unit length
##   i.e. betas for a  particular env var have sum of squares = 1
t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))

最后一行给出：

> t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))
               NMDS1      NMDS2
N        -0.05731557 -0.9983561
P         0.61972792  0.7848167
K         0.76646744  0.6422832
Ca        0.68520442  0.7283508
Mg        0.63252973  0.7745361
S         0.19139498  0.9815131
Al       -0.87159427  0.4902279
Fe       -0.93600826  0.3519780
Mn        0.79870870 -0.6017179
Zn        0.61755690  0.7865262
Mo       -0.90308490  0.4294621
Baresoil  0.92487118 -0.3802806
Humdepth  0.93282052 -0.3603413
pH       -0.64797447  0.7616621

它复制了（除了显示更多符号数字）在这种情况下 envfit() 返回的值：

> chem.envfit

***VECTORS

            NMDS1    NMDS2     r2 Pr(>r)    
N        -0.05732 -0.99836 0.2536  0.045 *  
P         0.61973  0.78482 0.1938  0.099 .  
K         0.76647  0.64228 0.1809  0.095 .  
Ca        0.68520  0.72835 0.4119  0.006 ** 
Mg        0.63253  0.77454 0.4270  0.003 ** 
S         0.19139  0.98151 0.1752  0.109    
Al       -0.87159  0.49023 0.5269  0.002 ** 
Fe       -0.93601  0.35198 0.4450  0.002 ** 
Mn        0.79871 -0.60172 0.5231  0.002 ** 
Zn        0.61756  0.78653 0.1879  0.100 .  
Mo       -0.90308  0.42946 0.0609  0.545    
Baresoil  0.92487 -0.38028 0.2508  0.061 .  
Humdepth  0.93282 -0.36034 0.5201  0.001 ***
pH       -0.64797  0.76166 0.2308  0.067 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Permutation: free
Number of permutations: 999

envfit 结果是如何创建的？

How are envfit results created?

regression

r

vegan