Problem with finding covariance matrix for Iris data in R

When I try to compute the covariance matrix of the Iris data in R, I keep getting NA.

library(ggplot2)
library(dplyr)

dim(iris)
head(iris)

numIris <- iris %>% 
  select_if(is.numeric)

plot(numIris[1:100,])

Xraw <- numIris[1:1000,]

plot(iris[1:150,-c(5)]) #species name is the 5th column; excluding it here.
Xraw = iris[1:1000,-c(5)] # this excludes the 5th column, which is the species column
#first, to get covariance, we need to subtract the mean from each column

X = scale(Xraw, scale = FALSE)

head(X)

Xs <- scale(Xraw, scale = TRUE)
head(Xs)

covMat  = (t(X)%*%X)/ (nrow(X)-1)
head(covMat)

Is there any reason you can't just use cov(numIris)?

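By trying to select 1000 rows from a matrix/data frame that has only 150 rows, you end up with 850 rows full of NA values (try tail(Xraw) to see them). Those NA rows propagate through the matrix product, so every entry of covMat comes out as NA. If you set Xraw <- iris[, -5] and continue from there, all.equal(covMat, cov(iris[, -5])) returns TRUE.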
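
For reference, here is a minimal sketch of the corrected computation, assuming the built-in iris data set (150 rows, Species in column 5). The names Xbad, same and n_na_rows are introduced here purely for illustration and do not appear in the question.

```r
# Keep all 150 existing rows and drop the Species column.
Xraw <- iris[, -5]

# Center each column; scale = FALSE leaves the variances untouched.
X <- scale(Xraw, center = TRUE, scale = FALSE)

# Sample covariance matrix: X'X / (n - 1)
covMat <- (t(X) %*% X) / (nrow(X) - 1)

# Compare with the built-in cov(); all.equal() allows floating-point tolerance.
same <- isTRUE(all.equal(covMat, cov(Xraw)))
print(same)

# For contrast: indexing past the last row pads the result with NA rows.
Xbad <- iris[1:1000, -5]                  # illustrative name, reproduces the bug
n_na_rows <- sum(!complete.cases(Xbad))   # number of rows containing NA
print(n_na_rows)
```

Counting the incomplete rows with complete.cases() is a quick way to confirm where the NA values come from before they reach the matrix product.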