How to overlay different information in a PCA plot?
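I am trying to draw a modified PCA plot that represents two kinds of categories.
For example, I want a PCA plot with 1) ellipses based on Kingdom and 2) sample points coloured by the Class variable.
Here is the example dataset.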
require(ggplot2)
require(ggfortify)
set.seed(1)
df <- structure(list(Sample = c("cat", "dog", "rabbit", "chicken", "duck", "butterfly", "ladybug", "rose", "lily", "iris", "maple tree", "pinetree", "ginkgo"),
Class = c("mammalia", "mammalia", "mammalia", "bird", "bird", "insect", "insect", "flower", "flower", "flower", "tree", "tree", "tree"),
Kingdom = c("animalia", "animalia", "animalia", "animalia", "animalia", "animalia", "animalia", "plantae", "plantae", "plantae", "plantae", "plantae", "plantae")),
class = "data.frame", row.names = c(NA, -13L))
rownames(df)<-df[,1]
df[,1]<-NULL
for(i in 3:20){
df[,i]<-sample(100, size=nrow(df), replace=TRUE)
}
df[,c(3:20)]<-log(df[,c(3:20)]+1, base=2)
df # values for V13 to V20 are omitted below for brevity
Class Kingdom V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
cat mammalia animalia 6.108524 5.781360 5.087463 5.357552 4.247928 5.614710 5.000000 5.129283 5.321928 6.303781
dog mammalia animalia 5.321928 6.228819 6.409391 5.491853 4.523562 6.491853 4.169925 3.459432 4.643856 4.523562
rabbit mammalia animalia 1.000000 3.000000 5.169925 4.700440 6.303781 4.584963 6.209453 1.000000 5.754888 6.149747
chicken bird animalia 5.129283 6.209453 6.149747 6.149747 6.044394 6.409391 6.459432 5.459432 6.539159 4.857981
duck bird animalia 6.459432 6.321928 6.228819 5.321928 6.149747 4.906891 6.392317 5.906891 6.442943 5.247928
butterfly insect animalia 5.459432 6.426265 5.426265 5.700440 6.459432 3.807355 6.507795 4.754888 5.357552 5.954196
ladybug insect animalia 3.906891 5.247928 5.285402 5.426265 6.149747 4.523562 5.614710 4.000000 6.614710 5.554589
rose flower plantae 6.375039 6.491853 4.392317 2.807355 6.247928 6.554589 6.022368 5.882643 6.392317 6.087463
lily flower plantae 5.906891 5.247928 4.857981 4.643856 6.357552 4.857981 6.569856 4.906891 6.507795 6.442943
iris flower plantae 5.700440 5.129283 4.392317 5.044394 6.658211 5.614710 6.599913 4.643856 5.169925 6.643856
maple tree tree plantae 6.614710 6.491853 5.491853 3.906891 3.807355 5.087463 5.930737 5.426265 5.459432 6.169925
pinetree tree plantae 6.426265 5.491853 6.459432 1.584963 5.357552 5.523562 5.700440 5.614710 1.000000 6.643856
ginkgo tree plantae 4.459432 6.321928 6.149747 5.523562 6.491853 4.459432 6.554589 6.266787 4.906891 5.700440
After creating the dataset, I drew the PCA plot with autoplot.
pca_df<-prcomp(df[,-c(1,2)], scale.=TRUE)
autoplot(pca_df)
autoplot(pca_df, data=df, colour='Kingdom', frame.type='norm')
On this plot, I want to overlay point colours based on Class:
pca_df<-prcomp(df[,-c(1,2)], scale.=TRUE)
autoplot(pca_df)
autoplot(pca_df, data=df, colour='Class')
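In summary, I want the ellipses to come from the Kingdom variable and the point colours to come from the Class variable.
How can I combine these different representations in one PCA plot?
Additional attempt
I also tried drawing the PCA plot with another package (factoextra), but it gives the same result.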
require (factoextra)
fviz_pca_ind(pca_df,
col.ind=df$Kingdom,
addEllipses = TRUE,
col.ind.sup=df$Class,
repel=TRUE)
You can specify frame.colour for the ellipses and colour for the points:
autoplot(pca_df, data=df, colour='Class', frame.type='norm', frame.colour = "Kingdom")
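If you want finer control over each layer, the same idea can be sketched in plain ggplot2 by plotting the PCA scores yourself. This is only a minimal sketch under stated assumptions: the data is a random stand-in shaped like the question's dataset, and the names `meta`, `vals`, and `scores` are made up for illustration. Kingdom is mapped to linetype rather than colour here so that the ellipses and the points do not compete for a single colour scale.

```r
library(ggplot2)

set.seed(1)

# Hypothetical stand-in for the question's data: 13 samples, 18 random variables.
meta <- data.frame(
  Class   = rep(c("mammalia", "bird", "insect", "flower", "tree"),
                times = c(3, 2, 2, 3, 3)),
  Kingdom = rep(c("animalia", "plantae"), times = c(7, 6))
)
vals <- matrix(log2(sample(100, 13 * 18, replace = TRUE) + 1), nrow = 13)

pca <- prcomp(vals, scale. = TRUE)

# Attach the first two principal component scores to the grouping columns.
scores <- cbind(meta, as.data.frame(pca$x[, 1:2]))

p <- ggplot(scores, aes(x = PC1, y = PC2)) +
  # One normal-theory ellipse per Kingdom, distinguished by line type.
  stat_ellipse(aes(group = Kingdom, linetype = Kingdom), type = "norm") +
  # Points coloured by Class.
  geom_point(aes(colour = Class), size = 3)

print(p)
```

With the real data, `meta` would be `df[, c("Class", "Kingdom")]` and `pca` would be the `pca_df` object from the question.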