Name the samples in a PCA plot
I have a PCA plot with a large amount of data, and I want to identify which samples are outliers.
When I use
geom.ind = c("text")
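there is so much text that I cannot read anything.
Here is a minimal reproducible example. (I have already used it here, but the answer only works manually, and my real data frame is very large.)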
dataframe <- data_frame("c1"=c(78,89,0),"c2"=c(89,89,34),"c3"=c(56,0,4))
row.names(dataframe) <- c("name1","name2","name3")
sub <- PCA(dataframe)
pca <- fviz_pca_ind(sub, pointsize = "cos2",
                    pointshape = 21, fill = "#E7B800",
                    repel = TRUE, # Avoid text overlapping (slow if many points)
                    geom = c("text", "point"),
                    xlab = "PC1", ylab = "PC2", label = row.names(dataframe)
)
interactive <- ggplotly(pca,dynamicTicks = T,tooltip = c("x","y",label = list))
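As you can see, I tried to do this with the ggplotly() function, but it does not work.
I want to identify the sample names (name1, name2, name3) in my plot. How can I do this for a large dataset?
Thank you very much.
You can use the following code: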
library(factoextra)
library(plotly)
library(FactoMineR)

# Use base data.frame(): tibble's data_frame() is deprecated, and setting
# row names on a tibble is deprecated as well
dataframe <- data.frame(c1 = c(78, 89, 0), c2 = c(89, 89, 34), c3 = c(56, 0, 4),
                        row.names = c("name1", "name2", "name3"))
sub <- PCA(dataframe)
pca <- fviz_pca_ind(sub, pointsize = "cos2",
                    pointshape = 21, fill = "#E7B800",
                    repel = TRUE, # Avoid text overlapping (slow if many points)
                    geom = c("text", "point"),
                    xlab = "PC1", ylab = "PC2", label = "ind")

interactive <- ggplotly(pca, tooltip = c("x", "y", "colour"))
bggly <- plotly_build(interactive)
# Overwrite the hover text of the first trace with the sample name and PCA statistics
bggly$x$data[[1]]$text <-
  with(pca$data, paste0("name: ", name,
                        "<br><br>x: ", x,
                        "<br>y: ", y,
                        "<br>coord: ", coord,
                        "<br>cos2: ", cos2,
                        "<br>contrib: ", contrib))
bggly
With help from Stéphane Laurent.
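If hovering over every point is still too slow for a very large dataset, one possible complement is to rank the samples numerically and look only at the most extreme ones. The sketch below is not part of the answer above: it uses base R's prcomp() instead of FactoMineR's PCA(), the helper name top_outliers is hypothetical, the data are simulated for illustration, and "outlier" is assumed here to mean "far from the origin in the PC1/PC2 plane".

```r
# Hypothetical helper: return the names of the n samples that lie farthest
# from the origin of the PC1/PC2 plane (an assumed definition of "outlier")
top_outliers <- function(df, n = 5) {
  pca <- prcomp(df, center = TRUE, scale. = TRUE)
  scores <- pca$x[, 1:2, drop = FALSE]    # coordinates on PC1 and PC2
  dist <- sqrt(rowSums(scores^2))         # Euclidean distance from the origin
  ranked <- names(sort(dist, decreasing = TRUE))
  ranked[seq_len(min(n, length(ranked)))]
}

# Simulated data for illustration only: 50 samples, 4 variables
set.seed(1)
df <- as.data.frame(matrix(rnorm(200), nrow = 50))
rownames(df) <- paste0("name", 1:50)
df["name7", ] <- 10                       # plant one obvious outlier

keep <- top_outliers(df, n = 3)
print(keep)
```

The resulting names could then be passed to the select.ind argument of fviz_pca_ind(), for example select.ind = list(name = keep), so that only those samples are drawn and labelled. Whether distance in the first two components is a suitable outlier criterion depends on your data.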
For a large dataset in .csv format whose first column holds the row names, you can read it with df <- read.csv("Test_Data.csv", row.names = 1), provided that the row names contain no duplicates.
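If the first column might contain duplicates, read.csv(..., row.names = 1) stops with an error. A minimal sketch of one way around that, assuming it is acceptable to disambiguate repeated names with a numeric suffix: the temporary file and its contents below are made up for illustration and stand in for Test_Data.csv.

```r
# Write a tiny illustrative CSV to a temporary file (stand-in for Test_Data.csv);
# "name2" is deliberately repeated
path <- tempfile(fileext = ".csv")
writeLines(c("id,c1,c2,c3",
             "name1,78,89,56",
             "name2,89,89,0",
             "name2,0,34,4"), path)

# Read without row names first, so duplicated identifiers do not raise an error
raw <- read.csv(path, stringsAsFactors = FALSE)
ids <- raw[[1]]

# Make repeated identifiers unique by appending a suffix
if (anyDuplicated(ids) > 0) {
  ids <- make.unique(ids)
}

# Drop the identifier column and use the cleaned identifiers as row names
df <- raw[, -1, drop = FALSE]
rownames(df) <- ids
print(df)
```

The resulting df has unique row names and only numeric columns, so it can be used in place of dataframe in the code above. Renaming is only one option; depending on the data, duplicated rows may instead need to be merged or removed.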