Is there a way to add a cumulative sum to a fviz_eig plot?
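I'm trying to produce a nice plot of the principal components together with the cumulative explained variance.
The data frame I'm working with is at https://www.kaggle.com/miroslavsabo/young-people-survey?select=responses.csv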
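df.responses <- read.csv("Data/responses.csv")
pref <- colnames(df.responses[1:63]) # columns for Music, Movies and Hobbies preferences
# Replace missing values in each preference column with that column's median
for (i in seq_along(pref)) {
  df.responses[is.na(df.responses[, i]), i] <- median(df.responses[, i], na.rm = TRUE)
}
df.movies <- data.frame(df.responses[20:31])
Above I just load the data frame, replace the NAs in the columns I'm interested in with the column median, and select the subset I want to run PCA on.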
library(ggplot2)
library(factoextra)
pca.movies <- prcomp(df.movies, scale. = TRUE)
pca.movies$rotation <- -pca.movies$rotation
pca.movies$x <- -pca.movies$x
fviz_pca_var(pca.movies,
col.var = "contrib",
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE
)
pv.movies <- pca.movies$sdev^2
pvp.movies <- pv.movies/sum(pv.movies)
pvp.movies
fviz_eig(pca.movies,
addlabels = TRUE,
barcolor = "#E7B800",
barfill = "#E7B800",
linecolor = "#00AFBB",
choice = "variance",
ylim=c(0,25))
plot(cumsum(pvp.movies), xlab = "Principal component", ylab = "Cumulative proportion of variance explained", ylim = c(0, 1), type = "b")
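As a cross-check on proportions computed by hand from `sdev^2`, the same numbers can be read from `summary()` of a `prcomp` object. The sketch below uses the built-in `USArrests` data purely as a stand-in (an assumption for illustration, not the survey data), and the names `pca.demo`, `prop.demo` and `imp.demo` are hypothetical.

```r
# Stand-in data set; any numeric data frame would do.
pca.demo <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance per component, computed by hand.
prop.demo <- pca.demo$sdev^2 / sum(pca.demo$sdev^2)

# summary() reports the same quantities, rounded to five decimals.
imp.demo <- summary(pca.demo)$importance

# Both comparisons use a loose tolerance because of that rounding.
all.equal(unname(imp.demo["Proportion of Variance", ]), prop.demo, tolerance = 1e-4)
all.equal(unname(imp.demo["Cumulative Proportion", ]), cumsum(prop.demo), tolerance = 1e-4)
```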
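With the code above I get two decent PCA plots, and I would like to add the cumulative-sum line (the one shown in the third, ugly plot) to the second plot.
Is there a way to add such a line to the fviz_eig plot?
I know this PCA isn't very effective; I'm just challenging myself with some data visualization.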
fviz_eig returns a ggplot object, so the two plots can be combined as follows:
p <- fviz_eig(pca.movies,
              addlabels = TRUE,
              barcolor = "#E7B800",
              barfill = "#E7B800",
              linecolor = "#00AFBB",
              choice = "variance",
              ylim = c(0, 25))
# Cumulative percentage, divided by 4 so that 0-100 fits the 0-25 primary axis
df <- data.frame(x = seq_along(pvp.movies),
                 y = cumsum(pvp.movies) * 100 / 4)
p <- p +
  geom_point(data = df, aes(x, y), size = 2, color = "#00AFBB") +
  geom_line(data = df, aes(x, y), color = "#00AFBB") +
  # The secondary axis multiplies by 4 to undo the rescaling
  scale_y_continuous(sec.axis = sec_axis(~ . * 4,
                                         name = "Cumulative percentage of variance explained"))
print(p)
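The factor 4 above is tied to the 0-25 limit of the primary axis. A minimal sketch of the same rescaling idea with the factor computed from the data is given below. It assumes the built-in `USArrests` data as a stand-in for the survey data and draws the bars with plain ggplot2 instead of fviz_eig; the names `pca`, `scree` and `k` are hypothetical.

```r
library(ggplot2)

# Built-in data set used purely for illustration (not the survey data).
pca <- prcomp(USArrests, scale. = TRUE)

# Percentage of variance per component and its running total.
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
scree <- data.frame(pc = seq_along(pct), pct = pct, cum = cumsum(pct))

# Rescaling factor: the cumulative curve tops out at 100, the bars at max(pct).
k <- 100 / max(scree$pct)

p <- ggplot(scree, aes(x = pc)) +
  geom_col(aes(y = pct), fill = "#E7B800") +
  # Cumulative values are divided by k so they share the primary axis
  geom_line(aes(y = cum / k), color = "#00AFBB") +
  geom_point(aes(y = cum / k), color = "#00AFBB", size = 2) +
  scale_x_continuous(breaks = scree$pc) +
  # The secondary axis multiplies by k to show the cumulative percentage
  scale_y_continuous(
    name = "Percentage of explained variance",
    sec.axis = sec_axis(~ . * k, name = "Cumulative percentage of explained variance")
  ) +
  labs(x = "Principal component")

print(p)
```

Because `k` is derived from the tallest bar, the top of the cumulative line coincides with the top of that bar, and the secondary axis reads 100 at that height.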