Plotting only quantiles in a ggarrangeplot

Question

I have a plot in which I compare several (about 12) unrelated descriptors. To make it easier to display all of these plots, I build a list:

library(ggplot2)
library(ggpubr)  # provides ggarrange()

# One faceted histogram per numeric column of iris
comb <- lapply(colnames(iris[1:4]), function(x) ggplot(iris, aes(x = get(x))) + 
                 geom_histogram(position = "identity", aes(y= ..ncount.., fill = Species), bins = 10) +
                 theme_classic() + 
                 facet_grid(Species~., scales ="free_y") +
                 theme(legend.position = "none",
                       panel.spacing = unit(2, "lines"),
                       legend.title = element_blank(),
                       strip.background = element_blank(),
                       strip.text.y = element_blank(),
                       plot.margin = unit(c(10,10,10,10), "points")
                 )+
                 xlab(x) +
                 scale_x_continuous() 
)

I then pass this list to the ggarrange function

ggarrange(plotlist = comb, common.legend = TRUE, legend = "bottom", ncol = 2, nrow = 2)

to create a plot that suits my needs:

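However, some of my data has extreme outliers. I therefore need plots that show only the data up to the 90% quantile for each column in my data frame.

I would like to implement a solution similar to the one Warner proposed in this question: (), but I have not been able to get it working with my existing code. What I am looking for is a way to apply the information obtained from the following line: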

quantiles <- lapply(iris[1:4], quantile, c(0, 0.9)) # find the 0% and 90% quantiles for the numeric columns (Species is a factor)

so that only the data up to the 90th percentile is displayed by the lapply call above.

Answer 1

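I think you want to remove the data above the 90th percentile and plot what remains. Here is some code that does this. I moved the plotting code into a separate function to make it easier to debug, and made the quantile value a parameter so that it is easy to change. I also used aes_string in the ggplot call so that get is no longer needed.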

library(dplyr)    # provides %>% and filter()
library(ggplot2)
library(ggpubr)   # provides ggarrange()

myplot <- function(x, q) {
    # Cutoff for the column named by x at the requested quantile
    cutoff <- quantile(iris[[x]], q)
    # Keep only the rows at or below the cutoff, i.e. drop the data above it
    filtered_data <- iris %>% dplyr::filter(.data[[x]] <= cutoff)
    ggplot(filtered_data, aes_string(x = x)) +
        geom_histogram(position = "identity", aes(y= ..ncount.., fill = Species), bins = 10) +
        theme_classic() + 
        facet_grid(Species~., scales ="free_y") +
        theme(legend.position = "none",
                    panel.spacing = unit(2, "lines"),
                    legend.title = element_blank(),
                    strip.background = element_blank(),
                    strip.text.y = element_blank(),
                    plot.margin = unit(c(10,10,10,10), "points")
        ) +
        xlab(x) +
        scale_x_continuous() 
}
# Build one plot per numeric column, then arrange them in a 2 x 2 grid
comb <- lapply(colnames(iris[1:4]), function(x) myplot(x, 0.9))
ggarrange(plotlist = comb, common.legend = TRUE, legend = "bottom", ncol = 2, nrow = 2)
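
To check the filtering step on its own, without any plotting, here is a minimal sketch in base R. The helper name keep_up_to_quantile is hypothetical and is introduced only for illustration. It assumes that values equal to the cutoff should be kept. It also shows one way to precompute a cutoff per column, similar in spirit to the quantiles list in the question, restricted to the numeric columns.

```r
# Hypothetical helper (name chosen for illustration): keep the rows whose
# value in column `col` is at or below the q-th quantile of that column.
keep_up_to_quantile <- function(df, col, q = 0.9) {
  cutoff <- quantile(df[[col]], probs = q, names = FALSE)
  df[df[[col]] <= cutoff, , drop = FALSE]
}

# Numeric columns of iris; Species is a factor and is left out.
num_cols <- colnames(iris)[1:4]

# One 90% cutoff per numeric column, as a named numeric vector.
cutoffs <- vapply(iris[num_cols], quantile, numeric(1),
                  probs = 0.9, names = FALSE)

# One filtered data frame per column; each could be handed to a plotting function.
filtered <- lapply(num_cols, function(col) keep_up_to_quantile(iris, col, 0.9))
names(filtered) <- num_cols

# Number of rows that survive the filter for each column.
kept <- vapply(filtered, nrow, integer(1))
print(kept)
```

Because every column is filtered independently, each plot can drop a different set of rows. That seems to match the goal of trimming the outliers of each descriptor separately, but it also means the panels no longer show exactly the same observations.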

r

histogram

percentile

ggplot2