Split density plot into 4 groups and add the groups to the data table

Question

I am creating a density plot in R with ggplot(), in which I mark the median and the 5% and 95% quantiles with vertical lines (geom_vline()). This is how my plot is built:

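library(ggplot2)
library(plotly)  # provides ggplotly()

# 5% and 95% quantiles and the median of Qeff
probs <- c(0.05, 0.95)
quantiles <- quantile(dt.all2018$Qeff, probs = probs)
q5 <- as.numeric(quantiles[1])
q95 <- as.numeric(quantiles[2])
median <- median(dt.all2018$Qeff)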


p <- (ggplot(dt.all2018) + 
      geom_density(aes(x = Qeff, y = after_stat(scaled)), colour = "#007d3c") +
      ggtitle("Qeff 2018") +
      geom_vline(aes(xintercept = median, color = "median"), linetype = "dashed") +
      geom_vline(aes(xintercept = q5, color = "5%"), linetype = "dashed") +
      geom_vline(aes(xintercept = q95, color = "95%"), linetype = "dashed") +
      scale_color_manual(name = "statistics", values = c('5%' = "#0000FF", '95%' = "red", median = "#007d3c")) +
      theme(panel.background = element_blank(), axis.line = element_line(colour = "black"),
            plot.title = element_text(lineheight = .8, hjust = 0.5, face = "bold"),
            legend.box.background = element_rect(colour = "black"), legend.box.margin = margin(t = 1, l = 1),
            legend.title = element_blank()))%>%
      ggplotly()

My plot then looks like this (without the parts I drew in by hand):

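Now I want to create a new column, group, that holds the group number for each row, i.e. the group that the corresponding Qeff value falls into. Group 1 is everything below the 5% quantile, group 2 is everything between the 5% quantile and the median, group 3 is everything between the median and the 95% quantile, and group 4 is everything above the 95% quantile. The group column should contain only the numbers 1 to 4.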

How can I do this?

Here is a small excerpt of my data table:

structure(list(EK = c(311746.83, 0, 408503.01, 965723.51, 447176.86, 
0, 0, 237703401.51, 11650300.16, 761470.17, 15514898.49, 791067269.75, 
35591131, 10754272.33, 9496742.11, 512370.9, 1134032.95, 35318984.4, 
5630139.9, 1111511.07), EH = c(345245.44, 0, 439620.18, 894773.08, 
485161.85, 0, 0, 331524231.52, 19502922.3, 1007182.97, 13714848.49, 
470803897.97, 36394200.3, 11485817.1, 9542583.17, 532302.49, 
1071746.46, 20666845.08, 5333889.99, 938096.94), Peff = c(104.78, 
0, 91.52, 112.18, 113.39, 0, 0, 86.18, 101.04, 104.39, 106.23, 
86.4, 96.19, 86.38, 113.5, 115.88, 104.61, 96.31, 95.6, 101.71
), Qeff = c(-0.01, 0, 0, 0, 0, 0, 0, 0, -0.01, -0.01, 0, 0, 0, 
0, 0.01, 0, 0, 0, 0, 0)), class = c("data.table", "data.frame"
), row.names = c(NA, -20L))

Answer 1

You can do this with the cut() function:

    library(dplyr)

    # Right-closed bins: (-Inf, q5], (q5, median], (median, q95], (q95, Inf)
    dt.all2018 <- dt.all2018 %>%
      mutate(group = cut(Qeff,
                         breaks = c(-Inf, q5, median, q95, Inf),
                         labels = c(1, 2, 3, 4)))

The second approach needs more testing. Sorry for the confusion.
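
For readers who want to try the cut() idea without the original data, here is a minimal base R sketch. The data frame df and its Qeff values are made up for illustration and merely stand in for dt.all2018. It also uses labels = FALSE, which is an assumption about what is wanted: it makes cut() return plain integers 1 to 4, whereas labels = c(1, 2, 3, 4) gives a factor.

```r
# Toy data standing in for dt.all2018 (values are invented for illustration)
df <- data.frame(Qeff = c(-0.05, -0.02, -0.01, 0, 0, 0.01, 0.01, 0.02, 0.03, 0.08))

# Break points: 5% quantile, median (50% quantile), 95% quantile
q <- quantile(df$Qeff, probs = c(0.05, 0.5, 0.95), names = FALSE)

# cut() uses right-closed intervals by default:
# (-Inf, q5], (q5, median], (median, q95], (q95, Inf]
# labels = FALSE returns the integer interval index instead of a factor
df$group <- cut(df$Qeff, breaks = c(-Inf, q, Inf), labels = FALSE)

print(df)
```

One thing to watch for: cut() stops with an error if the break points are not unique, which can happen when the data has many tied values (for example when the 5% quantile equals the median). In that case the group boundaries would need to be defined differently.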

grouping

r

ggplot2

density-plot

ggplotly