Split density plot into 4 groups and add the groups to the data table
I am creating a density plot in R with ggplot(), in which I mark the median and the 5% and 95% quantiles with vertical lines (geom_vline()). This is how I build the plot:
library(ggplot2)
library(plotly)  # provides ggplotly() and re-exports %>%

# 5% and 95% quantiles and the median of Qeff
probs <- c(0.05, 0.95)
quantiles <- quantile(dt.all2018$Qeff, probs = probs)
q5 <- as.numeric(quantiles[1])
q95 <- as.numeric(quantiles[2])
median <- median(dt.all2018$Qeff)

# Scaled density curve with one dashed vertical line per statistic;
# after_stat(scaled) replaces the deprecated ..scaled.. notation
p <- (ggplot(dt.all2018) +
  geom_density(aes(x = Qeff, y = after_stat(scaled)), colour = "#007d3c") +
  ggtitle("Qeff 2018") +
  geom_vline(aes(xintercept = median, color = "median"), linetype = "dashed") +
  geom_vline(aes(xintercept = q5, color = "5%"), linetype = "dashed") +
  geom_vline(aes(xintercept = q95, color = "95%"), linetype = "dashed") +
  scale_color_manual(name = "statistics", values = c('5%' = "#0000FF", '95%' = "red", median = "#007d3c")) +
  theme(panel.background = element_blank(), axis.line = element_line(colour = "black"),
        plot.title = element_text(lineheight = .8, hjust = 0.5, face = "bold"),
        legend.box.background = element_rect(colour = "black"), legend.box.margin = margin(t = 1, l = 1),
        legend.title = element_blank())) %>%
  ggplotly()
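My plot then looks like this (without the parts I drew in by hand):
Now I want to create a new column group that holds the group number for each row, i.e. the group that the corresponding Qeff value falls into. Group 1 is everything up to the 5% quantile, group 2 is everything between the 5% quantile and the median, group 3 is everything between the median and the 95% quantile, and group 4 is everything above the 95% quantile. The group column should contain only the numbers 1 to 4.
How can I do this?
Here is a small snippet of my data table: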
structure(list(EK = c(311746.83, 0, 408503.01, 965723.51, 447176.86,
0, 0, 237703401.51, 11650300.16, 761470.17, 15514898.49, 791067269.75,
35591131, 10754272.33, 9496742.11, 512370.9, 1134032.95, 35318984.4,
5630139.9, 1111511.07), EH = c(345245.44, 0, 439620.18, 894773.08,
485161.85, 0, 0, 331524231.52, 19502922.3, 1007182.97, 13714848.49,
470803897.97, 36394200.3, 11485817.1, 9542583.17, 532302.49,
1071746.46, 20666845.08, 5333889.99, 938096.94), Peff = c(104.78,
0, 91.52, 112.18, 113.39, 0, 0, 86.18, 101.04, 104.39, 106.23,
86.4, 96.19, 86.38, 113.5, 115.88, 104.61, 96.31, 95.6, 101.71
), Qeff = c(-0.01, 0, 0, 0, 0, 0, 0, 0, -0.01, -0.01, 0, 0, 0,
0, 0.01, 0, 0, 0, 0, 0)), class = c("data.table", "data.frame"
), row.names = c(NA, -20L))
You can do this with the cut() function:
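library(dplyr)  # provides mutate() and %>%
# Bin Qeff at the 5% quantile, median and 95% quantile computed in the question
dt.all2018 <- dt.all2018 %>%
  mutate(group = cut(Qeff,
                     breaks = c(-Inf, q5, median, q95, Inf),
                     labels = c(1, 2, 3, 4)))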
The second approach needs more testing. Sorry for the confusion.
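Two details of cut() may matter here: with labels supplied it returns a factor rather than numbers, and it stops with an error when two break points coincide, which can happen when Qeff has many tied values. The following is a minimal base-R sketch of the same binning under those considerations, not a tested replacement for the answer above. It uses only the Qeff values from the sample data, and the names `q`, `med`, `breaks` and `group` are illustrative choices, not part of the original code.

```r
# Qeff values copied from the sample data in the question
Qeff <- c(-0.01, 0, 0, 0, 0, 0, 0, 0, -0.01, -0.01, 0, 0, 0,
          0, 0.01, 0, 0, 0, 0, 0)

# 5% and 95% quantiles as a plain numeric vector, plus the median
q   <- quantile(Qeff, probs = c(0.05, 0.95), names = FALSE)
med <- median(Qeff)  # `med` avoids masking the median() function

breaks <- c(-Inf, q[1], med, q[2], Inf)

# cut() refuses duplicated break points, so check before binning
stopifnot(!anyDuplicated(breaks))

# labels = FALSE returns integer bin codes instead of a factor;
# intervals are right-closed by default, so a value equal to q[1]
# is assigned to group 1
group <- cut(Qeff, breaks = breaks, labels = FALSE)

# Number of rows per group
print(table(group))
```

If the factor produced by the answer's version is kept instead, as.integer(as.character(group)) converts it to the numbers 1 to 4.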