如何根据数据列值向 x 轴添加附加值?
How to add additional values to x-axis based on data column values?
我正在使用循环绘制多个基因的散点图。为每个基因生成多个 png 文件。每个基因/png 文件包含两个散点图:左侧的 Group1 和右侧的 Group2。每组都包含健康和不健康的样本。到这里我已经成功导出代码了。
但是,我现在需要做的是在每个 'time point' 的 x 轴上为每个健康和不健康的组添加样本数。这是基于 'samples' 列。对于每个时间点,这应该显示为“(健康状况下的样本数,不健康状况下的样本数)”。谁能帮我实现这个目标?
我当前的 2 个基因示例数据框 'data' 如下:
Biomarkers TimePoint Group Scale Readings Condition samples
Gene1 52.5 Group1 25 0.027 Healthy 33
Gene1 52.5 Group2 25 0.024 Healthy 35
Gene1 57.5 Group1 25 0.029 Healthy 39
Gene1 57.5 Group2 25 0.023 Healthy 46
Gene1 62.5 Group1 25 0.030 Healthy 38
Gene1 62.5 Group2 25 0.024 Healthy 42
Gene1 67.5 Group1 25 0.033 Healthy 23
Gene1 67.5 Group2 25 0.026 Healthy 16
Gene2 52.5 Group1 25 0.051 Healthy 33
Gene2 52.5 Group2 25 0.046 Healthy 35
Gene2 57.5 Group1 25 0.052 Healthy 39
Gene2 57.5 Group2 25 0.048 Healthy 46
Gene2 62.5 Group1 25 0.049 Healthy 38
Gene2 62.5 Group2 25 0.051 Healthy 42
Gene2 67.5 Group1 25 0.051 Healthy 23
Gene2 67.5 Group2 25 0.052 Healthy 16
Gene1 52.5 Group1 25.01 0.026 Unhealthy 41
Gene1 52.5 Group2 25.01 0.023 Unhealthy 57
Gene1 57.5 Group1 25.01 0.027 Unhealthy 79
Gene1 57.5 Group2 25.01 0.024 Unhealthy 70
Gene1 62.5 Group1 25.01 0.030 Unhealthy 93
Gene1 62.5 Group2 25.01 0.025 Unhealthy 84
Gene1 67.5 Group1 25.01 0.033 Unhealthy 98
Gene1 67.5 Group2 25.01 0.022 Unhealthy 64
Gene2 52.5 Group1 25.01 0.043 Unhealthy 36
Gene2 52.5 Group2 25.01 0.044 Unhealthy 57
Gene2 57.5 Group1 25.01 0.043 Unhealthy 79
Gene2 57.5 Group2 25.01 0.043 Unhealthy 70
Gene2 62.5 Group1 25.01 0.043 Unhealthy 93
Gene2 62.5 Group2 25.01 0.044 Unhealthy 84
Gene2 67.5 Group1 25.01 0.044 Unhealthy 98
Gene2 67.5 Group2 25.01 0.044 Unhealthy 64
Gene1 52.5 Group1 50 0.035 Healthy 33
Gene1 52.5 Group2 50 0.029 Healthy 35
Gene1 57.5 Group1 50 0.039 Healthy 39
Gene1 57.5 Group2 50 0.031 Healthy 46
Gene1 62.5 Group1 50 0.038 Healthy 38
Gene1 62.5 Group2 50 0.030 Healthy 42
Gene1 67.5 Group1 50 0.040 Healthy 23
Gene1 67.5 Group2 50 0.035 Healthy 16
Gene2 52.5 Group1 50 0.058 Healthy 33
Gene2 52.5 Group2 50 0.053 Healthy 35
Gene2 57.5 Group1 50 0.059 Healthy 39
Gene2 57.5 Group2 50 0.056 Healthy 46
Gene2 62.5 Group1 50 0.057 Healthy 38
Gene2 62.5 Group2 50 0.058 Healthy 42
Gene2 67.5 Group1 50 0.061 Healthy 23
Gene2 67.5 Group2 50 0.058 Healthy 16
Gene1 52.5 Group1 50.01 0.038 Unhealthy 41
Gene1 52.5 Group2 50.01 0.030 Unhealthy 57
Gene1 57.5 Group1 50.01 0.038 Unhealthy 79
Gene1 57.5 Group2 50.01 0.031 Unhealthy 70
Gene1 62.5 Group1 50.01 0.040 Unhealthy 93
Gene1 62.5 Group2 50.01 0.032 Unhealthy 84
Gene1 67.5 Group1 50.01 0.043 Unhealthy 98
Gene1 67.5 Group2 50.01 0.033 Unhealthy 64
Gene2 52.5 Group1 50.01 0.052 Unhealthy 36
Gene2 52.5 Group2 50.01 0.051 Unhealthy 57
Gene2 57.5 Group1 50.01 0.052 Unhealthy 79
Gene2 57.5 Group2 50.01 0.051 Unhealthy 70
Gene2 62.5 Group1 50.01 0.052 Unhealthy 93
Gene2 62.5 Group2 50.01 0.052 Unhealthy 84
Gene2 67.5 Group1 50.01 0.053 Unhealthy 98
Gene2 67.5 Group2 50.01 0.051 Unhealthy 64
Gene1 52.5 Group1 75 0.045 Healthy 33
Gene1 52.5 Group2 75 0.038 Healthy 35
Gene1 57.5 Group1 75 0.048 Healthy 39
Gene1 57.5 Group2 75 0.041 Healthy 46
Gene1 62.5 Group1 75 0.047 Healthy 38
Gene1 62.5 Group2 75 0.040 Healthy 42
Gene1 67.5 Group1 75 0.050 Healthy 23
Gene1 67.5 Group2 75 0.043 Healthy 16
Gene2 52.5 Group1 75 0.066 Healthy 33
Gene2 52.5 Group2 75 0.064 Healthy 35
Gene2 57.5 Group1 75 0.065 Healthy 39
Gene2 57.5 Group2 75 0.064 Healthy 46
Gene2 62.5 Group1 75 0.068 Healthy 38
Gene2 62.5 Group2 75 0.071 Healthy 42
Gene2 67.5 Group1 75 0.070 Healthy 23
Gene2 67.5 Group2 75 0.071 Healthy 16
Gene1 52.5 Group1 75.01 0.057 Unhealthy 41
Gene1 52.5 Group2 75.01 0.041 Unhealthy 57
Gene1 57.5 Group1 75.01 0.056 Unhealthy 79
Gene1 57.5 Group2 75.01 0.040 Unhealthy 70
Gene1 62.5 Group1 75.01 0.057 Unhealthy 93
Gene1 62.5 Group2 75.01 0.043 Unhealthy 84
Gene1 67.5 Group1 75.01 0.059 Unhealthy 98
Gene1 67.5 Group2 75.01 0.046 Unhealthy 64
Gene2 52.5 Group1 75.01 0.063 Unhealthy 36
Gene2 52.5 Group2 75.01 0.060 Unhealthy 57
Gene2 57.5 Group1 75.01 0.061 Unhealthy 79
Gene2 57.5 Group2 75.01 0.062 Unhealthy 70
Gene2 62.5 Group1 75.01 0.062 Unhealthy 93
Gene2 62.5 Group2 75.01 0.062 Unhealthy 84
Gene2 67.5 Group1 75.01 0.061 Unhealthy 98
Gene2 67.5 Group2 75.01 0.060 Unhealthy 64
Gene1 52.5 Group1 100 0.056 Healthy 33
Gene1 52.5 Group2 100 0.046 Healthy 35
Gene1 57.5 Group1 100 0.063 Healthy 39
Gene1 57.5 Group2 100 0.048 Healthy 46
Gene1 62.5 Group1 100 0.060 Healthy 38
Gene1 62.5 Group2 100 0.052 Healthy 42
Gene1 67.5 Group1 100 0.064 Healthy 23
Gene1 67.5 Group2 100 0.055 Healthy 16
Gene2 52.5 Group1 100 0.082 Healthy 33
Gene2 52.5 Group2 100 0.074 Healthy 35
Gene2 57.5 Group1 100 0.070 Healthy 39
Gene2 57.5 Group2 100 0.075 Healthy 46
Gene2 62.5 Group1 100 0.074 Healthy 38
Gene2 62.5 Group2 100 0.078 Healthy 42
Gene2 67.5 Group1 100 0.080 Healthy 23
Gene2 67.5 Group2 100 0.075 Healthy 16
Gene1 52.5 Group1 100.01 0.090 Unhealthy 41
Gene1 52.5 Group2 100.01 0.060 Unhealthy 57
Gene1 57.5 Group1 100.01 0.093 Unhealthy 79
Gene1 57.5 Group2 100.01 0.053 Unhealthy 70
Gene1 62.5 Group1 100.01 0.089 Unhealthy 93
Gene1 62.5 Group2 100.01 0.057 Unhealthy 84
Gene1 67.5 Group1 100.01 0.089 Unhealthy 98
Gene1 67.5 Group2 100.01 0.065 Unhealthy 64
Gene2 52.5 Group1 100.01 0.074 Unhealthy 36
Gene2 52.5 Group2 100.01 0.074 Unhealthy 57
Gene2 57.5 Group1 100.01 0.077 Unhealthy 79
Gene2 57.5 Group2 100.01 0.078 Unhealthy 70
Gene2 62.5 Group1 100.01 0.073 Unhealthy 93
Gene2 62.5 Group2 100.01 0.073 Unhealthy 84
Gene2 67.5 Group1 100.01 0.072 Unhealthy 98
Gene2 67.5 Group2 100.01 0.074 Unhealthy 64
我的数据输入是:
dput(data)
structure(list(Biomarkers = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Gene1",
"Gene2"), class = "factor"), TimePoint = c(52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5,
67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5,
57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5,
62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5,
52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5), Group = structure(c(1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Group1",
"Group2"), class = "factor"), Scale = c(25, 25, 25, 25, 25, 25,
25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 50, 50, 50, 50, 50, 50, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01), Readings = c(0.027, 0.024, 0.029, 0.023, 0.03,
0.024, 0.033, 0.026, 0.051, 0.046, 0.052, 0.048, 0.049, 0.051,
0.051, 0.052, 0.026, 0.023, 0.027, 0.024, 0.03, 0.025, 0.033,
0.022, 0.043, 0.044, 0.043, 0.043, 0.043, 0.044, 0.044, 0.044,
0.035, 0.029, 0.039, 0.031, 0.038, 0.03, 0.04, 0.035, 0.058,
0.053, 0.059, 0.056, 0.057, 0.058, 0.061, 0.058, 0.038, 0.03,
0.038, 0.031, 0.04, 0.032, 0.043, 0.033, 0.052, 0.051, 0.052,
0.051, 0.052, 0.052, 0.053, 0.051, 0.045, 0.038, 0.048, 0.041,
0.047, 0.04, 0.05, 0.043, 0.066, 0.064, 0.065, 0.064, 0.068,
0.071, 0.07, 0.071, 0.057, 0.041, 0.056, 0.04, 0.057, 0.043,
0.059, 0.046, 0.063, 0.06, 0.061, 0.062, 0.062, 0.062, 0.061,
0.06, 0.056, 0.046, 0.063, 0.048, 0.06, 0.052, 0.064, 0.055,
0.082, 0.074, 0.07, 0.075, 0.074, 0.078, 0.08, 0.075, 0.09, 0.06,
0.093, 0.053, 0.089, 0.057, 0.089, 0.065, 0.074, 0.074, 0.077,
0.078, 0.073, 0.073, 0.072, 0.074), Condition = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Healthy",
"Unhealthy"), class = "factor"), samples = c(33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L,
57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L,
98L, 64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L,
46L, 38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L,
36L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 33L, 35L, 39L, 46L, 38L,
42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L, 57L,
79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L, 98L,
64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L,
57L, 79L, 70L, 93L, 84L, 98L, 64L)), class = "data.frame", row.names = c(NA,
-128L))
我现在的代码是这样的:
# Load libraries
library(ggplot2)
library(magrittr)
library(dplyr)
library(gridExtra)
library(grid)
proc_plot <- function(sub) {
data_Group1 <- sub[sub$Group == "Group1", ]
data_Group2 <- sub[sub$Group == "Group2", ]
min_rdg <- min(data_Group1$Readings, data_Group2$Readings)
max_rdg <- max(data_Group1$Readings, data_Group2$Readings)
# Group1
graph_Group1 <- ggplot(data_Group1, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group1", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
# Group2
graph_Group2 <- ggplot(data_Group2, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group2", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale), size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(graph_Group1, graph_Group2, nrow = 1,
top=textGrob(sub$Biomarkers[[1]], gp=gpar(fontsize=20)))
dev.off()
return(output)
}
# BUILD PLOT LIST AND PNG FILES
plot_list <- by(data, data$Biomarkers, proc_plot)
dev.off()
grid.draw(plot_list$Gene1)
dev.off()
grid.draw(plot_list$Gene2)
我还在下面附上了 Gene1 的示例 png 文件。我已经手动添加了红色数字以突出显示并表明它正是我需要为每个 gene/png 文件(但为黑色)所需要的。
感谢任何帮助。谢谢。
您可以使用 \n
在标签中换行。例如,
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1\n(33, 41)", "2\n(39, 79)", "3\n(38, 93)", "4\n(23, 98)"))
您可以像这样以编程方式执行此操作:
lab_df = data_Group1 %>% group_by(TimePoint) %>%
summarize(label = sprintf("(%s, %s)", first(samples[Condition == "Healthy"]), first(samples[Condition == "Unhealthy"])))
lab_df
# # A tibble: 4 x 2
# TimePoint label
# <dbl> <chr>
# 1 52.5 (33, 41)
# 2 57.5 (39, 79)
# 3 62.5 (38, 93)
# 4 67.5 (23, 98)
ggplot(...) + ... +
scale_x_continuous(
breaks = lab_df$TimePoint,
labels = paste(1:nrow(lab_df), lab_df$label, sep = "\n")
)
完整的服务解决方案。简化为使用 for
循环而不是单独处理组,标签以编程方式处理。
proc_plot <- function(sub) {
lab_df = sub %>% group_by(TimePoint, Group) %>%
summarize(label = sprintf(
"(%s, %s)",
first(samples[Condition == "Healthy"]),
first(samples[Condition == "Unhealthy"])
)) %>%
arrange(Group, TimePoint) # make sure things are in order
min_rdg <- min(sub$Readings)
max_rdg <- max(sub$Readings)
graphs = list()
for (i in unique(sub$Group)) {
this_lab = lab_df[lab_df$Group == i, ]
graphs[[i]] = ggplot(sub[sub$Group == i, ], aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title = i, x = "Time point", y = "Readings") +
scale_x_continuous(breaks = this_lab$TimePoint,
labels = paste(1:nrow(this_lab), this_lab$label, sep = "\n")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
}
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(grobs = graphs, nrow = 1,
top = textGrob(sub$Biomarkers[[1]], gp = gpar(fontsize = 20)))
dev.off()
return(output)
}
proc_plot(sub[sub$Biomarkers == "Gene1", ])
我正在使用循环绘制多个基因的散点图。为每个基因生成多个 png 文件。每个基因/png 文件包含两个散点图:左侧的 Group1 和右侧的 Group2。每组都包含健康和不健康的样本。到这里我已经成功导出代码了。
但是,我现在需要做的是在每个 'time point' 的 x 轴上为每个健康和不健康的组添加样本数。这是基于 'samples' 列。对于每个时间点,这应该显示为“(健康状况下的样本数,不健康状况下的样本数)”。谁能帮我实现这个目标?
我当前的 2 个基因示例数据框 'data' 如下:
Biomarkers TimePoint Group Scale Readings Condition samples
Gene1 52.5 Group1 25 0.027 Healthy 33
Gene1 52.5 Group2 25 0.024 Healthy 35
Gene1 57.5 Group1 25 0.029 Healthy 39
Gene1 57.5 Group2 25 0.023 Healthy 46
Gene1 62.5 Group1 25 0.030 Healthy 38
Gene1 62.5 Group2 25 0.024 Healthy 42
Gene1 67.5 Group1 25 0.033 Healthy 23
Gene1 67.5 Group2 25 0.026 Healthy 16
Gene2 52.5 Group1 25 0.051 Healthy 33
Gene2 52.5 Group2 25 0.046 Healthy 35
Gene2 57.5 Group1 25 0.052 Healthy 39
Gene2 57.5 Group2 25 0.048 Healthy 46
Gene2 62.5 Group1 25 0.049 Healthy 38
Gene2 62.5 Group2 25 0.051 Healthy 42
Gene2 67.5 Group1 25 0.051 Healthy 23
Gene2 67.5 Group2 25 0.052 Healthy 16
Gene1 52.5 Group1 25.01 0.026 Unhealthy 41
Gene1 52.5 Group2 25.01 0.023 Unhealthy 57
Gene1 57.5 Group1 25.01 0.027 Unhealthy 79
Gene1 57.5 Group2 25.01 0.024 Unhealthy 70
Gene1 62.5 Group1 25.01 0.030 Unhealthy 93
Gene1 62.5 Group2 25.01 0.025 Unhealthy 84
Gene1 67.5 Group1 25.01 0.033 Unhealthy 98
Gene1 67.5 Group2 25.01 0.022 Unhealthy 64
Gene2 52.5 Group1 25.01 0.043 Unhealthy 36
Gene2 52.5 Group2 25.01 0.044 Unhealthy 57
Gene2 57.5 Group1 25.01 0.043 Unhealthy 79
Gene2 57.5 Group2 25.01 0.043 Unhealthy 70
Gene2 62.5 Group1 25.01 0.043 Unhealthy 93
Gene2 62.5 Group2 25.01 0.044 Unhealthy 84
Gene2 67.5 Group1 25.01 0.044 Unhealthy 98
Gene2 67.5 Group2 25.01 0.044 Unhealthy 64
Gene1 52.5 Group1 50 0.035 Healthy 33
Gene1 52.5 Group2 50 0.029 Healthy 35
Gene1 57.5 Group1 50 0.039 Healthy 39
Gene1 57.5 Group2 50 0.031 Healthy 46
Gene1 62.5 Group1 50 0.038 Healthy 38
Gene1 62.5 Group2 50 0.030 Healthy 42
Gene1 67.5 Group1 50 0.040 Healthy 23
Gene1 67.5 Group2 50 0.035 Healthy 16
Gene2 52.5 Group1 50 0.058 Healthy 33
Gene2 52.5 Group2 50 0.053 Healthy 35
Gene2 57.5 Group1 50 0.059 Healthy 39
Gene2 57.5 Group2 50 0.056 Healthy 46
Gene2 62.5 Group1 50 0.057 Healthy 38
Gene2 62.5 Group2 50 0.058 Healthy 42
Gene2 67.5 Group1 50 0.061 Healthy 23
Gene2 67.5 Group2 50 0.058 Healthy 16
Gene1 52.5 Group1 50.01 0.038 Unhealthy 41
Gene1 52.5 Group2 50.01 0.030 Unhealthy 57
Gene1 57.5 Group1 50.01 0.038 Unhealthy 79
Gene1 57.5 Group2 50.01 0.031 Unhealthy 70
Gene1 62.5 Group1 50.01 0.040 Unhealthy 93
Gene1 62.5 Group2 50.01 0.032 Unhealthy 84
Gene1 67.5 Group1 50.01 0.043 Unhealthy 98
Gene1 67.5 Group2 50.01 0.033 Unhealthy 64
Gene2 52.5 Group1 50.01 0.052 Unhealthy 36
Gene2 52.5 Group2 50.01 0.051 Unhealthy 57
Gene2 57.5 Group1 50.01 0.052 Unhealthy 79
Gene2 57.5 Group2 50.01 0.051 Unhealthy 70
Gene2 62.5 Group1 50.01 0.052 Unhealthy 93
Gene2 62.5 Group2 50.01 0.052 Unhealthy 84
Gene2 67.5 Group1 50.01 0.053 Unhealthy 98
Gene2 67.5 Group2 50.01 0.051 Unhealthy 64
Gene1 52.5 Group1 75 0.045 Healthy 33
Gene1 52.5 Group2 75 0.038 Healthy 35
Gene1 57.5 Group1 75 0.048 Healthy 39
Gene1 57.5 Group2 75 0.041 Healthy 46
Gene1 62.5 Group1 75 0.047 Healthy 38
Gene1 62.5 Group2 75 0.040 Healthy 42
Gene1 67.5 Group1 75 0.050 Healthy 23
Gene1 67.5 Group2 75 0.043 Healthy 16
Gene2 52.5 Group1 75 0.066 Healthy 33
Gene2 52.5 Group2 75 0.064 Healthy 35
Gene2 57.5 Group1 75 0.065 Healthy 39
Gene2 57.5 Group2 75 0.064 Healthy 46
Gene2 62.5 Group1 75 0.068 Healthy 38
Gene2 62.5 Group2 75 0.071 Healthy 42
Gene2 67.5 Group1 75 0.070 Healthy 23
Gene2 67.5 Group2 75 0.071 Healthy 16
Gene1 52.5 Group1 75.01 0.057 Unhealthy 41
Gene1 52.5 Group2 75.01 0.041 Unhealthy 57
Gene1 57.5 Group1 75.01 0.056 Unhealthy 79
Gene1 57.5 Group2 75.01 0.040 Unhealthy 70
Gene1 62.5 Group1 75.01 0.057 Unhealthy 93
Gene1 62.5 Group2 75.01 0.043 Unhealthy 84
Gene1 67.5 Group1 75.01 0.059 Unhealthy 98
Gene1 67.5 Group2 75.01 0.046 Unhealthy 64
Gene2 52.5 Group1 75.01 0.063 Unhealthy 36
Gene2 52.5 Group2 75.01 0.060 Unhealthy 57
Gene2 57.5 Group1 75.01 0.061 Unhealthy 79
Gene2 57.5 Group2 75.01 0.062 Unhealthy 70
Gene2 62.5 Group1 75.01 0.062 Unhealthy 93
Gene2 62.5 Group2 75.01 0.062 Unhealthy 84
Gene2 67.5 Group1 75.01 0.061 Unhealthy 98
Gene2 67.5 Group2 75.01 0.060 Unhealthy 64
Gene1 52.5 Group1 100 0.056 Healthy 33
Gene1 52.5 Group2 100 0.046 Healthy 35
Gene1 57.5 Group1 100 0.063 Healthy 39
Gene1 57.5 Group2 100 0.048 Healthy 46
Gene1 62.5 Group1 100 0.060 Healthy 38
Gene1 62.5 Group2 100 0.052 Healthy 42
Gene1 67.5 Group1 100 0.064 Healthy 23
Gene1 67.5 Group2 100 0.055 Healthy 16
Gene2 52.5 Group1 100 0.082 Healthy 33
Gene2 52.5 Group2 100 0.074 Healthy 35
Gene2 57.5 Group1 100 0.070 Healthy 39
Gene2 57.5 Group2 100 0.075 Healthy 46
Gene2 62.5 Group1 100 0.074 Healthy 38
Gene2 62.5 Group2 100 0.078 Healthy 42
Gene2 67.5 Group1 100 0.080 Healthy 23
Gene2 67.5 Group2 100 0.075 Healthy 16
Gene1 52.5 Group1 100.01 0.090 Unhealthy 41
Gene1 52.5 Group2 100.01 0.060 Unhealthy 57
Gene1 57.5 Group1 100.01 0.093 Unhealthy 79
Gene1 57.5 Group2 100.01 0.053 Unhealthy 70
Gene1 62.5 Group1 100.01 0.089 Unhealthy 93
Gene1 62.5 Group2 100.01 0.057 Unhealthy 84
Gene1 67.5 Group1 100.01 0.089 Unhealthy 98
Gene1 67.5 Group2 100.01 0.065 Unhealthy 64
Gene2 52.5 Group1 100.01 0.074 Unhealthy 36
Gene2 52.5 Group2 100.01 0.074 Unhealthy 57
Gene2 57.5 Group1 100.01 0.077 Unhealthy 79
Gene2 57.5 Group2 100.01 0.078 Unhealthy 70
Gene2 62.5 Group1 100.01 0.073 Unhealthy 93
Gene2 62.5 Group2 100.01 0.073 Unhealthy 84
Gene2 67.5 Group1 100.01 0.072 Unhealthy 98
Gene2 67.5 Group2 100.01 0.074 Unhealthy 64
我的数据输入是:
dput(data)
structure(list(Biomarkers = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Gene1",
"Gene2"), class = "factor"), TimePoint = c(52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5,
67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5,
57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5,
62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5,
52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5), Group = structure(c(1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Group1",
"Group2"), class = "factor"), Scale = c(25, 25, 25, 25, 25, 25,
25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 50, 50, 50, 50, 50, 50, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01), Readings = c(0.027, 0.024, 0.029, 0.023, 0.03,
0.024, 0.033, 0.026, 0.051, 0.046, 0.052, 0.048, 0.049, 0.051,
0.051, 0.052, 0.026, 0.023, 0.027, 0.024, 0.03, 0.025, 0.033,
0.022, 0.043, 0.044, 0.043, 0.043, 0.043, 0.044, 0.044, 0.044,
0.035, 0.029, 0.039, 0.031, 0.038, 0.03, 0.04, 0.035, 0.058,
0.053, 0.059, 0.056, 0.057, 0.058, 0.061, 0.058, 0.038, 0.03,
0.038, 0.031, 0.04, 0.032, 0.043, 0.033, 0.052, 0.051, 0.052,
0.051, 0.052, 0.052, 0.053, 0.051, 0.045, 0.038, 0.048, 0.041,
0.047, 0.04, 0.05, 0.043, 0.066, 0.064, 0.065, 0.064, 0.068,
0.071, 0.07, 0.071, 0.057, 0.041, 0.056, 0.04, 0.057, 0.043,
0.059, 0.046, 0.063, 0.06, 0.061, 0.062, 0.062, 0.062, 0.061,
0.06, 0.056, 0.046, 0.063, 0.048, 0.06, 0.052, 0.064, 0.055,
0.082, 0.074, 0.07, 0.075, 0.074, 0.078, 0.08, 0.075, 0.09, 0.06,
0.093, 0.053, 0.089, 0.057, 0.089, 0.065, 0.074, 0.074, 0.077,
0.078, 0.073, 0.073, 0.072, 0.074), Condition = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Healthy",
"Unhealthy"), class = "factor"), samples = c(33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L,
57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L,
98L, 64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L,
46L, 38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L,
36L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 33L, 35L, 39L, 46L, 38L,
42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L, 57L,
79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L, 98L,
64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L,
57L, 79L, 70L, 93L, 84L, 98L, 64L)), class = "data.frame", row.names = c(NA,
-128L))
我现在的代码是这样的:
# Load libraries
library(ggplot2)
library(magrittr)
library(dplyr)
library(gridExtra)
library(grid)
proc_plot <- function(sub) {
data_Group1 <- sub[sub$Group == "Group1", ]
data_Group2 <- sub[sub$Group == "Group2", ]
min_rdg <- min(data_Group1$Readings, data_Group2$Readings)
max_rdg <- max(data_Group1$Readings, data_Group2$Readings)
# Group1
graph_Group1 <- ggplot(data_Group1, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group1", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
# Group2
graph_Group2 <- ggplot(data_Group2, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group2", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale), size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(graph_Group1, graph_Group2, nrow = 1,
top=textGrob(sub$Biomarkers[[1]], gp=gpar(fontsize=20)))
dev.off()
return(output)
}
# BUILD PLOT LIST AND PNG FILES
plot_list <- by(data, data$Biomarkers, proc_plot)
dev.off()
grid.draw(plot_list$Gene1)
dev.off()
grid.draw(plot_list$Gene2)
我还在下面附上了 Gene1 的示例 png 文件。我已经手动添加了红色数字以突出显示并表明它正是我需要为每个 gene/png 文件(但为黑色)所需要的。
感谢任何帮助。谢谢。
您可以使用 \n
在标签中换行。例如,
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1\n(33, 41)", "2\n(39, 79)", "3\n(38, 93)", "4\n(23, 98)"))
您可以像这样以编程方式执行此操作:
lab_df = data_Group1 %>% group_by(TimePoint) %>%
summarize(label = sprintf("(%s, %s)", first(samples[Condition == "Healthy"]), first(samples[Condition == "Unhealthy"])))
lab_df
# # A tibble: 4 x 2
# TimePoint label
# <dbl> <chr>
# 1 52.5 (33, 41)
# 2 57.5 (39, 79)
# 3 62.5 (38, 93)
# 4 67.5 (23, 98)
ggplot(...) + ... +
scale_x_continuous(
breaks = lab_df$TimePoint,
labels = paste(1:nrow(lab_df), lab_df$label, sep = "\n")
)
完整的服务解决方案。简化为使用 for
循环而不是单独处理组,标签以编程方式处理。
proc_plot <- function(sub) {
lab_df = sub %>% group_by(TimePoint, Group) %>%
summarize(label = sprintf(
"(%s, %s)",
first(samples[Condition == "Healthy"]),
first(samples[Condition == "Unhealthy"])
)) %>%
arrange(Group, TimePoint) # make sure things are in order
min_rdg <- min(sub$Readings)
max_rdg <- max(sub$Readings)
graphs = list()
for (i in unique(sub$Group)) {
this_lab = lab_df[lab_df$Group == i, ]
graphs[[i]] = ggplot(sub[sub$Group == i, ], aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title = i, x = "Time point", y = "Readings") +
scale_x_continuous(breaks = this_lab$TimePoint,
labels = paste(1:nrow(this_lab), this_lab$label, sep = "\n")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
}
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(grobs = graphs, nrow = 1,
top = textGrob(sub$Biomarkers[[1]], gp = gpar(fontsize = 20)))
dev.off()
return(output)
}
proc_plot(sub[sub$Biomarkers == "Gene1", ])