ggplot: how to break the x-axis into months when data points are per week?

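Problem: when the x-axis is based on 200+ consecutive weeks, how can I make it more readable? I plan to break the x-axis into months. The trouble is that the first day of a week does not necessarily match the first day of a month, so there is an overlap I don't know how to handle (two consecutive months can fall in the same week).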

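I am visualizing surgical procedures before and after Covid-19.

The x-axis corresponds to consecutive weeks since Monday 2017-01-02 (yyyy-mm-dd), range 1 - 209. Each geom_point() corresponds to the number of surgical procedures in that week.

Normally, I would simply break the x-axis into smaller ranges, e.g. with a tick every 3 months. Unfortunately, because b$cons_week counts each consecutive Monday that has passed since 2017-01-02, it does not necessarily correspond to "month breaks" (the first day of a month does not necessarily coincide with the first day of a week). Therefore, I don't know how to break the x-axis.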

My data looks like this:

> head(b)
# A tibble: 6 x 3
  diagnosis  cons_week corona
  <chr>          <dbl> <fct> 
1 2017-10-19        42 Normal
2 2017-07-11        28 Normal
3 2020-06-30       183 C19   
4 2020-06-27       182 C19   
5 2017-01-04         1 Normal
6 2017-12-07        49 Normal

First, I count the number of surgeries per week:

lin.model <- b %>% 
  group_by(corona, cons_week) %>%
  summarise(n = n()) 

Which gives

# A tibble: 80 x 3
# Groups:   corona [2]
   corona cons_week     n
   <fct>      <dbl> <int>
 1 C19          173     1
 2 C19          175     1
 3 C19          181     1
 4 C19          182     2

Then

ggplot(lin.model,
       aes(x = cons_week, y = n, color = corona, fill = corona)) +
  geom_point(size = 5, shape = 21) +
  geom_smooth(se = F, method = lm, color = "black", show.legend = F) +
  geom_smooth(lty = 2, show.legend = F) + 
  scale_color_manual(name = "",
                     values = c("#8B3A62", "#6DBCC3"),
                     labels = c("COVID-19", "Normal"),
                     guide = guide_legend(reverse=TRUE)) + 
  scale_fill_manual(name = "",
                    values = alpha(c("#8B3A62", "#6DBCC3"), .25),
                    labels = c("COVID-19", "Normal"),
                    guide = guide_legend(reverse=TRUE)) + 
  scale_x_continuous(name = "",
                     breaks = seq(0, 210, 12)) + 
  scale_y_continuous(name = "",
                     breaks = seq(0, 30, 5), limits = c(0, 30)) + 
  theme(axis.title.y = element_text(color = "grey20", 
                                    size = 17, 
                                    face="bold", 
                                    margin=ggplot2::margin(r=10)),
        axis.line = element_line(colour = "black"),
        axis.text.x = element_text(size = 15),
        axis.text.y = element_text(size = 15),
        panel.grid.major = element_line(colour = "grey90"),
        panel.grid.minor = element_line(colour = "grey90"),
        panel.border = element_blank(),
        panel.background = element_blank(),
        legend.position = "top",
        legend.key = element_rect(fill = "white"),
        legend.text=element_text(size=15))

I was wondering, can b$diagnosis somehow be used to break the x-axis? b$diagnosis corresponds to the specific date of each surgery.

[Figure: expected output]

Data

b <- structure(list(diagnosis = c("2017-10-19", "2017-07-11", "2020-06-30", 
"2020-06-27", "2017-01-04", "2017-12-07", "2017-09-18", "2020-07-27", 
"2020-08-28", "2020-12-29", "2018-04-12", "2020-06-20", "2020-08-29", 
"2018-02-05", "2018-01-12", "2017-07-15", "2018-03-07", "2020-02-29", 
"2019-08-24", "2017-08-08", "2018-11-27", "2017-03-15", "2017-05-12", 
"2020-10-22", "2019-08-31", "2017-11-17", "2019-04-17", "2018-11-15", 
"2018-02-08", "2019-08-09", "2019-10-06", "2017-08-30", "2019-05-09", 
"2017-06-05", "2017-10-04", "2018-01-27", "2017-06-16", "2019-03-29", 
"2017-06-16", "2018-07-19", "2020-04-23", "2020-01-31", "2020-06-27", 
"2019-12-11", "2019-08-13", "2017-05-07", "2020-05-08", "2020-09-05", 
"2019-12-18", "2018-07-24", "2017-07-31", "2017-01-23", "2018-09-08", 
"2018-12-18", "2017-08-01", "2019-04-11", "2017-05-12", "2019-03-15", 
"2019-06-12", "2017-05-10", "2020-10-27", "2018-08-26", "2019-06-03", 
"2020-07-31", "2017-12-02", "2018-11-07", "2018-03-23", "2019-08-18", 
"2019-08-30", "2018-07-23", "2018-08-08", "2018-10-10", "2019-05-26", 
"2017-11-18", "2020-07-19", "2017-02-07", "2017-08-15", "2020-01-05", 
"2019-07-28", "2017-05-28", "2017-01-02", "2018-09-25", "2017-03-26", 
"2017-04-24", "2018-03-26", "2020-12-01", "2018-09-27", "2019-09-26", 
"2017-10-06", "2019-01-11", "2020-08-15", "2017-02-06", "2018-06-07", 
"2018-03-15", "2017-12-17", "2017-02-08", "2019-11-02", "2020-12-05", 
"2017-09-16", "2017-06-18"), cons_week = c(42, 28, 183, 182, 
1, 49, 38, 187, 191, 209, 67, 181, 191, 58, 54, 28, 62, 165, 
138, 32, 100, 11, 19, 199, 139, 46, 120, 98, 58, 136, 144, 35, 
123, 23, 40, 56, 24, 117, 24, 81, 173, 161, 182, 154, 137, 18, 
175, 192, 155, 82, 31, 4, 88, 103, 31, 119, 19, 115, 128, 19, 
200, 86, 127, 187, 48, 97, 64, 137, 139, 82, 84, 93, 125, 46, 
185, 6, 33, 157, 134, 21, 1, 91, 12, 17, 65, 205, 91, 143, 40, 
106, 189, 6, 75, 63, 50, 6, 148, 205, 37, 24), corona = structure(c(2L, 
2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 2L, 2L), .Label = c("C19", "Normal"), class = "factor")), row.names = c(NA, 
-100L), class = c("tbl_df", "tbl", "data.frame"))

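I would suggest converting your cons_week to a date. Since week 1 is the week starting 2017-01-02, offset by (cons_week - 1) weeks, e.g.:

library(dplyr)

lin.model <- b %>% 
  group_by(corona, cons_week) %>%
  summarise(n = n()) %>%
  # week 1 starts on 2017-01-02, so week w starts (w - 1) * 7 days later
  mutate(cons_week_dt = as.Date("2017-01-02") + (cons_week - 1) * 7)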
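
As a quick sanity check on the week-to-date mapping, here is a minimal base R sketch. It assumes, as the question states, that week 1 is the week starting Monday 2017-01-02. `week_start()` is a hypothetical helper name used only for illustration, and the three rows are copied from the question's data.

```r
# Hypothetical helper: first day (Monday) of consecutive week `w`,
# assuming week 1 starts on Monday 2017-01-02.
week_start <- function(w, origin = as.Date("2017-01-02")) {
  origin + (w - 1) * 7
}

# Three rows from the question's data: diagnosis date and its cons_week.
diagnosis <- as.Date(c("2017-01-04", "2017-10-19", "2020-06-30"))
cons_week <- c(1, 42, 183)

starts <- week_start(cons_week)

# Each diagnosis date should fall within the 7 days beginning at its week start.
stopifnot(all(diagnosis >= starts & diagnosis < starts + 7))
format(starts, "%Y-%m-%d")
```

If the check fails on your own data, the week numbering probably uses a different origin or starts counting at 0.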

Then:

ggplot(lin.model,
       aes(x = cons_week_dt, y = n, color = corona, fill = corona)) +
       ...
       scale_x_date(date_breaks = "6 months", date_labels = "%b%Y", expand = c(0.07, 0)) +
       ...
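
As for whether b$diagnosis can be used directly: one possible approach, sketched here under assumptions rather than tested on the full data, is to snap each diagnosis date back to the Monday of its week, count per Monday, and let scale_x_date() place the breaks on calendar months. The small data frame `d` and the column name `week_start` are made up for illustration (the dates are taken from the question), and `%u` is the weekday number from format() with Monday = 1.

```r
library(ggplot2)

# Illustrative data only: a few diagnosis dates taken from the question.
d <- data.frame(
  diagnosis = as.Date(c("2017-01-02", "2017-01-04", "2017-10-19",
                        "2020-06-27", "2020-06-30")),
  corona = c("Normal", "Normal", "Normal", "C19", "C19")
)

# Snap each date back to the Monday of its week (%u: Monday = 1 ... Sunday = 7).
d$week_start <- d$diagnosis - (as.integer(format(d$diagnosis, "%u")) - 1)

# Count surgeries per week and group.
weekly <- aggregate(n ~ week_start + corona,
                    data = transform(d, n = 1), FUN = sum)

p <- ggplot(weekly, aes(x = week_start, y = n, color = corona)) +
  geom_point(size = 5) +
  # Breaks are placed on the date scale, independent of where the weeks start.
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y")
print(p)
```

Because the x variable is a real Date, the weekly points stay at their Mondays while the axis labels follow the calendar, so the week/month mismatch no longer needs special handling.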