Combining a time trend plot with a timeline

I want to create a plot (preferably with ggplot2) that visualizes both a timeline and a time-trend plot.

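As a practical example, I have aggregated unemployment rates per year. I also have a dataset of important legislative changes related to the labor market. I therefore want to create a timeline that shares the same x-axis (time) as the unemployment rate.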

I generated some toy data; see the code below:

set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))


year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit", 
            "Pre-school became free", 
            "Five-day workweek were introduced", 
            "Labor law reform 1976", 
            "Unemployment benefit were cut in half", 
            "Apprenticeship Act allows on-the-job training",
            "Changes in discrimination law",
            "Equal Pay for Equal Work was", 
            "9 weeks vacation were introduced",
            "Unemployment benefit were removed")

imp_event  <- data.frame(year, events)

I can easily plot the time trend across the years:

library(tidyverse)
                      
ggplot(data = un_emp, aes(x = year, y = unemployment)) + 
  geom_line(color = "#FC4E07", size = 0.5) +
  theme_bw()

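But how can I include the events (found in imp_event) in the plot in a nice and effective way?

My goal is to produce something that looks like the timeline here, but combined with the time-trend plot shown above. How can I do that?

I have tried using vline, but I could not add the event labels.

Thanks!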

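You can do this by adding a geom_text() layer. The layer needs both an x and a y value for every label, and imp_event has no unemployment column, so you can't just hand it that data frame as-is and overlay it.

Instead, you can get what you want with a left_join from un_emp to imp_event on year. Because imp_event has only one row per event year, most values of events in the joined df are missing, which is exactly what you want here, since I suspect each event should appear as a label only once.

For example: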

joined_data <- un_emp %>% left_join(imp_event, by = "year")

ggplot(data = joined_data, aes(x = year, y = unemployment)) + 
  geom_line(color = "#FC4E07", size = 0.5) +
  # size is a fixed setting, so it belongs outside aes(); na.rm drops the unlabeled years quietly
  geom_text(aes(label = events), size = 3, na.rm = TRUE) +
  theme_bw() 

This gives you something like this:

You can review the available options and experiment with geom_text() here
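To see why the join yields exactly one label per event, a minimal sketch can inspect the joined data directly. It rebuilds the unemployment toy data from the question and uses a shortened, three-row version of the events table; the names `joined` and `n_labels` are introduced here only for illustration.

```r
library(dplyr)

# Toy unemployment data, built as in the question
set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))

# Shortened events table (assumption: only three of the ten events, for brevity)
imp_event <- data.frame(
  year = c(1957, 1961, 1975),
  events = c("Implemented unemployment benefit",
             "Pre-school became free",
             "Five-day workweek were introduced")
)

# Every year from un_emp is kept; years without an event get NA in `events`
joined <- left_join(un_emp, imp_event, by = "year")

nrow(joined)                           # same number of rows as un_emp
n_labels <- sum(!is.na(joined$events)) # rows that will actually get a label
n_labels
```

Only the rows with a non-missing `events` value produce a label, so each event is drawn once, at the unemployment value of its year.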

I used Jon Spring's solution but replaced geom_segment with geom_vline, and the result is close to what I wanted. The final code looks like this:


joined_data <- un_emp %>% left_join(imp_event, by = "year")

# Keep only the years that have an event, for the marker lines and labels
event_data <- joined_data %>% filter(!is.na(events))

ggplot(data = joined_data, aes(x = year, y = unemployment)) + 
  geom_line(color = "red", size = 0.5) +
  theme_classic() +
  labs(y = "Unemployment rate", 
       x = "Years", 
       caption = "Data from XXXX") +
  # Dashed vertical line at each event year
  geom_vline(data = event_data, aes(xintercept = year),
             color = "gray70", linetype = "dashed") +
  # Wrapped event labels, placed below the line and repelled vertically
  ggrepel::geom_text_repel(data = event_data,
                           aes(y = unemployment - 0.03, label = str_wrap(events, 10)),
                           color = "gray70", direction = "y", size = 2.5,
                           lineheight = 0.7, point.padding = 0.8)

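This produces the following plot:

I would like to award the bounty to @Jon Spring, but I am not sure how to award it to a comment.

I think this should do the trick:

First, I created the axis with hline, using the mean you set for the data as the y-intercept. Then I added a variable "height" to the events data frame; it takes the axis value and adds a value drawn from a normal distribution. I use it to draw the segments that connect the axis to each point. Finally, I flip the y position of the year label so that it always sits on the opposite side of the axis from the segment.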

library(tidyverse)

set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))

year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit", 
            "Pre-school became free", 
            "Five-day workweek were introduced", 
            "Labor law reform 1976", 
            "Unemployment benefit were cut in half", 
            "Apprenticeship Act allows on-the-job training",
            "Changes in discrimination law",
            "Equal Pay for Equal Work was", 
            "9 weeks vacation were introduced",
            "Unemployment benefit were removed")

imp_event  <- data.frame(year, events) %>% 
  mutate(height = mean(unemployment) + rnorm(n(), 0, 0.02))

ggplot(un_emp) +
  
  geom_hline(yintercept = 0.05) +
  
  geom_line(aes(x = year,
                y = unemployment),
            color = "red",
            alpha = 0.3,
            size = 1) +
  
  geom_segment(data = imp_event,
               aes(x = year,
                   xend = year,
                   y = 0.05,
                   yend = height)) +
  
  geom_text(data = imp_event,
            aes(label = year, 
                x = year,
                y = 0.05 + 0.002 * sign(0.05 - height)), 
            angle = 90, 
            size = 3.5, 
            fontface = "bold",
            check_overlap = TRUE) +
  
  geom_point(data = imp_event,
             aes(x = year,
                 y = height,
                 fill = as.factor(events)),
             shape = 21,
             size = 4) +
  
  scale_x_continuous(name = NULL, 
                     labels = NULL) +
  
  scale_fill_discrete(name = "Event") +
  
  scale_y_continuous(name = "Unemployment Rate") +
  
  theme_bw() + 
  
  theme(panel.border = element_blank(),
        axis.line.y  = element_line(),
        axis.ticks.x = element_blank(),
        panel.grid = element_blank(),
        legend.position="bottom")
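
The label placement in the code above relies on sign() to put each year label on the opposite side of the axis from its point. A minimal sketch of that rule in isolation, using made-up heights (the names `axis_y`, `height`, and `label_y` are illustrative and not part of the plot code):

```r
axis_y <- 0.05                  # y position of the horizontal axis line
height <- c(0.07, 0.03, 0.05)   # a point above, below, and exactly on the axis

# sign() returns -1, 1, or 0, so the label is nudged away from the point
label_y <- axis_y + 0.002 * sign(axis_y - height)
label_y
```

A point above the axis pushes its label just below the axis, and a point below pushes it just above. A point that falls exactly on the axis leaves the label on the axis, which is unlikely with randomly drawn heights but worth knowing if the heights are set by hand.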