Making a line graph from summary data of multiple columns

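I have a data frame containing the appraised values of houses in different years. It is laid out with one column per year, so each year can be summarized like this: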

> summary(realact$tot_appr_val_2016)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      58822     126440     288633     217916 1132770203       9856 
> summary(realact$tot_appr_val_2017)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      66107     138759     302922     231039 1132096090      14936 
> summary(realact$tot_appr_val_2018)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      70000     144000     309053     235198 1464720000      20640 

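I would like to see how these change over time by plotting the mean and the max as a line graph, with year on the x-axis and value on the y-axis. Here is some dummy data that approximates the structure of my dataset: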

house_id = c("id1", "id2", "id3", "id4", "id5", "id6")
value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000)
value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000)
value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
df = data.frame(house_id, value_2016, value_2017, value_2018)

You can try reshaping to long format, summarising by year, and then plotting:

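library(dplyr)
library(ggplot2)
library(reshape2)
library(stringr)  # provides str_remove()

df %>%
  # wide -> long: one row per house and year
  reshape2::melt(id = 'house_id',
                 variable.name = "year") %>%
  # "value_2016" -> "2016"
  mutate(year = str_remove(year, "value_")) %>%
  group_by(year) %>%
  # na.rm = TRUE because the real columns contain NA's
  summarize(mean_val = mean(value, na.rm = TRUE),
            max_val = max(value, na.rm = TRUE)) %>%
  mutate(year = as.numeric(year)) %>%
  # long again: one row per year and statistic
  reshape2::melt(id = 'year',
                 variable.name = 'type') %>%
  ggplot(aes(x = year, y = value, group = type, color = type)) +
  geom_line()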
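
As a possible alternative, the same reshape can be sketched with tidyr's `pivot_longer()` instead of reshape2, which its maintainer has marked as retired. This assumes tidyr >= 1.1.0 (for `names_transform`) and dplyr >= 1.0.0 (for `.groups`). The name `summary_long` and the log-scaled y-axis are my own choices, not part of the answer above. The log scale is there only because the summaries in the question show a max in the billions next to a mean of around 300,000, which would flatten the mean line on a linear axis.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Dummy data from the question
df <- data.frame(
  house_id   = c("id1", "id2", "id3", "id4", "id5", "id6"),
  value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000),
  value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000),
  value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
)

# summary_long is a hypothetical name for the per-year summary table
summary_long <- df %>%
  # wide -> long, stripping the "value_" prefix and parsing the year
  pivot_longer(cols = starts_with("value_"),
               names_to = "year",
               names_prefix = "value_",
               names_transform = list(year = as.integer),
               values_to = "value") %>%
  group_by(year) %>%
  # na.rm = TRUE so that missing appraisals do not turn the result into NA
  summarise(mean_val = mean(value, na.rm = TRUE),
            max_val  = max(value, na.rm = TRUE),
            .groups = "drop") %>%
  # one row per year and statistic, ready for a color aesthetic
  pivot_longer(cols = c(mean_val, max_val),
               names_to = "type",
               values_to = "value")

p <- ggplot(summary_long, aes(x = year, y = value, color = type)) +
  geom_line() +
  geom_point() +
  # show only the years that actually occur, not fractional breaks
  scale_x_continuous(breaks = unique(summary_long$year)) +
  # assumption: a log axis keeps mean and max both readable
  scale_y_log10()

print(p)
```

If the two statistics should stay on linear axes, dropping `scale_y_log10()` and adding `facet_wrap(~ type, scales = "free_y")` is another option.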