Making a line graph from summary data of multiple columns
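I have a data frame containing the appraised values of houses in different years. It is formatted so that each year has its own column, so the columns can be summarized like this: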
> summary(realact$tot_appr_val_2016)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's
         0      58822     126440     288633     217916 1132770203       9856
> summary(realact$tot_appr_val_2017)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's
         0      66107     138759     302922     231039 1132096090      14936
> summary(realact$tot_appr_val_2018)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's
         0      70000     144000     309053     235198 1464720000      20640
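I would like to see how these values change over time by plotting the mean and the maximum as a line graph, with the year on the x-axis and the value on the y-axis. Here is some dummy data that approximates the structure of my data set: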
house_id = c("id1", "id2", "id3", "id4", "id5", "id6")
value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000)
value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000)
value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
df = data.frame(house_id, value_2016, value_2017, value_2018)
You can try this:
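library(dplyr)
library(ggplot2)
library(reshape2)
library(stringr)  # provides str_remove()

df %>%
  # wide to long: one row per house and year
  reshape2::melt(id = 'house_id',
                 variable.name = "year") %>%
  mutate(year = str_remove(year, "value_")) %>%
  group_by(year) %>%
  # na.rm = TRUE because the real columns contain NAs
  summarize(mean_val = mean(value, na.rm = TRUE),
            max_val = max(value, na.rm = TRUE)) %>%
  mutate(year = as.numeric(year)) %>%
  # long again: one row per year and statistic
  reshape2::melt(id = 'year',
                 variable.name = 'type') %>%
  ggplot(aes(x = year, y = value, group = type, color = type)) +
  geom_line()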
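
If you want to check the numbers that end up in the plot, the per-year summary can also be computed with base R alone. This is only a minimal sketch on the dummy data from the question; the names `value_cols` and `yearly` are made up for illustration, and `na.rm = TRUE` is an assumption about how the NAs in the real data should be treated.

```r
# Dummy data from the question
df <- data.frame(
  house_id   = c("id1", "id2", "id3", "id4", "id5", "id6"),
  value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000),
  value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000),
  value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
)

# Pick out the per-year columns by their shared prefix
value_cols <- grep("^value_", names(df), value = TRUE)

# One row per year: the year parsed from the column name, plus mean and max
# (na.rm = TRUE is an assumption: it simply drops missing values)
yearly <- data.frame(
  year     = as.numeric(sub("^value_", "", value_cols)),
  mean_val = unname(sapply(df[value_cols], mean, na.rm = TRUE)),
  max_val  = unname(sapply(df[value_cols], max, na.rm = TRUE))
)

print(yearly)  # e.g. the 2017 row has mean 8301 and max 30004
```

The `yearly` table holds the same values as the `summarize()` step above, before it is melted a second time for ggplot.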