R code (Rstats): calculating the unemployment rate from columns in long-format data
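I'm trying to calculate the unemployment rate from the data below and add it to the data table as new rows. For each date, I want to divide the number of unemployed by the labour force and add each resulting data point as a row.
Essentially, I'm trying to go from this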
date | series_1 | value |
---|---|---|
2021-01-01 | labourforce | 13793 |
2021-02-01 | labourforce | 13812 |
2021-03-01 | labourforce | 13856 |
2021-01-01 | unemployed | 875 |
2021-02-01 | unemployed | 805 |
2021-03-01 | unemployed | 778 |
to this
date | series_1 | value |
---|---|---|
2021-01-01 | labourforce | 13793 |
2021-02-01 | labourforce | 13812 |
2021-03-01 | labourforce | 13856 |
2021-01-01 | unemployed | 875 |
2021-02-01 | unemployed | 805 |
2021-03-01 | unemployed | 778 |
2021-01-01 | unemploymentrate | 6.3 |
2021-02-01 | unemploymentrate | 5.8 |
2021-03-01 | unemploymentrate | 5.6 |
This is my code so far. I know the last line is wrong. Any suggestions or ideas are welcome!
longdata %>%
group_by(date) %>%
summarise(series_1 = 'unemploymentrate',
value = series_1$unemployed/series_1$labourforce))
For each date, you can compute the ratio of 'unemployed' to 'labourforce' and add it to the original dataset as new rows.
library(dplyr)
df %>%
  group_by(date) %>%
  # one row per date: unemployed / labourforce, expressed as a percentage
  summarise(value = value[series_1 == 'unemployed'] / value[series_1 == 'labourforce'] * 100,
            series_1 = 'unemploymentrate') %>%
  # append the original rows, then order by series
  bind_rows(df) %>%
  arrange(series_1)
#  date          value series_1
#  <chr>         <dbl> <chr>
#1 2021-01-01 13793    labourforce
#2 2021-02-01 13812    labourforce
#3 2021-03-01 13856    labourforce
#4 2021-01-01   875    unemployed
#5 2021-02-01   805    unemployed
#6 2021-03-01   778    unemployed
#7 2021-01-01     6.34 unemploymentrate
#8 2021-02-01     5.83 unemploymentrate
#9 2021-03-01     5.61 unemploymentrate
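The output above keeps the full precision of the ratio, while the table in the question shows one decimal place (6.3, 5.8, 5.6). A minimal sketch of the same approach with the rate rounded to one decimal, assuming that this is the precision wanted; the name `result` and the extra sort on `date` are illustrative additions, not part of the original answer:

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(
  date = rep(c("2021-01-01", "2021-02-01", "2021-03-01"), times = 2),
  series_1 = rep(c("labourforce", "unemployed"), each = 3),
  value = c(13793L, 13812L, 13856L, 875L, 805L, 778L),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(date) %>%
  # round the percentage to one decimal place (assumed target precision)
  summarise(value = round(value[series_1 == "unemployed"] /
                            value[series_1 == "labourforce"] * 100, 1),
            series_1 = "unemploymentrate") %>%
  bind_rows(df) %>%
  arrange(series_1, date)

print(result)
```

Rounding inside `summarise()` changes the stored values, so if the full precision is needed later it may be preferable to round only when printing.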
Try:
library(dplyr)
library(tidyr)
df %>%
  # one column per series, one row per date
  pivot_wider(names_from = series_1, values_from = value) %>%
  mutate(unemploymentrate = round(unemployed * 100 / labourforce, 2)) %>%
  # back to long format, keeping the date column as the key
  pivot_longer(-1, names_to = "series_1", values_to = "value") %>%
  mutate(series_1 = factor(series_1, levels = c("labourforce", "unemployed", "unemploymentrate"))) %>%
  arrange(series_1, date)
#> # A tibble: 9 x 3
#>   date       series_1            value
#>   <chr>      <fct>               <dbl>
#> 1 2021-01-01 labourforce      13793
#> 2 2021-02-01 labourforce      13812
#> 3 2021-03-01 labourforce      13856
#> 4 2021-01-01 unemployed         875
#> 5 2021-02-01 unemployed         805
#> 6 2021-03-01 unemployed         778
#> 7 2021-01-01 unemploymentrate     6.34
#> 8 2021-02-01 unemploymentrate     5.83
#> 9 2021-03-01 unemploymentrate     5.61
Created on 2021-04-23 by the reprex package (v2.0.0)
Data
df <- structure(list(date = c("2021-01-01", "2021-02-01", "2021-03-01",
"2021-01-01", "2021-02-01", "2021-03-01"), series_1 = c("labourforce",
"labourforce", "labourforce", "unemployed", "unemployed", "unemployed"
), value = c(13793L, 13812L, 13856L, 875L, 805L, 778L)), class = "data.frame", row.names = c(NA,
-6L))
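For comparison, the same rows can be produced without any packages. The following is only a sketch in base R, not something taken from either answer: it splits the data by series, joins the two pieces on `date` with `merge()`, and appends the computed rate with `rbind()`. The helper names `lf`, `un`, `wide`, `rate` and `result` are illustrative.

```r
# Same sample data as above, built with data.frame() for readability
df <- data.frame(
  date = rep(c("2021-01-01", "2021-02-01", "2021-03-01"), times = 2),
  series_1 = rep(c("labourforce", "unemployed"), each = 3),
  value = c(13793L, 13812L, 13856L, 875L, 805L, 778L),
  stringsAsFactors = FALSE
)

# Split the long data into one data frame per series
lf <- df[df$series_1 == "labourforce", c("date", "value")]
un <- df[df$series_1 == "unemployed", c("date", "value")]

# Inner join on date; dates missing either series are dropped
wide <- merge(un, lf, by = "date", suffixes = c("_unemployed", "_labourforce"))

# Rate as a percentage, rounded to two decimals
rate <- data.frame(
  date = wide$date,
  series_1 = "unemploymentrate",
  value = round(wide$value_unemployed / wide$value_labourforce * 100, 2),
  stringsAsFactors = FALSE
)

# Append the new rows to the original long data
result <- rbind(df, rate)
print(result)
```

Because `merge()` performs an inner join by default, a date that has only one of the two series simply produces no rate row instead of an error.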