R code (Rstats): calculating the unemployment rate from columns in long-form data

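I am trying to calculate the unemployment rate from the data below and add it as new rows to the data table. For each date, I want to divide the number of unemployed by the labour force and add each resulting data point as a row.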

Essentially, I am trying to go from this

date series_1 value
2021-01-01 labourforce 13793
2021-02-01 labourforce 13812
2021-03-01 labourforce 13856
2021-01-01 unemployed 875
2021-02-01 unemployed 805
2021-03-01 unemployed 778

to this

date series_1 value
2021-01-01 labourforce 13793
2021-02-01 labourforce 13812
2021-03-01 labourforce 13856
2021-01-01 unemployed 875
2021-02-01 unemployed 805
2021-03-01 unemployed 778
2021-01-01 unemploymentrate 6.3
2021-02-01 unemploymentrate 5.8
2021-03-01 unemploymentrate 5.6

This is my code so far. I know the last line is wrong. Any suggestions or ideas are welcome!

longdata %>% 
  group_by(date) %>%
  summarise(series_1 = 'unemploymentrate',
  value = series_1$unemployed/series_1$labourforce))

For each date, you can compute the ratio of 'unemployed' to 'labourforce' (as a percentage) and add it as new rows to the original dataset.

library(dplyr)

df %>% 
  group_by(date) %>%
  summarise(value = value[series_1 == 'unemployed']/value[series_1 == 'labourforce'] * 100, 
            series_1 = 'unemploymentrate') %>%
  bind_rows(df) %>%
  arrange(series_1)

#   date          value series_1        
#  <chr>         <dbl> <chr>           
#1 2021-01-01 13793    labourforce     
#2 2021-02-01 13812    labourforce     
#3 2021-03-01 13856    labourforce     
#4 2021-01-01   875    unemployed      
#5 2021-02-01   805    unemployed      
#6 2021-03-01   778    unemployed      
#7 2021-01-01     6.34 unemploymentrate
#8 2021-02-01     5.83 unemploymentrate
#9 2021-03-01     5.61 unemploymentrate
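
As a cross-check on the arithmetic (for 2021-01-01, 875 / 13793 * 100 ≈ 6.34), here is a minimal base R sketch of the same idea. It matches the two series explicitly on `date` with `merge()` rather than relying on each group containing exactly one row per series. The names `rates`, `new_rows` and `result` are illustrative only, and the rounding to two decimals is an assumption made to mirror the printed output.

```r
# Sample data from the question (values stored as doubles for simplicity)
df <- data.frame(
  date = rep(c("2021-01-01", "2021-02-01", "2021-03-01"), times = 2),
  series_1 = rep(c("labourforce", "unemployed"), each = 3),
  value = c(13793, 13812, 13856, 875, 805, 778),
  stringsAsFactors = FALSE
)

# Split the long data into the two series
unemployed  <- df[df$series_1 == "unemployed", ]
labourforce <- df[df$series_1 == "labourforce", ]

# Join on date; dates missing from either series are dropped by the inner join
rates <- merge(unemployed, labourforce, by = "date",
               suffixes = c("_unemployed", "_labourforce"))

# Build the new rows with the same column names as df
new_rows <- data.frame(
  date = rates$date,
  series_1 = "unemploymentrate",
  value = round(rates$value_unemployed / rates$value_labourforce * 100, 2),
  stringsAsFactors = FALSE
)

# Append the rate rows to the original long data
result <- rbind(df, new_rows)
print(result)
```

Because the join is on `date`, this version does not depend on row order, and a date that lacks one of the two series simply produces no rate row.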

Try:

library(dplyr)
library(tidyr)

 
df %>% 
  # one column per series, one row per date
  pivot_wider(names_from = series_1, values_from = value) %>% 
  # unemployment rate in percent, rounded to two decimals
  mutate(unemploymentrate = round(unemployed * 100 / labourforce, 2)) %>% 
  # back to long format, keeping the date column (column 1) as the key
  pivot_longer(-1, names_to = "series_1", values_to = "value") %>%
  # factor levels fix the display order of the three series
  mutate(series_1 = factor(series_1, levels = c("labourforce", "unemployed", "unemploymentrate"))) %>% 
  arrange(series_1, date)

#> # A tibble: 9 x 3
#>   date       series_1            value
#>   <chr>      <fct>               <dbl>
#> 1 2021-01-01 labourforce      13793   
#> 2 2021-02-01 labourforce      13812   
#> 3 2021-03-01 labourforce      13856   
#> 4 2021-01-01 unemployed         875   
#> 5 2021-02-01 unemployed         805   
#> 6 2021-03-01 unemployed         778   
#> 7 2021-01-01 unemploymentrate     6.34
#> 8 2021-02-01 unemploymentrate     5.83
#> 9 2021-03-01 unemploymentrate     5.61

Created on 2021-04-23 by the reprex package (v2.0.0)

Data

df <- structure(list(
  date = c("2021-01-01", "2021-02-01", "2021-03-01",
           "2021-01-01", "2021-02-01", "2021-03-01"),
  series_1 = c("labourforce", "labourforce", "labourforce",
               "unemployed", "unemployed", "unemployed"),
  value = c(13793L, 13812L, 13856L, 875L, 805L, 778L)
), class = "data.frame", row.names = c(NA, -6L))