只为选定的列旋转更长的时间

pivot longer for only selected columns

我有一个包含 23 列的宽数据集。我想 select 某些列并将它们调整为行(长格式),但只有这些 selected 列。这是我的数据集的示例:

# A tibble: 2 x 23
   year   popu    dd popmale ddmale popfemale ddfemale pop40  dd40 pop41_50 dd41_50 pop51_60 dd51_60 pop61_70 dd61_70 pop71_80 dd71_80 pop81_90

1  2011 197548  2167   98145   1302     99403     1302 56822    52    27614      88    33368     384    30477     683    25418     630    14961
2  2012 200724  2250   99783   1354    100941      896 58646    54    28256      91    34111     400    30919     705    25718     655    14862
# ... with 5 more variables: dd81_90 <dbl>, pop91_100 <dbl>, dd91_100 <dbl>, pop100 <dbl>, dd100 <dbl>

df<-structure(list(year = c(2011, 2012), popu = c(197548, 200724), 
    dd = c(2167, 2250), popmale = c(98145, 99783), ddmale = c(1302, 
    1354), popfemale = c(99403, 100941), ddfemale = c(1302, 896
    ), pop40 = c(56822, 58646), dd40 = c(52, 54), pop41_50 = c(27614, 
    28256), dd41_50 = c(88, 91), pop51_60 = c(33368, 34111), 
    dd51_60 = c(384, 400), pop61_70 = c(30477, 30919), dd61_70 = c(683, 
    705), pop71_80 = c(25418, 25718), dd71_80 = c(630, 655), 
    pop81_90 = c(14961, 14862), dd81_90 = c(288, 288), pop91_100 = c(7210, 
    6746), dd91_100 = c(54, 55), pop100 = c(1678, 1466), dd100 = c(1, 
    2)), row.names = 1:2, class = "data.frame")  

在上面的 DF 中,每个年龄类别都有不同的人口列(例如 pop41_50)和事件列(dd41_50)。

我想创建一个格式更长的数据框,它将年龄类别作为值放在一列中,并将人口和事件数量也作为值,如下所示:

   year   popu    dd    popmale ddmale    popfemale ddfemale  age_cate  pop_age  event_age
1  2011 197548    2167  98145   1302      99403     1302      40        56822     52
2  2011 197548    2167  98145   1302      99403     1302      41_50     27614     88
3  2011 197548    2167  98145   1302      99403     1302      51_60     33368     384
4  2011 197548    2167  98145   1302      99403     1302      61_70     30477     683
5  2011 197548    2167  98145   1302      99403     1302      71_80     25418     630
etc.

我尝试了以下脚本,但这将所有内容都放在一列中,这不是我想要的输出。

pivot_longer(df, -c(year, popu, dd), values_to = "number", names_to = "category")

非常感谢!

一个选项是先重命名列,然后在第二个下划线处拆分。

library(tidyverse)

df %>% 
  rename_with(~ str_replace(., "dd", "event_age_"), dd40:dd100) %>% 
  rename_with(~ str_replace(., "pop", "pop_age_"), pop40:pop100) %>% 
  tidyr::pivot_longer(., 
                      cols = -c("year", "popu", "dd","popmale","ddmale","popfemale","ddfemale"), 
                      names_to = c('.value', 'age_cate'), 
                      names_pattern = "^([^_]*_[^_]*)_(.*)")

输出

    year   popu    dd popmale ddmale popfemale ddfemale age_cate pop_age event_age
   <dbl>  <dbl> <dbl>   <dbl>  <dbl>     <dbl>    <dbl> <chr>      <dbl>     <dbl>
 1  2011 197548  2167   98145   1302     99403     1302 40         56822        52
 2  2011 197548  2167   98145   1302     99403     1302 41_50      27614        88
 3  2011 197548  2167   98145   1302     99403     1302 51_60      33368       384
 4  2011 197548  2167   98145   1302     99403     1302 61_70      30477       683
 5  2011 197548  2167   98145   1302     99403     1302 71_80      25418       630
 6  2011 197548  2167   98145   1302     99403     1302 81_90      14961       288
 7  2011 197548  2167   98145   1302     99403     1302 91_100      7210        54
 8  2011 197548  2167   98145   1302     99403     1302 100         1678         1
 9  2012 200724  2250   99783   1354    100941      896 40         58646        54
10  2012 200724  2250   99783   1354    100941      896 41_50      28256        91
11  2012 200724  2250   99783   1354    100941      896 51_60      34111       400
12  2012 200724  2250   99783   1354    100941      896 61_70      30919       705
13  2012 200724  2250   99783   1354    100941      896 71_80      25718       655
14  2012 200724  2250   99783   1354    100941      896 81_90      14862       288
15  2012 200724  2250   99783   1354    100941      896 91_100      6746        55
16  2012 200724  2250   99783   1354    100941      896 100         1466         2