pivot_longer for multiple sets having the same names_to

I am trying to use pivot_longer on several sets of variables at once, but I cannot work out the correct syntax from the examples.

My dummy dataset is:

library(dplyr)
library(tidyr)

ID <- c("id-1", "id-2", "id-3")
State <- c("MD", "MD", "VA")
Time1Day <- c(1, 12, 30)
Time1Month <- c(1, 4, 5)
Time2Day <- c(9, 21, 13)
Time2Month <- c(12, 4, 5)
Time3Day <- c(7, 14, NA)
Time3Month <- c(1, 2, NA)

df <- data.frame(ID, State, Time1Day, Time1Month, Time2Day, Time2Month, Time3Day, Time3Month)

The result I want is:

    ID State  Time Day Month
1 id-1    MD Time1   1     1
2 id-1    MD Time2   9    12
3 id-1    MD Time3   7     1
4 id-2    MD Time1  12     4
5 id-2    MD Time2  21     4
6 id-2    MD Time3  14     2
7 id-3    VA Time1  30     5
8 id-3    VA Time2  13     5

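I looked at a couple of existing examples to try to get the correct syntax and attempted the two solutions below, but I could not get either of them to work: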

df.long <- df %>% 
  pivot_longer(cols = starts_with("Time"), names_to = c("Day", "Month"), names_sep="(?=[0-9])"), values_to = "Time", values_drop_na = TRUE)

df.long <- df %>% 
  pivot_longer(cols = ends_with("Day"), names_to = c("Time"), values_to = "Days", values_drop_na = TRUE) %>% 
  pivot_longer(cols = ends_with("Month"), names_to = c("Time"), values_to = "Months", values_drop_na = TRUE)

Any suggestions on what I am missing and how to fix it would be appreciated.

A data.table approach

library(data.table)
# melt to long
DT <- melt(setDT(df), id.vars = c("ID", "State"), variable.factor = FALSE, na.rm = TRUE)
# split variable string
DT[, c("Time", "part2") := tstrsplit(variable, "(?<=[0-9])", perl=TRUE)]
# recast to wide
dcast(DT, ID + State + Time ~ part2, value.var = "value", drop = TRUE)
#      ID State  Time Day Month
# 1: id-1    MD Time1   1     1
# 2: id-1    MD Time2   9    12
# 3: id-1    MD Time3   7     1
# 4: id-2    MD Time1  12     4
# 5: id-2    MD Time2  21     4
# 6: id-2    MD Time3  14     2
# 7: id-3    VA Time1  30     5
# 8: id-3    VA Time2  13     5
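
As an alternative sketch, melt() can also build both value columns in one step through patterns(), which avoids the dcast() round trip. This assumes the Day and Month columns appear in the same Time1, Time2, Time3 order, because the index that melt() generates is positional rather than taken from the column names. The data frame is rebuilt here only so the block runs on its own.

```r
library(data.table)

# Same dummy data as in the question, rebuilt so the block is self-contained
df <- data.frame(
  ID = c("id-1", "id-2", "id-3"),
  State = c("MD", "MD", "VA"),
  Time1Day = c(1, 12, 30), Time1Month = c(1, 4, 5),
  Time2Day = c(9, 21, 13), Time2Month = c(12, 4, 5),
  Time3Day = c(7, 14, NA), Time3Month = c(1, 2, NA)
)

# Melt the Day columns and the Month columns in parallel;
# "Time" receives the position (1, 2, 3) of each column within its group
DT <- melt(
  as.data.table(df),
  id.vars = c("ID", "State"),
  measure.vars = patterns("Day$", "Month$"),
  value.name = c("Day", "Month"),
  variable.name = "Time",
  na.rm = TRUE
)

# Assumption: positions 1..3 correspond to Time1..Time3 (column order matters)
DT[, Time := paste0("Time", Time)]
setorder(DT, ID, Time)
DT
```

If the columns were not in a consistent order, the split-and-dcast version above would be the safer choice, since it reads the Time label from the name itself.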

Edit: added values_drop_na = TRUE, thanks to TarJae's comment.

You could use

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(-c(ID, State), 
               names_to = c("Time", ".value"),
               names_pattern = "(Time\\d)(.*)",
               values_drop_na = TRUE)

This returns

# A tibble: 8 x 5
  ID    State Time    Day Month
  <chr> <chr> <chr> <dbl> <dbl>
1 id-1  MD    Time1     1     1
2 id-1  MD    Time2     9    12
3 id-1  MD    Time3     7     1
4 id-2  MD    Time1    12     4
5 id-2  MD    Time2    21     4
6 id-2  MD    Time3    14     2
7 id-3  VA    Time1    30     5
8 id-3  VA    Time2    13     5
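
To see what the two capture groups in names_pattern pick up, here is a small base R sketch that applies the same regular expression to the column names with sub(). The first group supplies the values of the Time column, and the second group, which is mapped to ".value", supplies the names of the output columns (Day and Month). The variable names cols, pattern and parts are only for illustration.

```r
# Column names that pivot_longer() has to split
cols <- c("Time1Day", "Time1Month", "Time2Day",
          "Time2Month", "Time3Day", "Time3Month")

# Same pattern as in the answer: group 1 = "Time" + digit, group 2 = the rest
pattern <- "(Time\\d)(.*)"

parts <- data.frame(
  column = cols,
  Time   = sub(pattern, "\\1", cols),  # goes into the Time column
  value  = sub(pattern, "\\2", cols)   # becomes the output column name
)
parts
```

Each distinct entry in the second group becomes its own column, which is why the result has one Day and one Month column instead of a single value column.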