How can I restructure data with one observation per row into data with one row per ID (and multiple columns) in R?

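Suppose I have a data frame with 3 ID columns and one column of interest. Each row represents one observation. Some IDs have multiple observations, i.e. multiple rows.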

df <- data.frame(id1 = c(  1,   2,   3,   4,   4), 
                 id2 = c( 11,  12,  13,  14,  14), 
                 id3 = c(111, 112, 113, 114, 114), 
                 variable_of_interest = c(13, 24, 35, 31, 12))

  id1 id2 id3 variable_of_interest
1   1  11 111                   13
2   2  12 112                   24
3   3  13 113                   35
4   4  14 114                   31
5   4  14 114                   12

My goal is to restructure it so that there is one row per ID, keeping the 3 ID columns and naming the new columns "variable_of_interest1", "variable_of_interest2":

  id1 id2 id3 variable_of_interest1 variable_of_interest2
1   1  11 111                    13                    NA
2   2  12 112                    24                    NA
3   3  13 113                    35                    NA
4   4  14 114                    31                    12

The solution probably requires the dcast function from reshape2, but so far I have not been able to work it out.

We can create a sequence grouped by the ID columns and then reshape to wide format with pivot_wider:

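library(dplyr)
library(stringr)
library(tidyr)
library(data.table)  # provides rowid()

df %>% 
  # number the observations within each id1/id2/id3 combination and build the new column names
  mutate(ind = str_c('variable_of_interest', rowid(id1, id2, id3))) %>% 
  # spread those names into columns; combinations with no observation become NA
  pivot_wider(names_from = ind, values_from = variable_of_interest)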

-output

# A tibble: 4 x 5
#    id1   id2   id3 variable_of_interest1 variable_of_interest2
#  <dbl> <dbl> <dbl>                 <dbl>                 <dbl>
#1     1    11   111                    13                    NA
#2     2    12   112                    24                    NA
#3     3    13   113                    35                    NA
#4     4    14   114                    31                    12
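
If you would rather not load data.table just for rowid(), the same grouped sequence can probably be built with dplyr alone. The following is a minimal sketch, not a tested drop-in replacement: the helper column name obs and the result name wide are my own, and it relies on the names_prefix argument of pivot_wider to build the column names.

```r
library(dplyr)
library(tidyr)

df <- data.frame(id1 = c(  1,   2,   3,   4,   4),
                 id2 = c( 11,  12,  13,  14,  14),
                 id3 = c(111, 112, 113, 114, 114),
                 variable_of_interest = c(13, 24, 35, 31, 12))

wide <- df %>%
  group_by(id1, id2, id3) %>%
  # obs (hypothetical name) counts the observations within each ID combination
  mutate(obs = row_number()) %>%
  ungroup() %>%
  # names_prefix prepends the text to each obs value to form the new column names
  pivot_wider(names_from = obs,
              values_from = variable_of_interest,
              names_prefix = "variable_of_interest")

wide
```

Here row_number() inside group_by() plays the role of rowid(), so the result should have the same columns as the output above.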

Another option is data.table:

library(data.table)
# setDT() converts df to a data.table by reference; rowid() numbers rows within each ID group
dcast(setDT(df), id1 + id2 + id3 ~ 
        paste0('variable_of_interest', rowid(id1, id2, id3)),
      value.var = 'variable_of_interest')

-output

#    id1 id2 id3 variable_of_interest1 variable_of_interest2
#1:   1  11 111                    13                    NA
#2:   2  12 112                    24                    NA
#3:   3  13 113                    35                    NA
#4:   4  14 114                    31                    12
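
For completeness, the same idea (a per-group counter, then a long-to-wide reshape) can be sketched in base R without any packages. This is only an illustration under my own naming: obs and wide_base are hypothetical names, ave() builds the counter and reshape() does the widening.

```r
df <- data.frame(id1 = c(  1,   2,   3,   4,   4),
                 id2 = c( 11,  12,  13,  14,  14),
                 id3 = c(111, 112, 113, 114, 114),
                 variable_of_interest = c(13, 24, 35, 31, 12))

# obs (hypothetical name): running count of observations within each ID combination
df$obs <- ave(seq_len(nrow(df)), df$id1, df$id2, df$id3, FUN = seq_along)

# sep = "" makes the new names variable_of_interest1, variable_of_interest2
wide_base <- reshape(df,
                     idvar = c("id1", "id2", "id3"),
                     timevar = "obs",
                     direction = "wide",
                     sep = "")

wide_base
```

The result is a plain data.frame (with an extra reshapeWide attribute), which may be convenient if the rest of the script does not use tibbles or data.tables.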