Assistance with reshaping intensive longitudinal data in R

Question

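Apologies if I am posting this incorrectly. I am new to R and this is my first time posting on Stack Overflow. I have read as much as I can to find a solution to my problem, but I have not been able to find anything I can use.

I have some intensive longitudinal data that I am trying to reshape. It is currently in wide format and looks like this: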

Participant   D1_1_1   D1_1_2   D1_1_3   D1_1_4    D2_1_1   D2_1_2  etc...
P1               6        2        3        5        1         2
P2               4        9        3        6        4         1
P3               7        4        2        8        1         1
P4               1        5        1        1        6         7 
P5               2        0        8        2        1         4
etc..

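Each column holds the responses to a particular survey item, on a particular day, at a particular time of day.

So:

D1_1_1 = Day 1, time 1, item 1

D1_1_2 = Day 1, time 1, item 2

...

D4_3_7 = Day 4, time 3, item 7

In total, the data consist of 60 participants answering 11 items, 4 times a day, for 10 days (440 data points per participant).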
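As a quick sanity check on the naming scheme described above, here is a minimal base R sketch that generates the column names the design implies. The variable names (n_days, n_times, n_items, expected_cols) are made up for illustration, and it assumes every day/time/item combination is present in the data.

```r
# Assumed design, taken from the question: 10 days, 4 times per day, 11 items
n_days  <- 10
n_times <- 4
n_items <- 11

# expand.grid varies its first argument fastest,
# so items cycle within times, and times within days
grid <- expand.grid(item = seq_len(n_items),
                    time = seq_len(n_times),
                    day  = seq_len(n_days))

# Build names of the form D<day>_<time>_<item>
expected_cols <- sprintf("D%d_%d_%d", grid$day, grid$time, grid$item)

length(expected_cols)    # response columns per participant
head(expected_cols, 3)   # first few generated names
```

Something like setdiff(expected_cols, names(df)) could then be used to list any columns missing from the real data frame.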

I am looking for help converting this efficiently into long format, so that it looks something like this:

Participant     Day     time    item 1   item 2 ... item 11
P1               1        1        6        2
P1               1        2        X        X
P1               1        3        X        X
P1               1        4        X        X
P1               2        1        1        4
etc..

Where X is the participant's response to the given survey item on the given day at the given time.

Any help would be greatly appreciated!

Cheers

Answer 1

Here is one way, using pivot_longer + pivot_wider:

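library(dplyr)
library(tidyr)

# Split each column name into Day, Time and Item, then spread the items back out.
# Backslashes must be doubled inside an R string literal ("\\d", not "\d").
pivot_longer(df, cols = -Participant, names_to = c("Day", "Time", "Item"), 
                 names_pattern = "D(\\d+)_(\\d+)_(\\d+)") %>%
    mutate(Item = paste0("Item", Item)) %>%
    pivot_wider(names_from = Item, values_from = value)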

# A tibble: 10 x 7
#   Participant Day   Time  Item1 Item2 Item3 Item4
#   <fct>       <chr> <chr> <int> <int> <int> <int>
# 1 P1          1     1         6     2     3     5
# 2 P1          2     1         1     2    NA    NA
# 3 P2          1     1         4     9     3     6
# 4 P2          2     1         4     1    NA    NA
# 5 P3          1     1         7     4     2     8
# 6 P3          2     1         1     1    NA    NA
# 7 P4          1     1         1     5     1     1
# 8 P4          2     1         6     7    NA    NA
# 9 P5          1     1         2     0     8     2
#10 P5          2     1         1     4    NA    NA
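In the output above, Day and Time come back as character columns. If integers are preferred, one option is the names_transform argument of pivot_longer (available in tidyr 1.1.0 and later). This is a minimal sketch on a small hypothetical data frame, df_small, laid out like the question's data; it is a variation on the answer, not part of it.

```r
library(dplyr)
library(tidyr)

# Small hypothetical data frame in the same wide layout as the question
df_small <- data.frame(
  Participant = c("P1", "P2"),
  D1_1_1 = c(6L, 4L),
  D1_1_2 = c(2L, 9L),
  D2_1_1 = c(1L, 4L),
  D2_1_2 = c(2L, 1L)
)

long <- pivot_longer(df_small, cols = -Participant,
                     names_to = c("Day", "Time", "Item"),
                     names_pattern = "D(\\d+)_(\\d+)_(\\d+)",
                     # convert the captured day and time strings to integers
                     names_transform = list(Day = as.integer,
                                            Time = as.integer)) %>%
  mutate(Item = paste0("Item", Item)) %>%
  pivot_wider(names_from = Item, values_from = value)

long
```

With integer Day and Time, sorting with arrange(Participant, Day, Time) orders day 10 after day 9 rather than after day 1.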

We can also use extract, with the same pattern as the names_pattern passed to pivot_longer:

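# Without names_to, pivot_longer puts the old column names in a column called "name"
pivot_longer(df, cols = -Participant) %>%
     extract(name, into = c("Day", "Time", "Item"), 
             regex = "D(\\d+)_(\\d+)_(\\d+)") %>%
     # the item columns are named "1", "2", ... here;
     # add names_prefix = "Item" to get Item1, Item2, ...
     pivot_wider(names_from = Item, values_from = value)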

Data

df <- structure(list(Participant = structure(1:5, .Label = c("P1", 
"P2", "P3", "P4", "P5"), class = "factor"), D1_1_1 = c(6L, 4L, 
7L, 1L, 2L), D1_1_2 = c(2L, 9L, 4L, 5L, 0L), D1_1_3 = c(3L, 3L, 
2L, 1L, 8L), D1_1_4 = c(5L, 6L, 8L, 1L, 2L), D2_1_1 = c(1L, 4L, 
1L, 6L, 1L), D2_1_2 = c(2L, 1L, 1L, 7L, 4L)), class = "data.frame", 
row.names = c(NA, -5L))

Answer 2

Ronak's answer works very well, but there is no need to use extract: pivot_longer can already split the column names into several parts:

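library(tidyr)

# Backslashes are doubled for R strings. "\\d+" (one or more digits) is used
# so that two-digit days and items such as D10_4_11 are matched as well.
df %>%
  pivot_longer(cols = -Participant, names_to = c("day", "time", "item"), 
               names_pattern = "(D\\d+)_(\\d+)_(\\d+)") %>%
  pivot_wider(names_from = item, values_from = value, names_prefix = "Item")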
#> # A tibble: 10 x 7
#>    Participant day   time  Item1 Item2 Item3 Item4
#>    <fct>       <chr> <chr> <int> <int> <int> <int>
#>  1 P1          D1    1         6     2     3     5
#>  2 P1          D2    1         1     2    NA    NA
#>  3 P2          D1    1         4     9     3     6
#>  4 P2          D2    1         4     1    NA    NA
#>  5 P3          D1    1         7     4     2     8
#>  6 P3          D2    1         1     1    NA    NA
#>  7 P4          D1    1         1     5     1     1
#>  8 P4          D2    1         6     7    NA    NA
#>  9 P5          D1    1         2     0     8     2
#> 10 P5          D2    1         1     4    NA    NA

Data:

df <- structure(list(Participant = structure(1:5, .Label = c("P1", 
"P2", "P3", "P4", "P5"), class = "factor"), D1_1_1 = c(6L, 4L, 
7L, 1L, 2L), D1_1_2 = c(2L, 9L, 4L, 5L, 0L), D1_1_3 = c(3L, 3L, 
2L, 1L, 8L), D1_1_4 = c(5L, 6L, 8L, 1L, 2L), D2_1_1 = c(1L, 4L, 
1L, 6L, 1L), D2_1_2 = c(2L, 1L, 1L, 7L, 4L)), class = "data.frame", 
row.names = c(NA, -5L))
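The sample data only contain single-digit days and items, whereas the question's full data have 10 days and 11 items. A single \\d matches exactly one digit, so names such as D10_4_11 need \\d+ to be split as intended. Here is a minimal sketch on a hypothetical one-row data frame, df_wide, that is not part of the original data:

```r
library(tidyr)

# Hypothetical one-row example with a two-digit day and a two-digit item
df_wide <- data.frame(Participant = "P1", D1_1_1 = 6L, D10_4_11 = 3L)

long <- pivot_longer(df_wide, cols = -Participant,
                     names_to = c("day", "time", "item"),
                     # "\\d+" matches one or more digits in each part
                     names_pattern = "D(\\d+)_(\\d+)_(\\d+)")

long
```

Both column names are split into their three parts, giving one row for day 1, time 1, item 1 and one for day 10, time 4, item 11.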
