Cast long data format to wide format

I need to convert data from long format to wide format, if possible, under the following conditions:

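1) All data files are in long format with the same structure (id, name, value), but each data file has different variables, different values, and a different number of variables: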

id = case
name = variable
value = variable value(s)

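2) Each data file contains a different mix of variable types (factor, integer, numeric). Some factors can have multiple levels per case (fruit and meat here), and I would like to create a separate logical dummy variable for each level of those factors. The number of factor and numeric variables varies from file to file.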

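3) Because the variables differ from file to file, I would like to automate this so that I can apply the same code to every data file without changing any variable names.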

I have tried reshape2 and tidyr, but could not find a way to do it.

Here is the long format:

    long
   id   name     value
1   1  fruit     apple
2   1  fruit    banana
3   1  fruit    orange
4   1  fruit pineapple
5   1   meat     steak
6   1   meat   chicken
7   1  fname      dave
8   1     wt       185
9   1 status    active
10  2  fruit     apple
11  2  fruit pineapple
12  2   meat   chicken
13  2  fname      jeff
14  2     wt       205
15  2 status    active
16  3  fruit     apple
17  3  fruit    banana
18  3   meat     steak
19  3  fname      jane
20  3     wt       125
21  3 status    lapsed

Here is the wide format I would like:

wide
  id fruit.apple fruit.banana fruit.orange fruit.pineapple meat.steak meat.chicken fname  wt status
1  1        TRUE         TRUE         TRUE            TRUE       TRUE         TRUE  dave 185 active
2  2        TRUE        FALSE        FALSE            TRUE      FALSE         TRUE  jeff 205 active
3  3        TRUE         TRUE        FALSE           FALSE       TRUE        FALSE  jane 125 lapsed

The long-format data:

long <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), name = c("fruit", 
"fruit", "fruit", "fruit", "meat", "meat", "fname", "wt", "status", 
"fruit", "fruit", "meat", "fname", "wt", "status", "fruit", "fruit", 
"meat", "fname", "wt", "status"), value = c("apple", "banana", 
"orange", "pineapple", "steak", "chicken", "dave", "185", "active", 
"apple", "pineapple", "chicken", "jeff", "205", "active", "apple", 
"banana", "steak", "jane", "125", "lapsed")), .Names = c("id", 
"name", "value"), class = "data.frame", row.names = c(NA, -21L
))

A solution using dplyr and tidyr

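library(dplyr)
library(tidyr)

# Note: the multi-level variables ("fruit", "meat") are hard-coded here.
wide <- long %>%
  # For fruit/meat rows, store "1" as the value and build a
  # "name.value" column name (e.g. "fruit.apple"); keep other rows as-is.
  mutate(value2 = ifelse(name %in% c("fruit", "meat"), "1", value),
         name2 = ifelse(name %in% c("fruit", "meat"), 
                       paste(name, value, sep = "."), name)) %>%
  select(-name, -value) %>%
  # One column per name2; missing combinations are filled with "0".
  spread(name2, value2, fill = "0") %>%
  # Convert the "1"/"0" strings to numeric, then to TRUE/FALSE.
  mutate_at(vars(matches("fruit|meat")), as.numeric) %>%
  mutate_at(vars(matches("fruit|meat")), as.logical)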
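
The code above names "fruit" and "meat" explicitly, so it would need editing for each data file. One possible way to meet the automation requirement is sketched below. It rests on an assumption that is not stated in the question: a variable is treated as multi-level whenever at least one id has more than one row for it. The helper name long_to_wide is made up for illustration, and the sketch assumes dplyr >= 1.0 and tidyr >= 1.1 (for across() and pivot_wider() with a scalar values_fill). The example data is rebuilt inside the block so it runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example data as `long` above, rebuilt so this block is self-contained.
long <- data.frame(
  id = c(rep(1L, 9), rep(2L, 6), rep(3L, 6)),
  name = c("fruit", "fruit", "fruit", "fruit", "meat", "meat",
           "fname", "wt", "status",
           "fruit", "fruit", "meat", "fname", "wt", "status",
           "fruit", "fruit", "meat", "fname", "wt", "status"),
  value = c("apple", "banana", "orange", "pineapple", "steak", "chicken",
            "dave", "185", "active",
            "apple", "pineapple", "chicken", "jeff", "205", "active",
            "apple", "banana", "steak", "jane", "125", "lapsed"),
  stringsAsFactors = FALSE
)

# Hypothetical helper; the name and the detection rule are assumptions.
long_to_wide <- function(long) {
  # Assumption: a variable is multi-level if any id has more than one row for it.
  multi <- long %>%
    count(id, name) %>%
    filter(n > 1) %>%
    distinct(name) %>%
    pull(name)
  is_multi <- long$name %in% multi

  # One logical column per level of each multi-level variable.
  dummies <- long[is_multi, ] %>%
    transmute(id, key = paste(name, value, sep = "."), flag = TRUE) %>%
    distinct() %>%
    pivot_wider(names_from = key, values_from = flag, values_fill = FALSE)

  # One column per single-valued variable; types are guessed from the text.
  singles <- long[!is_multi, ] %>%
    pivot_wider(names_from = name, values_from = value) %>%
    mutate(across(-id, ~ type.convert(.x, as.is = TRUE)))

  dummy_cols <- setdiff(names(dummies), "id")

  # Start from every id so cases with no multi-level rows are kept,
  # then mark their missing dummy values as FALSE.
  distinct(long, id) %>%
    left_join(dummies, by = "id") %>%
    left_join(singles, by = "id") %>%
    mutate(across(any_of(dummy_cols), ~ coalesce(.x, FALSE)))
}

wide <- long_to_wide(long)
print(wide)
```

With this rule, a multi-level variable that happens to have at most one value per id in a particular file would be treated as single-valued, so a file like that would still need the variable names supplied by hand.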