R: Use vector as key parameter in tidyr spread function

I'm trying to use the tidyr spread function, except that I want to pass in my own vector of feature names to use as the key parameter.

For example, the default usage is

test<-data.frame(id=c(1,1,2,2),
             feat=c("feat1", "feat2", "feat1", "feat2"),
             value = c(10,20, 1000, 2000))
test %>% spread(key = feat, value = value, fill = 0)
  id feat1 feat2
1  1    10    20
2  2  1000  2000

I would like to pass in my own character vector of feature names as the key, something like this.

featlist<-c("feat1", "feat2", "feat3")
test %>% spread(key = featlist, value = value, fill = 0)
#desired output
  id feat1 feat2 feat3
1  1    10    20     0
2  2  1000  2000     0
#Error output
Error: `var` must evaluate to a single number or a column name, not a character vector
#Trying spread_
test %>% spread_(key = featlist, value = "value", fill = 0)
Error: Only strings can be converted to symbols

Just convert the feat column to a factor whose levels are featlist, then set the drop argument to FALSE, like so:

test<-data.frame(id=c(1,1,2,2),
                 feat=c("feat1", "feat2", "feat1", "feat2"),
                 value = c(10,20, 1000, 2000))

featlist<-c("feat1", "feat2", "feat3")
test$feat <- factor(test$feat, levels = featlist)

test %>% spread(key = feat, value = value, fill = 0, drop = FALSE)

This results in:

  id feat1 feat2 feat3
1  1    10    20     0
2  2  1000  2000     0
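
For readers on a newer tidyr, the same factor-level idea can probably be carried over to pivot_wider, which supersedes spread. The following is only a sketch, assuming tidyr 1.2.0 or later (where the names_expand argument is available); the name wide is made up for illustration.

```r
library(tidyr)

test <- data.frame(id = c(1, 1, 2, 2),
                   feat = c("feat1", "feat2", "feat1", "feat2"),
                   value = c(10, 20, 1000, 2000))

featlist <- c("feat1", "feat2", "feat3")

# As in the answer above: unused levels mark the features that are missing.
test$feat <- factor(test$feat, levels = featlist)

# names_expand = TRUE (assumes tidyr >= 1.2.0) creates a column for every
# factor level, including levels that never occur in the data;
# values_fill = 0 fills the cells that have no observation.
wide <- pivot_wider(test,
                    names_from = feat,
                    values_from = value,
                    values_fill = 0,
                    names_expand = TRUE)

print(wide)
```

Here names_expand plays the role that drop = FALSE plays in spread, so the factor conversion is still the step that carries featlist into the result.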

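Unfortunately, tidyr::spread does not let you supply your own vector as the key. Fortunately, expand.grid does accept your own vector, so you can use it to expand the data.frame to every id/feature combination before calling spread.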

library(tidyverse)
expand.grid(id=unique(test$id), feat = featlist) %>% #creates all combinations
  mutate(feat = as.character(feat)) %>%  
  left_join(test, by=c("id", "feat")) %>%      #Join with actual dataframe
  spread(key=feat, value = value, fill = 0)

#  id feat1 feat2 feat3
#1  1    10    20     0
#2  2  1000  2000     0
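
The expand.grid plus left_join step above can likely be shortened with tidyr::complete, which accepts a named vector of values to fill in. This is a sketch rather than a drop-in replacement: it assumes feat is a character column (as in the data below), and the names filled and wide are made up for illustration.

```r
library(tidyr)

test <- data.frame(id = c(1, 1, 2, 2),
                   feat = c("feat1", "feat2", "feat1", "feat2"),
                   value = c(10, 20, 1000, 2000),
                   stringsAsFactors = FALSE)

featlist <- c("feat1", "feat2", "feat3")

# complete() adds a row for every id/feat combination that is missing,
# taking the candidate feat values from featlist; value is NA on new rows.
filled <- complete(test, id, feat = featlist)

# spread() replaces both the explicit NAs and any absent cells with fill.
wide <- spread(filled, key = feat, value = value, fill = 0)

print(wide)
```

The result is a tibble rather than a plain data.frame, which may matter if later code relies on data.frame-specific behaviour.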

Data:

test<-data.frame(id=c(1,1,2,2),
                 feat=c("feat1", "feat2", "feat1", "feat2"),
                 value = c(10,20, 1000, 2000), stringsAsFactors = FALSE)

featlist<-c("feat1", "feat2", "feat3")