Converting string column to dummy variables with duplicate keys

I'm trying to convert this:

> df.orig <- data.frame(id = c('foo', 'bar', 'foo'), action = c('abc','def','ghi'))
> df.orig
   id action
1 foo    abc
2 bar    def
3 foo    ghi

into this:

> df.new <- data.frame(id = c('foo', 'bar'), action_abc = c(1,0), action_def = c(0,1), action_ghi = c(1,0))
> df.new
   id action_abc action_def action_ghi
1 foo          1          0          1
2 bar          0          1          0

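sparse.model.matrix and dcast don't seem to cope well with a key that appears more than once ('foo'). For example, sparse.model.matrix encodes each row separately instead of combining the rows that share an id: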

> sparse.model.matrix(id ~ action - 1, df.orig)
3 x 3 sparse Matrix of class "dgCMatrix"
  actionabc actiondef actionghi
1         1         .         .
2         .         1         .
3         .         .         1

You can do this with table:

  df <- data.frame(id = c('foo', 'bar', 'foo'), action = c('abc', 'def', 'ghi'), stringsAsFactors = FALSE)

  table(df$id,df$action)

      abc def ghi
  bar   0   1   0
  foo   1   0   1
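
If you need a data frame shaped like df.new rather than a table object, here is a minimal sketch of one way to get there. It assumes as.data.frame.matrix is acceptable for keeping the wide layout; the name wide is only illustrative. Rows come out in sorted id order (bar, foo), not in order of first appearance.

```r
df <- data.frame(id = c('foo', 'bar', 'foo'),
                 action = c('abc', 'def', 'ghi'),
                 stringsAsFactors = FALSE)

# Cross-tabulate id against action; each cell is an occurrence count
tab <- table(df$id, df$action)

# as.data.frame.matrix keeps one column per action
# (plain as.data.frame would return long format instead)
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0("action_", names(wide))

# Move the row names into a regular id column
df.new <- data.frame(id = rownames(wide), wide,
                     row.names = NULL, check.names = FALSE,
                     stringsAsFactors = FALSE)

# Optional: if an id/action pair can repeat, cap the counts at 1
# df.new[-1] <- +(df.new[-1] > 0)
```

The commented-out last line only matters when the same id/action pair occurs more than once, since table counts occurrences rather than flagging presence.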