R: can I reshape and prefix with a value?

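If I have a tall table with a form column and 2 variables, x1 and x2, how can I cast the table wide so that the wide column names combine the table values (form) with the variable names?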

tall <- data.frame( form=letters[1:3], x1=11:13, x2=21:23 )

So I have:

  form x1 x2
1    a 11 21
2    b 12 22
3    c 13 23

I want:

a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
11   21   12   22   13   23

This seems to have an aspect of dcast for the widening, and an aspect of a tidy function for pasting the variable names together.

You can split the data frame by form, drop the first column, and then unlist:

unlist(lapply(split(tall, tall$form), `[`, -1))

a.x1 a.x2 b.x1 b.x2 c.x1 c.x2 
  11   21   12   22   13   23 
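If a one-row data frame is wanted rather than a named vector, one possible base R follow-up is to transpose the vector before converting it. This is only a sketch on the sample data from the question; the names `v` and `wide_df` are made up here for illustration.

```r
tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

# Named vector, as produced by the split/unlist call
v <- unlist(lapply(split(tall, tall$form), `[`, -1))

# t() turns the named vector into a 1 x 6 matrix whose column names are
# the vector names; as.data.frame() then gives a one-row data frame
wide_df <- as.data.frame(t(v))
names(wide_df)
```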

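Note, though, that the unlist() call returns a named vector and assumes there is only one row per form. If that is not the case and you want a data frame, you can do: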

library(tidyr)
library(dplyr)

tall %>%
  pivot_wider(names_from = form, values_from = c(x1, x2), values_fn = list(x1 = list, x2 = list)) %>%
  unnest(cols = everything()) %>%
  rename_all(~ gsub("^(.*)_(.*)$", "\\2.\\1", .x))

A simple solution is to use reshape, i.e.,

r <- reshape(cbind(id = 1,tall),
             direction = "wide",
             idvar = "id",
             timevar = "form")[-1]

such that

> r
  x1.a x2.a x1.b x2.b x1.c x2.c
1   11   21   12   22   13   23

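Edit

If you do care about the column names, you can take a look at the following code:

  • Using reshape + setNames, i.e.,
r <- reshape(cbind(id = 1,tall),
             direction = "wide",
             idvar = "id",
             timevar = "form")[-1]
# rename in a separate step so that names(r) refers to the reshaped result
r <- setNames(r, gsub("(.*)\\.(.*)","\\2.\\1",names(r)))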

such that

> r
  a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
1   11   21   12   22   13   23
  • Or using outer
r <- setNames(c(t(tall[-1])),
              c(t(outer(tall$form,names(tall[-1]),paste,sep = "."))))

such that

> r
 a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
   11   21   12   22   13   23
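To see why the outer approach needs the t() calls, it may help to look at the intermediate name matrix. The sketch below uses the sample data from the question; `nm` is a name introduced here only for illustration.

```r
tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

# 3 x 2 character matrix: one row per form, one column per variable
nm <- outer(tall$form, names(tall)[-1], paste, sep = ".")

# c() flattens a matrix column by column, which groups by variable
by_variable <- c(nm)

# Transposing first makes c() walk the original rows, which groups by form
by_form <- c(t(nm))
by_form
```

The same reasoning applies to the values: c(t(tall[-1])) reads the data row by row, so names and values stay aligned.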

Another solution with tidyr::pivot_wider:

wide <- tall %>%
  pivot_wider(names_from = form, values_from = c('x1', 'x2'), names_sep = '.')
   x1.a  x1.b  x1.c  x2.a  x2.b  x2.c
  <int> <int> <int> <int> <int> <int>
1    11    12    13    21    22    23

To fix the column names, I came up with this (not elegant, but it works for this example):

library(stringr)

names(wide) <- paste0(str_extract(pattern = '[A-Za-z]?$', string = names(wide)), '.', str_extract(pattern = '^[:alnum:]*', string = names(wide)))

# plus arranging columns:
wide <- wide %>%
  select(starts_with(c('a', 'b', 'c')))
# A tibble: 1 x 6
   a.x1  a.x2  b.x1  b.x2  c.x1  c.x2
  <int> <int> <int> <int> <int> <int>
1    11    21    12    22    13    23
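As a possible alternative to renaming and reordering afterwards, pivot_wider can build the names itself through its names_glue argument and control the column order through names_vary. This is a sketch rather than part of the answer above, and it assumes a reasonably recent tidyr (names_vary needs version 1.2.0 or later).

```r
library(tidyr)

tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

wide <- pivot_wider(
  tall,
  names_from  = form,
  values_from = c(x1, x2),
  # {.value} stands for the name of the values_from column (x1 or x2)
  names_glue  = "{form}.{.value}",
  # "slowest" keeps all columns for one form next to each other
  names_vary  = "slowest"
)
names(wide)
```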

A data.table option using dcast:

library(data.table)

dcast(setDT(tall), rowid(form)~form, value.var = c('x1', 'x2'), sep = '.')[,form := NULL][]

#   x1.a x1.b x1.c x2.a x2.b x2.c
#1:   11   12   13   21   22   23
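The dcast result still has names of the form x1.a. One way to get the a.x1 layout from the question, sketched here under the assumption that a regex swap plus an alphabetical sort is good enough (it is for this small example, but may not group correctly for arbitrary form or variable names), is to rename and reorder by reference:

```r
library(data.table)

tall <- data.table(form = letters[1:3], x1 = 11:13, x2 = 21:23)

wide <- dcast(tall, rowid(form) ~ form,
              value.var = c("x1", "x2"), sep = ".")[, form := NULL]

# Swap the two halves of each name: "x1.a" becomes "a.x1"
setnames(wide, sub("^(.*)\\.(.*)$", "\\2.\\1", names(wide)))

# Reorder columns alphabetically so that each form's columns sit together
setcolorder(wide, sort(names(wide)))
names(wide)
```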