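R: can I reshape and prefix with a value?
If I have a table that is tall by form, with 2 variables x1 and x2, how can I go wide by form, so that the wide names are a combination of the form value and the variable name?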
tall <- data.frame( form=letters[1:3], x1=11:13, x2=21:23 )
So I have:
form x1 x2
1 a 11 21
2 b 12 22
3 c 13 23
I want:
a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
11 21 12 22 13 23
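This seems to have aspects of dcast, to spread the data, and aspects of the tidy functions, to splice the variable names.
You can split the data frame by form, drop the first column, and then unlist: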
unlist(lapply(split(tall, tall$form), `[`, -1))
a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
11 21 12 22 13 23
Note that the above returns a named vector and assumes there is only one value per form. If that is not the case and you want a data frame, you can do:
library(tidyr)
library(dplyr)
tall %>%
  pivot_wider(names_from = form, values_from = c(x1, x2), values_fn = list(x1 = list, x2 = list)) %>%
  unnest(cols = everything()) %>%
  # pivot_wider names the columns x1_a, x1_b, ...; swap the two parts to get a.x1, b.x1, ...
  rename_all(~ gsub("^(.*)_(.*)$", "\\2.\\1", .x))
A simple solution is to use reshape, i.e.,
r <- reshape(cbind(id = 1,tall),
direction = "wide",
idvar = "id",
timevar = "form")[-1]
such that
> r
x1.a x2.a x1.b x2.b x1.c x2.c
1 11 21 12 22 13 23
Edit
If you do care about the column names, you can look at the following code:
- Using reshape + setNames, i.e.,
r <- reshape(cbind(id = 1, tall),
             direction = "wide",
             idvar = "id",
             timevar = "form")[-1]
# swap the parts around the dot: "x1.a" becomes "a.x1"
r <- setNames(r, gsub("(.*)\\.(.*)", "\\2.\\1", names(r)))
such that
> r
a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
1 11 21 12 22 13 23
- Or using outer
r <- setNames(c(t(tall[-1])),
c(t(outer(tall$form,names(tall[-1]),paste,sep = "."))))
such that
> r
a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
11 21 12 22 13 23
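If a one-row data frame is wanted rather than a named vector, the outer result can be converted in base R. This is only a minimal sketch: it rebuilds the example data so it runs on its own, and wide_df is an illustrative name, not something from the answer above.

```r
# Example data from the question
tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

# Named vector built as in the outer approach
r <- setNames(c(t(tall[-1])),
              c(t(outer(tall$form, names(tall[-1]), paste, sep = "."))))

# t() turns the named vector into a 1-row matrix whose column names are
# the vector names; as.data.frame() then gives a 1-row data frame
wide_df <- as.data.frame(t(r))
wide_df
```

Going through t() keeps the names as they are, whereas data.frame() would run them through make.names() by default, which matters if a form value is not a syntactically valid name.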
Another solution with tidyr::pivot_wider:
library(tidyr)
library(dplyr)
wide <- tall %>%
  pivot_wider(names_from = form, values_from = c('x1', 'x2'), names_sep = '.')
x1.a x1.b x1.c x2.a x2.b x2.c
<int> <int> <int> <int> <int> <int>
1 11 12 13 21 22 23
To fix the column names, I came up with this (not elegant, but it works for this example):
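library(stringr)
# trailing letters are the form value, leading alphanumerics are the variable name
names(wide) <- paste0(str_extract(pattern = '[A-Za-z]+$', string = names(wide)), '.', str_extract(pattern = '^[[:alnum:]]+', string = names(wide)))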
# plus arranging columns:
wide <- wide %>%
select(starts_with(c('a', 'b', 'c')))
# A tibble: 1 x 6
a.x1 a.x2 b.x1 b.x2 c.x1 c.x2
<int> <int> <int> <int> <int> <int>
1 11 21 12 22 13 23
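The renaming and reordering steps above might be avoidable by letting pivot_wider build the names itself. The sketch below uses the names_glue and names_vary arguments of pivot_wider; names_vary is only available in more recent tidyr releases (1.2.0 or later, as far as I know), and wide_glue is just an illustrative name.

```r
library(tidyr)

# Example data from the question
tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

wide_glue <- pivot_wider(
  tall,
  names_from  = form,
  values_from = c(x1, x2),
  # build each name as <form value>.<variable name>
  names_glue  = "{form}.{.value}",
  # keep the columns of one form value next to each other
  names_vary  = "slowest"
)
wide_glue
```

With names_vary = "slowest" the columns should come out grouped by form value, so no select step is needed afterwards.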
A data.table option using dcast:
library(data.table)
dcast(setDT(tall), rowid(form)~form, value.var = c('x1', 'x2'), sep = '.')[,form := NULL][]
# x1.a x1.b x1.c x2.a x2.b x2.c
#1: 11 12 13 21 22 23
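The dcast result still has names of the form x1.a rather than a.x1. One possible follow-up, sketched here under the assumption that the names contain a single dot, is to swap the parts with setnames and then reorder with setcolorder. wide_dt is an illustrative name, and as.data.table is used instead of setDT so that tall is not modified by reference.

```r
library(data.table)

# Example data from the question
tall <- data.frame(form = letters[1:3], x1 = 11:13, x2 = 21:23)

# Same dcast call as above, on a data.table copy of tall
wide_dt <- dcast(as.data.table(tall), rowid(form) ~ form,
                 value.var = c("x1", "x2"), sep = ".")
wide_dt[, form := NULL]

# swap "x1.a" into "a.x1", then sort the columns by their new names
setnames(wide_dt, sub("^(.*)\\.(.*)$", "\\2.\\1", names(wide_dt)))
setcolorder(wide_dt, sort(names(wide_dt)))
wide_dt[]
```

Sorting by name is enough for this example because the form values are single letters; other data may need an explicit column order.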