Reshaping a data table to make column names into row names
I have a data.table in R:
> dt
SAMPLE junction count
1: R1 a 1
2: R2 a 1
3: R3 b 1
4: R3 a 1
5: R1 c 2
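Now I want to "reshape" the data table into a data frame m (basically a junction-by-SAMPLE matrix whose entries are the corresponding count values). Also, note that for (SAMPLE, junction) pairs that do not exist in dt, I assume the corresponding count value is zero.
Could someone help me achieve this?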
> m
R1 R2 R3
a 1 1 1
b 0 0 1
c 2 0 0
dcast from data.table converts a dataset from 'long' to 'wide' format.
library(data.table)#v1.9.5+
dcast(dt, junction~SAMPLE, value.var='count', fill=0)
# junction R1 R2 R3
#1: a 1 1 1
#2: b 0 0 1
#3: c 2 0 0
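The question asks for a data frame with the junctions as row names, while a data.table does not carry row names. A minimal sketch of one way to get there, assuming the dcast result is small enough to copy; the names wide and m are only illustrative:

```r
library(data.table)

# Example data from the question
dt <- data.table(SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
                 junction = c("a", "a", "b", "a", "c"),
                 count    = c(1, 1, 1, 1, 2))

# Long to wide, filling missing (SAMPLE, junction) pairs with 0
wide <- dcast(dt, junction ~ SAMPLE, value.var = "count", fill = 0)

# Copy into a plain data.frame and move 'junction' into the row names
m <- as.data.frame(wide)
rownames(m) <- m$junction
m$junction <- NULL
m
```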
If you need matrix output:
library(reshape2)
acast(dt, junction~SAMPLE, value.var='count', fill=0)
# R1 R2 R3
#a 1 1 1
#b 0 0 1
#c 2 0 0
Or xtabs from base R:
xtabs(count~junction+SAMPLE, dt)
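xtabs returns a contingency table rather than a data frame, with absent pairs counted as 0. If a data frame like m is wanted, one possible sketch (base R only, using a plain data.frame copy of the example data; tab and m are illustrative names) is:

```r
# Example data from the question as a plain data.frame
df <- data.frame(SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
                 junction = c("a", "a", "b", "a", "c"),
                 count    = c(1, 1, 1, 1, 2))

# Cross-tabulate: sums 'count' for each junction/SAMPLE cell
tab <- xtabs(count ~ junction + SAMPLE, data = df)

# Keep the two-way layout while converting to a data.frame
# (plain as.data.frame() would give the long format back)
m <- as.data.frame.matrix(tab)
m
```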
An alternative using spread from tidyr:
library(tidyr)
spread(dt, SAMPLE, count, fill=0)
# junction R1 R2 R3
#1: a 1 1 1
#2: b 0 0 1
#3: c 2 0 0
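In tidyr 1.0.0 and later, spread is superseded by pivot_wider. A rough equivalent, assuming tidyr >= 1.0.0 is installed and using a plain data.frame copy of the example data (the name wide is illustrative):

```r
library(tidyr)

# Example data from the question as a plain data.frame
df <- data.frame(SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
                 junction = c("a", "a", "b", "a", "c"),
                 count    = c(1, 1, 1, 1, 2))

# One column per SAMPLE; missing pairs are filled with 0.
# The named-list form of values_fill is accepted by tidyr 1.0.0 and later.
wide <- pivot_wider(df,
                    names_from  = SAMPLE,
                    values_from = count,
                    values_fill = list(count = 0))
wide
```

The result is a tibble with junction kept as an ordinary column, not as row names.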
Or an old-school solution using reshape from stats (note that missing pairs come out as NA rather than 0):
reshape(dt, timevar='SAMPLE', idvar=c('junction'), direction='wide')
# junction count.R1 count.R2 count.R3
#1: a 1 1 1
#2: b NA NA 1
#3: c 2 NA NA
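Since reshape has no fill argument, the NA cells have to be set to zero afterwards to match the requested output. A small sketch of that clean-up, on a plain data.frame copy of the example data (wide is an illustrative name):

```r
# Example data from the question as a plain data.frame
df <- data.frame(SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
                 junction = c("a", "a", "b", "a", "c"),
                 count    = c(1, 1, 1, 1, 2))

wide <- reshape(df, timevar = "SAMPLE", idvar = "junction", direction = "wide")

# Missing (SAMPLE, junction) pairs are NA after reshape; treat them as 0
wide[is.na(wide)] <- 0

# Optional: drop the "count." prefix that reshape adds to the new columns
names(wide) <- sub("^count\\.", "", names(wide))
wide
```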
Data:
library(data.table)
# Rebuild the example table directly: the dput() output of a data.table
# includes an external pointer that cannot be parsed back in
dt <- data.table(SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
                 junction = c("a", "a", "b", "a", "c"),
                 count    = c(1, 1, 1, 1, 2))