Grouping key/value columns into single rows
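I'm trying to take key/value combinations and put all the values on the same row as their key. I'm pretty sure I once knew how to do this (with data.table, I think), and I've gone through the usual suspects (reshape2, tidyr, data.table, etc.), but I can't seem to come up with a simple solution.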
key1 = c(1,1,1,1,2,2,2,2)
key2 = c("A","A","B","B","C","C","D","D")
value = c("a","b","c","d","e","f","g","h")
kvframe = data.frame(key1,key2,value)
# key1 key2 value
#1 1 A a
#2 1 A b
#3 1 B c
#4 1 B d
#5 2 C e
#6 2 C f
#7 2 D g
#8 2 D h
This is what I'd like the table to look like:
# key1 key2 value1 value2
# 1 A a b
# 1 B c d
# 2 C e f
# 2 D g h
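Most key1/key2 pairs have the same number of corresponding values, but not all of them do. I'd like a solution where the number of value columns equals the maximum number of values for any key pair, with pairs that have fewer values padded with NA.
You need a sequence column within each 'key1/key2' group.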
library(data.table) # v1.9.5+
setDT(kvframe)[, Seq := paste0('value', 1:.N), by = .(key1, key2)] # generate Seq
dcast(kvframe, key1 + key2 ~Seq, value.var = 'value') # cast from long to wide
# key1 key2 value1 value2
#1: 1 A a b
#2: 1 B c d
#3: 2 C e f
#4: 2 D g h
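The question also asks for NA padding when a key pair has fewer values than the others. As a minimal sketch of how the same approach behaves in that case, assuming a made-up unbalanced input (the `kv` table below is hypothetical, with only one value for the 2/D pair), something like this could be tried:

```r
library(data.table)

# Hypothetical unbalanced input: the (2, "D") pair has a single value
kv <- data.table(
  key1  = c(1, 1, 1, 1, 2, 2, 2),
  key2  = c("A", "A", "B", "B", "C", "C", "D"),
  value = c("a", "b", "c", "d", "e", "f", "g")
)

# Number the values within each key1/key2 group: value1, value2, ...
kv[, Seq := paste0("value", seq_len(.N)), by = .(key1, key2)]

# Cast from long to wide; key pairs with no entry for a given Seq get NA
wide <- dcast(kv, key1 + key2 ~ Seq, value.var = "value")
print(wide)
```

Here `value2` should come out as NA for the 2/D row. If a group can have ten or more values, the generated column names may sort as text (value1, value10, value2, ...), so zero-padding the sequence number might be worth considering.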
Or use reshape from base R:
d1 <- transform(kvframe, Seq=ave(seq_along(value),
key1, key2, FUN=seq_along))
reshape(d1, idvar=c('key1', 'key2'), timevar='Seq', direction='wide')
# key1 key2 value.1 value.2
#1 1 A a b
#3 1 B c d
#5 2 C e f
#7 2 D g h
Or with tidyr:
library(tidyr)
spread(d1, Seq, value) # value columns are named after Seq: `1`, `2`
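In current tidyr releases `spread()` is superseded by `pivot_wider()`. A rough equivalent, assuming dplyr is also available and again using a hypothetical unbalanced input (`kv` is not from the original post), might look like this:

```r
library(dplyr)
library(tidyr)

# Hypothetical unbalanced input: the (2, "D") pair has a single value
kv <- data.frame(
  key1  = c(1, 1, 1, 1, 2, 2, 2),
  key2  = c("A", "A", "B", "B", "C", "C", "D"),
  value = c("a", "b", "c", "d", "e", "f", "g")
)

wide <- kv %>%
  group_by(key1, key2) %>%
  mutate(Seq = paste0("value", row_number())) %>%  # value1, value2, ... per group
  ungroup() %>%
  pivot_wider(names_from = Seq, values_from = value)  # missing cells become NA

print(wide)
```

The result is a tibble with one row per key1/key2 pair; the 2/D row should have NA in `value2`.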