Melting/Splitting a row into two rows, using two column values in the original row, leaving the rest intact
I have a data.table as follows:
DT <- fread(
"ID country year Event_A Event_B
4 NLD 2002 0 1
5 NLD 2002 0 1
6 NLD 2006 1 1
7 NLD 2006 1 0
8 NLD 2006 1 1
9 GBR 2002 0 1
10 GBR 2002 0 0
11 GBR 2002 0 1
12 GBR 2006 1 1
13 GBR 2006 1 1",
header = TRUE)
I want to cast the event columns onto rows without summing them, so that each original row becomes two rows. I tried:
meltedsessions <- melt(Exp, id.vars = -c("Event_A", "Event_B"), measure.vars = c("Event_A", "Event_B"))
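I need to specify id.vars as a negative selection, because the actual dataset has another 240 variables that need to stay intact. However, when I do this, I get the error message: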
Error in melt.data.table(Exp, id.vars = c("ID", "country", "year"), measure.vars = c("Event_A", :
One or more values in 'id.vars' is invalid.
How can I solve this?
Desired output:
DT <- fread(
"NewID ID country year Event
1 4 NLD 2002 0
2 4 NLD 2002 1
3 5 NLD 2002 0
4 5 NLD 2002 1
5 6 NLD 2006 1
6 6 NLD 2006 1
7 7 NLD 2006 1
8 7 NLD 2006 0
9 8 NLD 2006 1
10 8 NLD 2006 0
11 9 GBR 2002 1
12 9 GBR 2002 1
13 10 GBR 2002 0
14 10 GBR 2002 0
15 11 GBR 2002 0
16 12 GBR 2002 1
17 13 GBR 2006 1
18 14 GBR 2006 1
19 15 GBR 2006 1
20 16 GBR 2006 1",
header = TRUE)
Instead of the - in id.vars, we can use setdiff to name every column except the two event columns:
library(data.table)
# id.vars = every column except the two event columns
melt(DT, id.vars = setdiff(names(DT), c("Event_A", "Event_B")),
     value.name = 'Event')[, variable := NULL][order(ID)]
# ID country year Event
# 1: 4 NLD 2002 0
# 2: 4 NLD 2002 1
# 3: 5 NLD 2002 0
# 4: 5 NLD 2002 1
# 5: 6 NLD 2006 1
# 6: 6 NLD 2006 1
# 7: 7 NLD 2006 1
# 8: 7 NLD 2006 0
# 9: 8 NLD 2006 1
#10: 8 NLD 2006 1
#11: 9 GBR 2002 0
#12: 9 GBR 2002 1
#13: 10 GBR 2002 0
#14: 10 GBR 2002 0
#15: 11 GBR 2002 0
#16: 11 GBR 2002 1
#17: 12 GBR 2006 1
#18: 12 GBR 2006 1
#19: 13 GBR 2006 1
#20: 13 GBR 2006 1
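The desired output also has a NewID column, which the call above does not create. The following is a minimal sketch of one way to add it, not the only approach. It assumes that NewID is simply a running row number, and it relies on melt() for data.table treating all non-measure columns as id columns when id.vars is left out, so the 240 other variables never need to be listed. The name long is just an illustrative variable.

```r
library(data.table)

DT <- fread(
"ID country year Event_A Event_B
4 NLD 2002 0 1
5 NLD 2002 0 1
6 NLD 2006 1 1
7 NLD 2006 1 0
8 NLD 2006 1 1
9 GBR 2002 0 1
10 GBR 2002 0 0
11 GBR 2002 0 1
12 GBR 2006 1 1
13 GBR 2006 1 1",
header = TRUE)

# Only measure.vars is given, so every other column is kept as an id column.
long <- melt(DT, measure.vars = c("Event_A", "Event_B"), value.name = "Event")

# Put the Event_A row before the Event_B row within each ID,
# then drop the label column that melt() added.
setorder(long, ID, variable)
long[, variable := NULL]

# Assumption: NewID is just the row number of the reshaped table.
long[, NewID := .I]
setcolorder(long, "NewID")

print(long)
```

If the event columns share a prefix, measure.vars could probably also be built with grep("^Event_", names(DT), value = TRUE) instead of typing the names out.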