Copy data from one row to a new row in R data.table
I have a table like this:
library(data.table)
dt <- data.table(t0.id=1:2,t0.V1=c("a","e"),t0.V2=c("b","f"),t1.id=3:4,t1.V1=c("c","g"),t1.V2=c("d","h"))
dt
t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
1: 1 a b 3 c d
2: 2 e f 4 g h
and I would like to move part of the data in the first row into a new row, like this:
t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
1: 1 a b
2: 3 c d
3: 2 e f 4 g h
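I know how to duplicate the row, but I don't know how to clear columns by condition (for example t0.id==1), since the two rows would be identical.
I suppose this could be done by row index, but my real table has thousands of rows and I don't think that is the best approach.
Thanks
Edit:
- The final order of the rows does not matter, i.e. rows 1 and 2 of the result do not need to be next to each other.
- I determine 'manually' (by looking at some variables) which rows need to be split. So the only condition to apply is based on 't0.id'.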
library(data.table)
splitids <- 1L # t0.id
out <- rbindlist(list(
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t0")],
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t1")],
dt[!t0.id %in% splitids,]),
use.names = TRUE, fill = TRUE)
out
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 1 a b NA <NA> <NA>
# 2: NA <NA> <NA> 3 c d
# 3: 2 e f 4 g h
It may make more sense if you look at the pieces one at a time:
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t0")]
# t0.id t0.V1 t0.V2
# <int> <char> <char>
# 1: 1 a b
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t1")]
# t1.id t1.V1 t1.V2
# <int> <char> <char>
# 1: 3 c d
dt[!t0.id %in% splitids,]
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 2 e f 4 g h
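The three pieces above can also be wrapped in a small function so the same split works for any set of ids. This is only a sketch: `split_rows`, `idcol` and `prefixes` are made-up names for illustration, not part of data.table, and it assumes the column groups can be told apart by a name prefix as in the example.

```r
library(data.table)

# Hypothetical helper (not a data.table function): rows whose `idcol` value is
# in `ids` are split into one row per column group; all other rows are kept whole.
split_rows <- function(dt, ids, idcol = "t0.id", prefixes = c("^t0", "^t1")) {
  hit  <- dt[[idcol]] %in% ids
  keep <- !hit
  # one piece per prefix, containing only the matching columns of the hit rows
  pieces <- lapply(prefixes, function(p) {
    cols <- grep(p, names(dt), value = TRUE)
    dt[hit, cols, with = FALSE]
  })
  # fill = TRUE pads the columns missing from each piece with NA
  rbindlist(c(pieces, list(dt[keep])), use.names = TRUE, fill = TRUE)
}

dt <- data.table(t0.id = 1:2, t0.V1 = c("a", "e"), t0.V2 = c("b", "f"),
                 t1.id = 3:4, t1.V1 = c("c", "g"), t1.V2 = c("d", "h"))
out2 <- split_rows(dt, ids = 1L)
```

Passing several ids (e.g. `ids = c(1L, 2L)`) splits each matching row the same way. The split rows come first in the result, which is fine here since row order does not matter.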
If you need blank "" instead of NA, that can be done for the character columns but not for the t*.id columns, since that would convert them from integer to character.
ischr <- which(sapply(dt, inherits, "character"))
ischr
# t0.V1 t0.V2 t1.V1 t1.V2
# 2 3 5 6
out[, (ischr) := lapply(.SD, fcoalesce, ""), .SDcols = ischr][]
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 1 a b NA
# 2: NA 3 c d
# 3: 2 e f 4 g h
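If the blanks are only wanted for display or export, one option is to leave `out` alone and blank a copy in which every column has been converted to character. This is a sketch under that assumption; `shown` is a name made up for illustration, and `out` is rebuilt by hand here only to keep the snippet self-contained.

```r
library(data.table)

# Same content as `out` above, typed in directly
out <- data.table(
  t0.id = c(1L, NA, 2L), t0.V1 = c("a", NA, "e"), t0.V2 = c("b", NA, "f"),
  t1.id = c(NA, 3L, 4L), t1.V1 = c(NA, "c", "g"), t1.V2 = c(NA, "d", "h")
)

# Work on a copy so the integer id columns in `out` are untouched
shown <- copy(out)
allcols <- names(shown)
# Convert every column to character first, then replace NA with ""
shown[, (allcols) := lapply(.SD, function(x) fcoalesce(as.character(x), "")),
      .SDcols = allcols]
```

The id columns in `shown` are now character, so `out` is still the one to use for any further joins or numeric work.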