Adding columns to inner join (data.table)

Question

I am adding columns to an inner join, but I don't understand the result.

Consider tables A and B:

A <- data.table(id=c(1,2,3), x_val = c("x1", "x2", "x3"))

#    id x_val
# 1:  1    x1
# 2:  2    x2
# 3:  3    x3

B <- data.table(id=c(1,2,4), y_val = c("y1", "y2", "y3"))

#    id y_val
# 1:  1    y1
# 2:  2    y2
# 3:  4    y3

Now consider these joins. The first two make complete sense.

A[B, on=.(id)]

# rows=3 This join is what I expect. The last row of B is included, but it has no match in A.

#       id  x_val  y_val
#    <num> <char> <char>
# 1:     1     x1     y1
# 2:     2     x2     y2
# 3:     4   <NA>     y3
#

A[B, on=.(id), nomatch=NULL]

# rows=2 To remove the unmatched row, use nomatch=NULL (i.e. an inner join)

#       id  x_val  y_val
#    <num> <char> <char>
# 1:     1     x1     y1
# 2:     2     x2     y2

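Now the surprise.
With the rows sorted out, focus on the columns:
In the cases below, the columns and labels are all as expected, but the number of rows is not. I expected 2 rows in each case.

What am I missing?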

A[B, .(A.id = A$id), on=.(id), nomatch=NULL]

#     A.id
#    <num>
# 1:     1
# 2:     2
# 3:     3
A[B, .(B$id), on=.(id), nomatch=NULL]
#       V1
#    <num>
# 1:     1
# 2:     2
# 3:     4
A[B, .(A.id = A$id, B.id= B$id,  A.x_val = A$x_val, B$y_val), on=.(id), nomatch=NULL]
#     A.id  B.id A.x_val     V4
#    <num> <num>  <char> <char>
# 1:     1     1      x1     y1
# 2:     2     2      x2     y2
# 3:     3     4      x3     y3

Answer 1

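To select columns while joining two data.tables (as you are doing), you should not use the dollar sign. Instead, in the join A[B], prefix column names from A with x. and column names from B with i. (see below).
What you are missing is that, in your examples, you are selecting columns from the original datasets (which have 3 rows), not from the (inner) joined dataset, which has 2 rows. An expression such as A$id is ordinary R: it fetches the whole column from the table A, regardless of the join.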

A[B, .(A.id = x.id), on=.(id), nomatch=NULL]  # prefixing id with x. selects id from A
#     A.id
# 1:     1
# 2:     2
A[B, .(i.id), on=.(id), nomatch=NULL]         # prefixing id with i. selects id from B
#     i.id
# 1:     1
# 2:     2
A[B, .(A.id = x.id, B.id= i.id,  A.x_val = x.x_val, i.y_val), on=.(id), nomatch=NULL]
#     A.id  B.id A.x_val i.y_val
# 1:     1     1      x1      y1
# 2:     2     2      x2      y2
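As a quick check of the explanation above, here is a minimal sketch that compares the two forms side by side. It reuses the tables A and B from the question; the variable names `dollar` and `prefixed` are only illustrative and not part of the original post.

```r
library(data.table)

# Tables from the question
A <- data.table(id = c(1, 2, 3), x_val = c("x1", "x2", "x3"))
B <- data.table(id = c(1, 2, 4), y_val = c("y1", "y2", "y3"))

# A$id is looked up as a plain R expression, so j returns the full
# 3-row column of A no matter which rows the join matched
dollar <- A[B, .(A.id = A$id), on = .(id), nomatch = NULL]

# x.id refers to the id column of A within the join, so only the
# matched rows come back
prefixed <- A[B, .(A.id = x.id), on = .(id), nomatch = NULL]

print(identical(dollar$A.id, A$id))  # TRUE: just the original column
print(nrow(prefixed))                # 2: the inner join rows
```

If the goal is only to rename or reorder columns, another option might be to do the join first and then select from its result in a second step, since the joined table already has the 2 rows you expect.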
