Match 2 data frames based on common rows while preserving the row order

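I need to modify the data frame "DF1" by matching its 1st (and only) column against the 2nd column of "DF2", and print the matched column in DF1 while preserving the order of its rows. I also need to replace the unmatched rows with 0. Here are two examples of the data frames I have: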

“DF1”

Ccd
Kkl
Sop
Mnn
Msg
Xxy
Zxz
Ccd
Msg

“DF2”

3   Ab
5   Abc
5   Ccd
9   Kkl
5   Msg
13  Sop
19  Klj

Code

DF1 <- read.table("a.txt")
DF2 <- read.table("b.txt")
colnames(DF1) <- c("b")
colnames(DF2) <- c("a", "b")
DF3 <- merge(DF1, DF2, by = "b", all.x = TRUE)  # left join: keep every row of DF1
DF3$a[is.na(DF3$a)] <- 0  # substitute NA with 0

The output I get from the code above is:

 b  a
Ccd  5
Ccd  5
Kkl  9
Mnn  0
Msg  5
Msg  5
Sop 13
Xxy  0
Zxz  0

The output I actually need is:

Ccd  5
Kkl  9
Sop  13
Mnn  0
Msg  5
Xxy  0
Zxz  0
Ccd  5
Msg  5

With data.table, you can do this:

library(data.table)
setDT(df2)[setDT(df1), on = "b"][is.na(a), a := 0][]

Output:

    a   b
1:  5 Ccd
2:  9 Kkl
3: 13 Sop
4:  0 Mnn
5:  5 Msg
6:  0 Xxy
7:  0 Zxz
8:  5 Ccd
9:  5 Msg

dplyr:

library(dplyr)
left_join(df1, df2, by = "b") %>% mutate(a = if_else(is.na(a), 0, as.double(a)))

Output:

     b  a
1: Ccd  5
2: Kkl  9
3: Sop 13
4: Mnn  0
5: Msg  5
6: Xxy  0
7: Zxz  0
8: Ccd  5
9: Msg  5
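
base R (a possible alternative, not part of the approaches above): a lookup with match() should also keep df1's row order, because match() returns one position per element of its first argument, in that order. This is only a minimal sketch. It assumes each key appears at most once in df2$b (match() returns only the first hit), and the name `res` is purely illustrative.

```r
# Same data as in the "Input" section, rebuilt here so the sketch is self-contained.
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy",
                        "Zxz", "Ccd", "Msg"), stringsAsFactors = FALSE)
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"),
                  stringsAsFactors = FALSE)

# For every row of df1, find the position of its key in df2 (NA if absent).
idx <- match(df1$b, df2$b)

# Look up the values by position; rows stay in df1's original order.
res <- data.frame(b = df1$b, a = df2$a[idx], stringsAsFactors = FALSE)

# Assumption from the question: unmatched keys should become 0.
res$a[is.na(res$a)] <- 0L

print(res)
```

Unlike a join, this never adds rows, so it only fits the case where df2 has no duplicate keys.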

Input:

df1 <- structure(list(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", 
"Zxz", "Ccd", "Msg")), row.names = c(NA, -9L), class = "data.frame")

df2 <- structure(list(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L), b = c("Ab", 
"Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj")), row.names = c(NA, 
-7L), class = "data.frame")
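
If staying with merge() is preferred, one common workaround is to record the original row position before merging and sort by it afterwards. By default merge() sorts the result on the `by` columns, and with sort = FALSE the row order is documented as unspecified, so neither setting keeps the order by itself. The sketch below is hedged: the helper column `id` and the names `left` and `merged` are made up for illustration.

```r
# Same input as above, rebuilt so the sketch runs on its own.
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy",
                        "Zxz", "Ccd", "Msg"), stringsAsFactors = FALSE)
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"),
                  stringsAsFactors = FALSE)

# Work on a copy and remember each row's original position.
left <- df1
left$id <- seq_len(nrow(left))

# Left join; merge() reorders the rows by the key column "b".
merged <- merge(left, df2, by = "b", all.x = TRUE)

# Replace the NAs produced for unmatched keys with 0.
merged$a[is.na(merged$a)] <- 0L

# Restore the original order, then drop the helper column.
merged <- merged[order(merged$id), c("b", "a")]
rownames(merged) <- NULL

print(merged)
```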