Match 2 data frames based on common rows while preserving the row order
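I need to modify the data frame "DF1" by matching its 1st (and only) column against the 2nd column of "DF2", and print the matched columns while preserving the row order of DF1. I also need to replace unmatched rows with 0. Here are two examples of the data frames I have: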
“DF1”
Ccd
Kkl
Sop
Mnn
Msg
Xxy
Zxz
Ccd
Msg
“DF2”
3 Ab
5 Abc
5 Ccd
9 Kkl
5 Msg
13 Sop
19 Klj
Code:
DF1 <- read.table("a.txt")
DF2 <- read.table("b.txt")
colnames(DF1) <- c("b")
colnames(DF2) <- c("a", "b")
# merge() sorts the result by the key column "b" by default, so DF1's row order is lost
DF3 <- merge(DF1, DF2, by = "b", all.x = TRUE)
DF3$a[is.na(DF3$a)] <- 0  # replace NA with 0
The output I get from the code above is:
b a
Ccd 5
Ccd 5
Kkl 9
Mnn 0
Msg 5
Msg 5
Sop 13
Xxy 0
Zxz 0
The output I actually need is:
Ccd 5
Kkl 9
Sop 13
Mnn 0
Msg 5
Xxy 0
Zxz 0
Ccd 5
Msg 5
With data.table, you can do:
library(data.table)
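# setDT() converts both data frames by reference; X[Y, on = ] returns rows in the order of Y (df1)
setDT(df2)[setDT(df1), on = "b"][is.na(a), a := 0L][]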
Output:
a b
1: 5 Ccd
2: 9 Kkl
3: 13 Sop
4: 0 Mnn
5: 5 Msg
6: 0 Xxy
7: 0 Zxz
8: 5 Ccd
9: 5 Msg
Or with dplyr:
library(dplyr)
left_join(df1,df2, by="b") %>% mutate(a=if_else(is.na(a),0,as.double(a)))
Output:
b a
1: Ccd 5
2: Kkl 9
3: Sop 13
4: Mnn 0
5: Msg 5
6: Xxy 0
7: Zxz 0
8: Ccd 5
9: Msg 5
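If you would rather stay with base merge() as in the question, one possible sketch is to record the original row position before merging and sort back on it afterwards. The helper column name `ord` is a hypothetical choice for illustration, and the data is rebuilt here only to keep the snippet self-contained.

```r
# Example data, same values as in the question
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", "Zxz", "Ccd", "Msg"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"),
                  stringsAsFactors = FALSE)

# Remember each row's original position ("ord" is a hypothetical helper column)
df1$ord <- seq_len(nrow(df1))

# Left join; merge() returns the rows sorted by the key "b"
m <- merge(df1, df2, by = "b", all.x = TRUE)

# Restore df1's order and drop the helper column
m <- m[order(m$ord), c("b", "a")]
rownames(m) <- NULL

# Unmatched keys come back as NA; replace them with 0
m$a[is.na(m$a)] <- 0L
m
```

This keeps the merge() call from the question unchanged and only adds the bookkeeping needed to undo its sorting.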
Input:
df1 <- structure(list(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy",
"Zxz", "Ccd", "Msg")), row.names = c(NA, -9L), class = "data.frame")
df2 <- structure(list(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L), b = c("Ab",
"Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj")), row.names = c(NA,
-7L), class = "data.frame")
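A base R alternative without extra packages is a lookup with match(), which is not part of the answers above and is offered only as a minimal sketch. It assumes each key appears at most once in df2$b, because match() returns only the first matching position. The result name `res` is a hypothetical choice.

```r
# Example data, same values as the input above
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", "Zxz", "Ccd", "Msg"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"),
                  stringsAsFactors = FALSE)

# For every key in df1$b, find the position of its first match in df2$b (NA if absent)
idx <- match(df1$b, df2$b)

# Look up the values in df1's own order; rows are never reordered
res <- df1
res$a <- df2$a[idx]

# Keys with no match in df2 get 0
res$a[is.na(res$a)] <- 0L
res
```

Because the lookup indexes df2 by df1's keys, the order of df1 is preserved by construction and no re-sorting step is needed.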