How to replace elements of one data frame that match another data frame while keeping the ones that do not match?

I am trying to replace some elements of my data frame when they match elements in another data frame.

df1:

             V1           V2   V3
10 JP_00267-008 JP_00267-008 Line
11 JP_00302-049 JP_00302-049 Line
12      4FP3188      4FP3188 Line
13 JP_00284-029 JP_00284-029 Line
14 JP_00268-005 JP_00268-005 Line
15 JP_00265-057 JP_00265-057 Line
16 JP_00286-010 JP_00286-010 Line
17 JP_00283-008 JP_00283-008 Line
18 JP_00330-298 JP_00330-298 Line
19 JP_00269-035 JP_00269-035 Line
20 JP_00300-106 JP_00300-106 Line

df2:

             V1      V2
10 JP_00267-008 4FP3428
11 JP_00302-049 4FP5103
13 JP_00284-029 4FP4137
14 JP_00268-005 4FP3465
15 JP_00265-057 4FP3367
16 JP_00286-010 4FP4245
17 JP_00283-008 4FP4085
18 JP_00330-298 4PP3992
19 JP_00269-035 4FP3575
20 JP_00300-106 4FP4963

My desired output is:

         V1           V2   V3
10  4FP3428 JP_00267-008 Line
11  4FP5103 JP_00302-049 Line
12  4FP3188      4FP3188 Line
13  4FP4137 JP_00284-029 Line
14  4FP3465 JP_00268-005 Line
15  4FP3367 JP_00265-057 Line
16  4FP4245 JP_00286-010 Line
17  4FP4085 JP_00283-008 Line
18  4PP3992 JP_00330-298 Line
19  4FP3575 JP_00269-035 Line
20  4FP4963 JP_00300-106 Line

But what I get is:

         V1           V2   V3
10  4FP3428 JP_00267-008 Line
11  4FP5103 JP_00302-049 Line
12     <NA>      4FP3188 Line
13  4FP4137 JP_00284-029 Line
14  4FP3465 JP_00268-005 Line
15  4FP3367 JP_00265-057 Line
16  4FP4245 JP_00286-010 Line
17  4FP4085 JP_00283-008 Line
18  4PP3992 JP_00330-298 Line
19  4FP3575 JP_00269-035 Line
20  4FP4963 JP_00300-106 Line

This is the code I used:

df1[,1] <- df2[match(as.character(unlist(df1[,1])), as.character(df2[[1]])), 2]

Can anyone help me get rid of the NA and keep the original element instead?

Thanks in advance.

If you want to stick with base R, use:

# an index which includes missing values
idx <- match(as.character(unlist(df1[,1])), as.character(df2[[1]]))

# an index of the non-missing values in `idx`
idx_not_missing <- !is.na(idx)

# push the data only when the index `idx` is not missing 
df1[idx_not_missing,1] <- df2[idx[idx_not_missing], 2]
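
To see where the NA comes from and why guarding on it works, here is a minimal self-contained sketch. The three-row `df1` and two-row `df2` are cut-down stand-ins for the data in the question, `looked_up` is a name introduced only for illustration, and the columns are assumed to be character vectors rather than factors.

```r
# Cut-down stand-ins for the question's data (character columns assumed)
df1 <- data.frame(
  V1 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V2 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V3 = "Line",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c("JP_00267-008", "JP_00284-029"),
  V2 = c("4FP3428", "4FP4137"),
  stringsAsFactors = FALSE
)

# match() returns NA for every key in df1 that is absent from df2
idx <- match(df1$V1, df2$V1)

# Indexing with NA yields NA; this is the source of the unwanted <NA>
looked_up <- df2$V2[idx]

# Keep the original value wherever the lookup found nothing
df1$V1 <- ifelse(is.na(looked_up), df1$V1, looked_up)
print(df1)
```

This does the same job as the subset assignment above, only expressed as a single vectorised `ifelse()`; if the columns were factors, they would need `as.character()` first.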

Here is an option using data.table:

 library(data.table)
 # setkey() sorts df1 by V1, so the rows come back in key order, not the original order
 setkey(setDT(df1), V1)[df2, V1:=i.V2][]
 #          V1           V2   V3
 #  1: 4FP3188      4FP3188 Line
 #  2: 4FP3367 JP_00265-057 Line
 #  3: 4FP3428 JP_00267-008 Line
 #  4: 4FP3465 JP_00268-005 Line
 #  5: 4FP3575 JP_00269-035 Line
 #  6: 4FP4085 JP_00283-008 Line
 #  7: 4FP4137 JP_00284-029 Line
 #  8: 4FP4245 JP_00286-010 Line
 #  9: 4FP4963 JP_00300-106 Line
 # 10: 4FP5103 JP_00302-049 Line
 # 11: 4PP3992 JP_00330-298 Line

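Or using dplyr:

 library(dplyr)
 # assumes V1 and V2 are character columns; ifelse() would return integer codes for factors
 left_join(df1, df2, by='V1') %>%
   # fall back to the original V1 where df2 has no match
   mutate(V2.y = ifelse(is.na(V2.y), V1, V2.y)) %>%
   select(-V1) %>%
   rename(V1=V2.y, V2=V2.x)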
 #            V2   V3      V1
 #1  JP_00267-008 Line 4FP3428
 #2  JP_00302-049 Line 4FP5103
 #3       4FP3188 Line 4FP3188
 #4  JP_00284-029 Line 4FP4137
 #5  JP_00268-005 Line 4FP3465
 #6  JP_00265-057 Line 4FP3367
 #7  JP_00286-010 Line 4FP4245
 #8  JP_00283-008 Line 4FP4085
 #9  JP_00330-298 Line 4PP3992
 #10 JP_00269-035 Line 4FP3575
 #11 JP_00300-106 Line 4FP4963
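
A slightly shorter variant of the dplyr approach is sketched below, assuming `V1` and `V2` are character columns. `coalesce()` returns the first non-missing value, so it can stand in for the `ifelse()` call, and `select()` can rename while restoring the original column order. The two-row data frames and the name `result` are illustrative stand-ins, not part of the original question.

```r
library(dplyr)

# Cut-down stand-ins for the question's data (character columns assumed)
df1 <- data.frame(
  V1 = c("JP_00267-008", "4FP3188"),
  V2 = c("JP_00267-008", "4FP3188"),
  V3 = "Line",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = "JP_00267-008",
  V2 = "4FP3428",
  stringsAsFactors = FALSE
)

result <- left_join(df1, df2, by = "V1") %>%
  # take the replacement from df2 when present, otherwise keep the original V1
  mutate(V1 = coalesce(V2.y, V1)) %>%
  # drop the helper column and restore the V1, V2, V3 order
  select(V1, V2 = V2.x, V3)

print(result)
```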