Using rbind and lapply to conditionally copy rows between two dataframes with identical columns in R
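I have read other Stack Overflow questions on this topic, but I am still lost. I think I need to use lapply, but I don't see exactly how.
Suppose I have the following two data frames: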
df1 <- data.frame(Color = c("Red", "Green", "Blue", "Green", "Purple", "Red"),
                  Year = c(1999, 2008, 2010, 2018, 2017, 2018),
                  License = c("123ABC", "544HGB", "923LWD", "443JFD", "889WER", "932OIF"))
df2 <- data.frame(Color = c("White", "Green", "Black", "Silver", "Purple", "Blue"),
                  Year = c(2013, 2008, 2004, 2012, 2017, 2019),
                  License = c("342UDD", "544HGB", "398KJX", "654KIR", "889WER", "874SSD"))
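I want R to go through df2, and every time it finds a value in df2's License column that is not in df1's License column, it should add the entire df2 row containing that License value to the top of df1 (i.e. make it the top row). Can anyone suggest the right way to do this?
Base R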
rbind(df2[ ! df2$License %in% df1$License, ], df1)
#     Color Year License
# 1   White 2013  342UDD
# 3   Black 2004  398KJX
# 4  Silver 2012  654KIR
# 6    Blue 2019  874SSD
# 11    Red 1999  123ABC
# 2   Green 2008  544HGB
# 31   Blue 2010  923LWD
# 41  Green 2018  443JFD
# 5  Purple 2017  889WER
# 61    Red 2018  932OIF
Data
df1 <- structure(list(Color = c("Red", "Green", "Blue", "Green", "Purple", "Red"), Year = c(1999, 2008, 2010, 2018, 2017, 2018), License = c("123ABC", "544HGB", "923LWD", "443JFD", "889WER", "932OIF")), class = "data.frame", row.names = c(NA, -6L))
df2 <- structure(list(Color = c("White", "Green", "Black", "Silver", "Purple", "Blue"), Year = c(2013, 2008, 2004, 2012, 2017, 2019), License = c("342UDD", "544HGB", "398KJX", "654KIR", "889WER", "874SSD")), class = "data.frame", row.names = c(NA, -6L))
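The one-liner above can be broken into steps to show why no lapply or explicit loop is needed: `%in%` is vectorised and tests every License in df2 at once. The following is only a sketch of that idea; the names `is_new` and `result` are illustrative, and resetting the row names is an optional extra that tidies up the de-duplicated names (11, 31, ...) seen in the output.

```r
# Same data as in the question.
df1 <- data.frame(Color = c("Red", "Green", "Blue", "Green", "Purple", "Red"),
                  Year = c(1999, 2008, 2010, 2018, 2017, 2018),
                  License = c("123ABC", "544HGB", "923LWD", "443JFD", "889WER", "932OIF"))
df2 <- data.frame(Color = c("White", "Green", "Black", "Silver", "Purple", "Blue"),
                  Year = c(2013, 2008, 2004, 2012, 2017, 2019),
                  License = c("342UDD", "544HGB", "398KJX", "654KIR", "889WER", "874SSD"))

# TRUE for each df2 row whose License does not occur in df1.
is_new <- !(df2$License %in% df1$License)
is_new
# [1]  TRUE FALSE  TRUE  TRUE FALSE  TRUE

# New rows first, then df1, so the new rows end up on top.
result <- rbind(df2[is_new, ], df1)

# Optional: replace the de-duplicated row names with a plain 1..n sequence.
rownames(result) <- NULL
```

After the last line, `result` has the same rows as the output above, numbered 1 to 10.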
You can use anti_join to get the rows of df2 that are not in df1, and bind them to df1.
library(dplyr)
result <- bind_rows(df1, anti_join(df2, df1, by = 'License'))
result
#     Color Year License
#1      Red 1999  123ABC
#2    Green 2008  544HGB
#3     Blue 2010  923LWD
#4    Green 2018  443JFD
#5   Purple 2017  889WER
#6      Red 2018  932OIF
#7    White 2013  342UDD
#8    Black 2004  398KJX
#9   Silver 2012  654KIR
#10    Blue 2019  874SSD
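Note that the output above places the new rows after df1's rows. If they should go on top, as the question asks, one option is to swap the arguments to bind_rows. This is a minimal sketch, assuming dplyr is installed; `new_rows` and `result_top` are illustrative names, not part of the original answer.

```r
library(dplyr)

# Same data as in the question.
df1 <- data.frame(Color = c("Red", "Green", "Blue", "Green", "Purple", "Red"),
                  Year = c(1999, 2008, 2010, 2018, 2017, 2018),
                  License = c("123ABC", "544HGB", "923LWD", "443JFD", "889WER", "932OIF"))
df2 <- data.frame(Color = c("White", "Green", "Black", "Silver", "Purple", "Blue"),
                  Year = c(2013, 2008, 2004, 2012, 2017, 2019),
                  License = c("342UDD", "544HGB", "398KJX", "654KIR", "889WER", "874SSD"))

# Rows of df2 whose License has no match in df1.
new_rows <- anti_join(df2, df1, by = "License")

# Passing new_rows first puts them above df1's rows.
result_top <- bind_rows(new_rows, df1)
result_top
```

Unlike rbind, bind_rows does not carry over the original row names, so the result is numbered 1 to 10 without any further step.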