R: copy values from one data frame to another if multiple columns match
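I have two different data frames with similar information. One (df2) has a better list of UNIQFIREID values; the second (df1) is the one I need to use, because it holds the shapefile I am working with. I would like to copy the UNIQFIREID from df2 into df1 wherever df1's UNIQFIREID is NA and several columns match between the two data frames, in this case FIRENAME, DISCOVERYDATETIME and TOTALACRES. Rows that have no NA, or that don't match, should be left alone. I have put small sample data frames below.
Everything I have tried so far (merge, match, join and ifelse approaches) has only produced a convoluted mess, because I'm not sure what I'm doing. I found a few similar questions on Stack Overflow, but they were much simpler, and I couldn't work out how to combine the approaches. Any advice would be greatly appreciated.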
df1 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", NA, "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90))
df1$DISCOVERYDATETIME <- as.POSIXct(df1$DISCOVERYDATETIME)
df2 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", "1985-AZASF-000286", "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90))
df2$DISCOVERYDATETIME <- as.POSIXct(df2$DISCOVERYDATETIME)
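This is a pile of junk that I have been trying to get to work. I wouldn't recommend running any of it; it is more an example of how badly I have tangled things up.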
SW_Fire_Perimeters_1985test$UNIQFIREID[is.na(SW_Fire_Perimeters_1985test$UNIQFIREID)] <-
SW_Fire_Occurrences_1985[match(paste(SW_Fire_Perimeters_1985test$DISCOVERYDATETIME,
SW_Fire_Perimeters_1985test$FIRENAME,
SW_Fire_Perimeters_1985test$TOTALACRES),
paste(SW_Fire_Occurrences_1985$DISCOVERYDATETIME,
SW_Fire_Occurrences_1985$FIRENAME,
SW_Fire_Occurrences_1985$TOTALACRES)),"UNIQFIREID"]
ifelse(is.na(SW_Fire_Perimeters_1985test$UNIQFIREID),
SW_Fire_Occurrences_1985[match(paste(SW_Fire_Perimeters_1985test$DISCOVERYDATETIME,
SW_Fire_Perimeters_1985test$FIRENAME,
SW_Fire_Perimeters_1985test$TOTALACRES),
paste(SW_Fire_Occurrences_1985$DISCOVERYDATETIME, SW_Fire_Occurrences_1985$FIRENAME,
SW_Fire_Occurrences_1985$TOTALACRES)),"UNIQFIREID"])
SW_Fire_Perimeters_1985test$UNIQFIREID2 <-
SW_Fire_Occurrences_1985[match(paste(SW_Fire_Perimeters_1985test$DISCOVERYDATETIME,
SW_Fire_Perimeters_1985test$FIRENAME,
SW_Fire_Perimeters_1985test$TOTALACRES),
paste(SW_Fire_Occurrences_1985$DISCOVERYDATETIME, SW_Fire_Occurrences_1985$FIRENAME,
SW_Fire_Occurrences_1985$TOTALACRES)),"UNIQFIREID"]
# Merges two dataframes into fire perimeters dataframe based on "DISCOVERYDATETIME", "FIRENAME", "TOTALACRES"
# https://docs.tibco.com/pub/enterprise-runtime-for-R/4.0.0/doc/html/Language_Reference/base/merge.html
SW_Fire_Merge_1985 <- merge(SW_Fire_Perimeters_1985, SW_Fire_Occurrences_1985, on = c( "DISCOVERYDATETIME", "FIRENAME", "TOTALACRES"), nomatch = 0L)
SW_Fire_join_1985 <- full_join(SW_Fire_Perimeters_1985,SW_Fire_Occurrences_1985,
copy = TRUE,
# by.x = c("DISCOVERYDATETIME", "FIRENAME", "TOTALACRES"),
# by.y = c("DISCOVERYDATETIME", "FIRENAME", "TOTALACRES"),
# all.x = TRUE),
# by.y = c("UNIQFIREID"))
if(is.na(SW_Fire_Merge_1985$UNIQFIREID.x, paste(SW_Fire_Merge_1985$UNIQFIREID.y)))
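If you want to see the full dataset (14 MB zipped) and where I am so far, you can use the code below. Just replace "Directory..." with the location where you want to download and open the files. It narrows the data down to a smaller set for 1985.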
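# Packages used below: sf (st_read, st_drop_geometry), dplyr (filter), stringr (str_count)
library(sf)
library(dplyr)
library(stringr)
# Insert path to Geospatial data needed, and desired download location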
FireH <- download.file("http://www.fs.fed.us/r3/gis/gisdata/Fire_History.zip", "Directory.../Fire_History.zip")
# Insert File path of downloaded zip file, overwrite is currently enabled for coding purposes, for exdir insert desired file path for geodatabase.
FireH2 <- unzip("Directory.../Fire_History.zip", overwrite = TRUE, exdir = "Directory...")
# Assigning Geodatabase a name
FireHGDB <- "Directory.../Fire_History.gdb"
# Brings Fire perimeters and occurrences out of GDB
SW_Fire_Perimeters <- st_read(FireHGDB, "FirePerimeter") #require_geomType="wkbPolygon")
SW_Fire_Occurrences <- st_read(FireHGDB, "FireOccurrence") #require_geomType="wkbPolygon")
# Removes invalid naming characters
# https://www.journaldev.com/43690/sub-and-gsub-function-r#the-gsub-function-in-r
SW_Fire_Perimeters$FIRENAME <- gsub(" ", "_", SW_Fire_Perimeters$FIRENAME)
SW_Fire_Occurrences$FIRENAME <- gsub(" ", "_", SW_Fire_Occurrences$FIRENAME)
SW_Fire_Perimeters$FIRENAME <- gsub("#", "_", SW_Fire_Perimeters$FIRENAME)
SW_Fire_Occurrences$FIRENAME <- gsub("#", "_", SW_Fire_Occurrences$FIRENAME)
SW_Fire_Perimeters$FIRENAME <- gsub("\\.", "", SW_Fire_Perimeters$FIRENAME)
SW_Fire_Occurrences$FIRENAME <- gsub("\\.", "", SW_Fire_Occurrences$FIRENAME)
# Removes NAs from fire occurrences UNIQFIREID column
SW_Fire_Occurrences <- SW_Fire_Occurrences[!is.na(SW_Fire_Occurrences$UNIQFIREID),]
# Removes incomplete UNIQFIREIDs for fire occurrences
SW_Fire_Occurrences <- subset(SW_Fire_Occurrences, nchar(as.character(UNIQFIREID)) == 17)
# Removes geometries from fire occurrences so they can be merged to perimeters (Error with two sf objects when merged)
SW_Fire_Occurrences <- st_drop_geometry(SW_Fire_Occurrences)
# Filters tables to only contain FIREYEARs 1985 - 2019
SW_Fire_Perimeters_1985_2019 <- filter(SW_Fire_Perimeters, FIREYEAR >= 1985, FIREYEAR <= 2019)
SW_Fire_Occurrences_1985_2019 <- filter(SW_Fire_Occurrences, FIREYEAR >= 1985, FIREYEAR <= 2019)
# Make a new column (UniqLength) with the string length of UNIQFIREID (it should be 17 characters long)
SW_Fire_Perimeters_1985_2019$UniqLength <- str_count(SW_Fire_Perimeters_1985_2019$UNIQFIREID)
# Set NAs in UniqLength to 0
SW_Fire_Perimeters_1985_2019$UniqLength[is.na(SW_Fire_Perimeters_1985_2019$UniqLength)] <- 0
# Replace any UNIQFIREID with NA when its length (UniqLength) is not equal to 17
SW_Fire_Perimeters_1985_2019$UNIQFIREID[SW_Fire_Perimeters_1985_2019$UniqLength != 17] <- NA
# Filter to FIREYEAR 1985 only
SW_Fire_Perimeters_1985 <- filter(SW_Fire_Perimeters_1985_2019, FIREYEAR == 1985)
SW_Fire_Occurrences_1985 <- filter(SW_Fire_Occurrences_1985_2019, FIREYEAR == 1985)
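The cleaning steps in the script above (turning spaces and # into underscores, dropping periods, and keeping only 17-character UNIQFIREID values) can be tried on a few made-up values before running them on the full geodatabase. This is only a minimal base R sketch; fire_names and ids are hypothetical vectors invented for illustration, not values taken from the dataset.

```r
# Hypothetical fire names, invented for illustration only
fire_names <- c("Green 1", "Tank #2", "St. Mary")

# Replace spaces and "#" with "_" in one pass using a character class
cleaned <- gsub("[ #]", "_", fire_names)
# Remove literal periods; fixed = TRUE treats "." as plain text, so no escaping is needed
cleaned <- gsub(".", "", cleaned, fixed = TRUE)
print(cleaned)

# Hypothetical IDs: one complete, one truncated, one missing
ids <- c("1985-AZASF-000285", "1985-AZASF", NA)
# Set to NA anything that is missing or not exactly 17 characters long
ids[is.na(ids) | nchar(ids) != 17] <- NA
print(ids)
```

Using a character class and fixed = TRUE is just one way to shorten the six gsub() calls; the original calls should do the same job once the period is escaped correctly.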
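If I understand correctly, you could...
- do a full join where by= is all the columns except "UNIQFIREID"
- the result will hold the values of...
  - df1$UNIQFIREID in <RESULT>$UNIQFIREID.x
  - df2$UNIQFIREID in <RESULT>$UNIQFIREID.y
- use ifelse() (or one of its relatives) to create a new "UNIQFIREID" column that pulls values from <RESULT>$UNIQFIREID.x and <RESULT>$UNIQFIREID.y as required
- drop the rows where <RESULT>$UNIQFIREID is.na()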
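The .x and .y names in the steps above come from the default suffixes that merge() adds when both inputs have a non-key column with the same name. A tiny sketch of just that mechanism, using two hypothetical two-row frames (a, b, key and id are illustrative names, not part of the question's data):

```r
# Two hypothetical frames that share a key column and a same-named value column
a <- data.frame(key = c("k1", "k2"), id = c("A1", NA), stringsAsFactors = FALSE)
b <- data.frame(key = c("k1", "k2"), id = c("A1", "B2"), stringsAsFactors = FALSE)

# Full join on the key; the clashing "id" columns get the suffixes .x (from a) and .y (from b)
m <- merge(a, b, by = "key", all = TRUE)
suffix_names <- names(m)
print(suffix_names)

# Prefer the value from a, and fall back to b where a is NA
m$id <- ifelse(is.na(m$id.x), m$id.y, m$id.x)
print(m)
```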
Your data:
df1 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", NA, "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90))
df1$DISCOVERYDATETIME <- as.POSIXct(df1$DISCOVERYDATETIME)
df2 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", "1985-AZASF-000286", "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90))
df2$DISCOVERYDATETIME <- as.POSIXct(df2$DISCOVERYDATETIME)
Using {base}:
combo_base <- merge(df1, df2, all = TRUE,
                    by = c("FIRENAME", "DISCOVERYDATETIME", "TOTALACRES"))
combo_base$UNIQFIREID <- ifelse(is.na(combo_base$UNIQFIREID.x),
                                combo_base$UNIQFIREID.y, combo_base$UNIQFIREID.x)
combo_base <- combo_base[!is.na(combo_base$UNIQFIREID),
                         !names(combo_base) %in% c("UNIQFIREID.x", "UNIQFIREID.y"),
                         drop = FALSE]
combo_base
#>   FIRENAME DISCOVERYDATETIME TOTALACRES        UNIQFIREID
#> 1     Gold        1985-03-28         60 1985-AZASF-000285
#> 2  Green_1        1985-03-31         90 1985-AZASF-000288
#> 3     Tank        1985-03-30         80 1985-AZASF-000287
#> 4     Tree        1985-03-29         70 1985-AZASF-000286
Using {data.table}:
library(data.table)
combo_datatable <- merge(
  as.data.table(df1), df2,
  by = c("FIRENAME", "DISCOVERYDATETIME", "TOTALACRES"),
  all = TRUE
)[, UNIQFIREID := fifelse(is.na(UNIQFIREID.x), UNIQFIREID.y, UNIQFIREID.x)
  ][!is.na(UNIQFIREID), !c("UNIQFIREID.x", "UNIQFIREID.y")
    ]
combo_datatable
#>    FIRENAME DISCOVERYDATETIME TOTALACRES        UNIQFIREID
#> 1:     Gold        1985-03-28         60 1985-AZASF-000285
#> 2:  Green_1        1985-03-31         90 1985-AZASF-000288
#> 3:     Tank        1985-03-30         80 1985-AZASF-000287
#> 4:     Tree        1985-03-29         70 1985-AZASF-000286
Using {dplyr}:
library(dplyr, warn.conflicts = FALSE)
combo_dplyr <- df1 %>%
  full_join(df2, by = c("FIRENAME", "DISCOVERYDATETIME", "TOTALACRES")) %>%
  mutate(UNIQFIREID = if_else(is.na(UNIQFIREID.x), UNIQFIREID.y, UNIQFIREID.x)) %>%
  select(-UNIQFIREID.x, -UNIQFIREID.y) %>%
  filter(!is.na(UNIQFIREID))
combo_dplyr
#>   FIRENAME DISCOVERYDATETIME TOTALACRES        UNIQFIREID
#> 1     Gold        1985-03-28         60 1985-AZASF-000285
#> 2     Tree        1985-03-29         70 1985-AZASF-000286
#> 3     Tank        1985-03-30         80 1985-AZASF-000287
#> 4  Green_1        1985-03-31         90 1985-AZASF-000288
Sanity check:
identical(combo_base, as.data.frame(combo_datatable))
#> [1] TRUE
identical(combo_base, combo_dplyr %>% arrange(FIRENAME))
#> [1] TRUE
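A full join can add rows that exist only in df2 and may reorder the result. If df1 has to keep exactly its own rows in their original order (for example because it carries the shapefile geometry), a lookup with match() may be a closer fit to the question. The following is only a sketch in base R, assuming the three key columns are stored and formatted identically in both data frames; make_key and df1_filled are hypothetical names introduced here for illustration.

```r
df1 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", NA, "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90),
                  stringsAsFactors = FALSE)
df1$DISCOVERYDATETIME <- as.POSIXct(df1$DISCOVERYDATETIME)
df2 <- data.frame(FIRENAME = c("Gold", "Tree", "Tank", "Green_1"),
                  UNIQFIREID = c("1985-AZASF-000285", "1985-AZASF-000286", "1985-AZASF-000287", "1985-AZASF-000288"),
                  DISCOVERYDATETIME = c("1985-03-28", "1985-03-29", "1985-03-30", "1985-03-31"),
                  TOTALACRES = c(60, 70, 80, 90),
                  stringsAsFactors = FALSE)
df2$DISCOVERYDATETIME <- as.POSIXct(df2$DISCOVERYDATETIME)

# Hypothetical helper: build one text key per row from the three matching columns
make_key <- function(d) {
  paste(d$FIRENAME,
        format(d$DISCOVERYDATETIME, "%Y-%m-%d %H:%M:%S"),
        d$TOTALACRES,
        sep = "|")
}

# For each row of df1, the position of the first matching row in df2 (NA if none)
idx <- match(make_key(df1), make_key(df2))

# Only fill rows where df1 has no ID and a match was found in df2
fill <- is.na(df1$UNIQFIREID) & !is.na(idx)
df1_filled <- df1
df1_filled$UNIQFIREID[fill] <- df2$UNIQFIREID[idx[fill]]
df1_filled
```

Note that match() returns only the first matching row, so if df2 can contain several rows with the same three key values, this sketch silently picks the first one. Existing non-NA IDs in df1 are never overwritten.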