How to check if pairs from df2 are in pairs of df1 (inclusive) in R?
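I have two data frames, and I want to compare the pairs in data frame b (df_2 below) with the pairs in data frame a (df_1 below) to see whether each pair in b falls within (inclusive) the pairs/range in a. For example, see below: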
df_1 <- data.frame(x= c(-82.38319, -82.38318, -82.40397, -82.40417, -82.40423),
                   y= c(29.61212, 29.61125, 29.61130, 29.61134, 29.61167))
#Output:
# x y
# 1 -82.38319 29.61212
# 2 -82.38318 29.61125
# 3 -82.40397 29.61130
# 4 -82.40417 29.61134
# 5 -82.40423 29.61167
df_2 <- data.frame(o= c(-82.38320,-82.38317,-82.40397,-82.40416,-82.40424),
                   t= c(29.61212, 29.6114, 29.61130, 29.61133, 29.61167))
#Output:
# o t
# 1 -82.38320 29.61212
# 2 -82.38317 29.61140
# 3 -82.40397 29.61130
# 4 -82.40416 29.61133
# 5 -82.40424 29.61167
#made this dataframe as an example only.
desired_output <- data.frame(lat= df_2$o, lon= df_2$t, exists= c(NA, "YES","YES","YES",NA))
#Output I seek:
# lat lon exists
# 1 -82.38320 29.61212 <NA>
# 2 -82.38317 29.61140 YES
# 3 -82.40397 29.61130 YES
# 4 -82.40416 29.61133 YES
# 5 -82.40424 29.61167 <NA>
#explanation:
#1- even though -82.38320 is OK & is in rows 3,4,5 in df_1, 29.61212 is out of bounds with their co-pairings.
#2- row 2 of df_2 is within the row 5 of df_1.
#3- row 3 of df_2 matches to row 3 of df_1 thus inclusive
#4- row 4 pair matches and its co_pair is less than those pair of row 4 in df_1
#5- This pair at row 5 is out of bounds in all of the rows of df_1
#Column "exists" can be appended to dataframe b, result matters only, neatness is not an issue.
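I have searched Stack Overflow but found only one related question, and it compares a single value against a pair, not pairs against pairs or pairs within pairs. I also tried cbind on the two data frames and compared the result, but that did not work.
What can I try next?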
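We can use mapply to compare each pair of o and t values in df_2 against every row of df_1. A pair counts as in range if any row of df_1 has x <= o and y >= t, and we assign "YES" or NA accordingly.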
# mapply() returns TRUE/FALSE per df_2 row; adding 1 turns that into index 2 ("YES") or 1 (NA)
df_2$exists <- c(NA, "YES")[mapply(function(x, y)
    any(df_1$x <= x & df_1$y >= y), df_2$o, df_2$t) + 1]
df_2
# o t exists
#1 -82.38320 29.61212 <NA>
#2 -82.38317 29.61140 YES
#3 -82.40397 29.61130 YES
#4 -82.40416 29.61133 YES
#5 -82.40424 29.61167 <NA>
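The same check can also be sketched without a per-row function by using outer(), which builds the full matrix of df_2-by-df_1 comparisons in one step. This is an illustrative alternative rather than part of the original answer. The name in_range is made up for this example, and the sketch assumes both data frames are small enough for an nrow(df_2) by nrow(df_1) logical matrix to fit in memory.

```r
df_1 <- data.frame(x = c(-82.38319, -82.38318, -82.40397, -82.40417, -82.40423),
                   y = c(29.61212, 29.61125, 29.61130, 29.61134, 29.61167))
df_2 <- data.frame(o = c(-82.38320, -82.38317, -82.40397, -82.40416, -82.40424),
                   t = c(29.61212, 29.6114, 29.61130, 29.61133, 29.61167))

# in_range[i, j] is TRUE when row j of df_1 bounds row i of df_2,
# i.e. df_1$x[j] <= df_2$o[i] and df_1$y[j] >= df_2$t[i] (both inclusive)
in_range <- outer(df_2$o, df_1$x, ">=") & outer(df_2$t, df_1$y, "<=")

# A df_2 row gets "YES" if at least one df_1 row bounds it, NA otherwise
df_2$exists <- ifelse(rowSums(in_range) > 0, "YES", NA)

df_2$exists
# NA "YES" "YES" "YES" NA
```

Keeping the in_range matrix around also shows which rows of df_1 matched each pair, for example with which(in_range[2, ]). For large inputs the row-wise mapply approach uses less memory.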
We can also use a non-equi join in data.table:
library(data.table)
# o >= x and t <= y keep both bounds inclusive, matching the mapply() version
setDT(df_2)[df_1, exists := "YES", on = .(o >= x, t <= y), mult = 'first']