How do I retain observations whose "ID" exists in both datasets?
I have two datasets, both containing an "ID" variable. How can I keep only the observations whose ID appears in both datasets? I am using R.
For example:
df1 <- structure(list(CustomerId = c(1, 2, 3, 4, 5, 8, 9), Product = structure(c(4L,
4L, 4L, 3L, 3L, 1L, 2L), .Label = c("abc", "def", "Radio", "Toaster"
), class = "factor")), .Names = c("CustomerId", "Product"), row.names = c(NA,
-7L), class = "data.frame")
df2 <-
structure(list(CustomerId = c(2, 4, 6, 7), State = structure(c(2L,
2L, 3L, 1L), .Label = c("aaa", "Alabama", "Ohio"), class = "factor")), .Names = c("CustomerId",
"State"), row.names = c(NA, -4L), class = "data.frame")
From each dataset, I want to keep only the observations whose ID is present in both. (The IDs common to both datasets are 2 and 4.)
You can just use basic subsetting, like this:
subset(df1, CustomerId %in% df2$CustomerId)
subset(df2, CustomerId %in% df1$CustomerId)
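The same filter can also be written by first computing the shared IDs with intersect(), which makes the set of common IDs easy to inspect. This is only a minimal sketch: the names common_ids, df1_common and df2_common are made up for illustration, and the example data frames are rebuilt in simplified form (character columns instead of factors) so the block runs on its own.

```r
# Simplified copies of the example data (character columns instead of factors)
df1 <- data.frame(CustomerId = c(1, 2, 3, 4, 5, 8, 9),
                  Product = c("Toaster", "Toaster", "Toaster",
                              "Radio", "Radio", "abc", "def"))
df2 <- data.frame(CustomerId = c(2, 4, 6, 7),
                  State = c("Alabama", "Alabama", "Ohio", "aaa"))

# IDs that occur in both data frames
common_ids <- intersect(df1$CustomerId, df2$CustomerId)

# Keep only the rows whose ID is shared (hypothetical result names)
df1_common <- df1[df1$CustomerId %in% common_ids, ]
df2_common <- df2[df2$CustomerId %in% common_ids, ]
```

Both results keep their original columns; only the rows are filtered.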
Or, if you use dplyr, this is called a semi_join:
library(dplyr)
semi_join(df1, df2)
semi_join(df2, df1)
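merge() would be one of the simplest solutions. By default it is the equivalent of an inner join; check its other arguments (all, all.x, all.y) if you need an outer join. The [,1:2] keeps only the first two columns of the merged result, i.e. the ID and the column from the first data frame.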
merge(df1, df2, by="CustomerId")[,1:2]
  CustomerId Product
1          2 Toaster
2          4   Radio
merge(df2, df1, by="CustomerId")[,1:2]
  CustomerId   State
1          2 Alabama
2          4 Alabama
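For reference, the outer-join variants mentioned in this answer are controlled by merge()'s all, all.x and all.y arguments. The sketch below rebuilds the example data in simplified form (character columns rather than factors) so it runs on its own; the result names inner, left and full are illustrative only.

```r
# Simplified copies of the example data (character columns instead of factors)
df1 <- data.frame(CustomerId = c(1, 2, 3, 4, 5, 8, 9),
                  Product = c("Toaster", "Toaster", "Toaster",
                              "Radio", "Radio", "abc", "def"))
df2 <- data.frame(CustomerId = c(2, 4, 6, 7),
                  State = c("Alabama", "Alabama", "Ohio", "aaa"))

# Inner join (the default): only IDs present in both data frames
inner <- merge(df1, df2, by = "CustomerId")

# Left outer join: every row of df1; State is NA where df2 has no match
left <- merge(df1, df2, by = "CustomerId", all.x = TRUE)

# Full outer join: every ID from either data frame
full <- merge(df1, df2, by = "CustomerId", all = TRUE)
```

Unmatched rows are filled with NA in the outer joins, so only the inner join answers the original question of keeping IDs found in both datasets.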