Test condition of two columns on groups
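I am trying to create a new column that checks, within each group (id and number), whether two columns (classification and classification-1) contain the same observations.
This is the original data frame: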
reprex <- tribble(~"id", ~"number", ~"year", ~"classification", ~"classification-1",
5, 7020, 2015, "Trading de servicios", "Servicios empresariales",
2, 4649, 2015, "Trading", "Comercial",
2, 4649, 2015, "Comercial", "Trading",
2, 4649, 2016, "Trading", "Comercial",
2, 4649, 2016, "Comercial", "Trading",
3, 4651, 2015, "Trading", "Comercial",
3, 4651, 2015, "Trading", "Comisiones",
3, 4651, 2015, "Comercial", "Trading",
3, 4651, 2015, "Comercial", "Comisiones")
I want this:
reprex <- tribble(~"id", ~"number", ~"year", ~"classification", ~"classification-1", ~"check",
5, 7020, 2015, "Trading de servicios", "Servicios empresariales", TRUE,
2, 4649, 2015, "Trading", "Comercial", TRUE,
2, 4649, 2015, "Comercial", "Trading", TRUE,
2, 4649, 2016, "Trading", "Comercial", TRUE,
2, 4649, 2016, "Comercial", "Trading", TRUE,
3, 4651, 2015, "Trading", "Comercial", FALSE,
3, 4651, 2015, "Trading", "Comisiones", FALSE,
3, 4651, 2015, "Comercial", "Trading", FALSE,
3, 4651, 2015, "Comercial", "Comisiones", FALSE)
Maybe this will help:
library(dplyr)
reprex %>%
group_by(id, number) %>%
mutate(check = length(intersect(classification, `classification-1`)) > 0)
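To see what this check does inside a single group, here is a minimal base R sketch. The vector names (`g3_class`, `g3_class_1`, and so on) are illustrative assumptions, with values copied from the question's data. The test is TRUE as soon as the two columns share at least one value, so it does not require the columns to hold exactly the same values.

```r
# Group id = 3, number = 4651 (values copied from the question's data).
g3_class   <- c("Trading", "Trading", "Comercial", "Comercial")
g3_class_1 <- c("Comercial", "Comisiones", "Trading", "Comisiones")

# intersect() returns the distinct values present in both vectors.
g3_common <- intersect(g3_class, g3_class_1)
g3_check  <- length(g3_common) > 0

# Group id = 5, number = 7020: a single row with two different labels.
g5_common <- intersect("Trading de servicios", "Servicios empresariales")
g5_check  <- length(g5_common) > 0

print(g3_check)
print(g5_check)
```

Under this check, group 3 comes out TRUE (it shares "Trading" and "Comercial") and group 5 comes out FALSE (nothing in common), which is worth comparing against the desired `check` column before relying on it.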
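Or, if we need to check that all the unique elements match, then after grouping by 'id' and 'number', get the unique elements of both classification and classification-1 and check whether they are equal with setequal: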
reprex %>%
group_by(id, number) %>%
mutate(check = setequal(sort(unique(classification)),
sort(unique(`classification-1`))))
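As a rough illustration of the `setequal` variant, the base R sketch below applies it to two groups from the question's data. The names `g2_a`, `g2_b`, `g3_a`, and `g3_b` are hypothetical and stand for the classification and classification-1 values of each group.

```r
# Group id = 2, number = 4649: both columns hold the same two labels.
g2_a <- c("Trading", "Comercial", "Trading", "Comercial")
g2_b <- c("Comercial", "Trading", "Comercial", "Trading")

# Group id = 3, number = 4651: the second column also contains "Comisiones".
g3_a <- c("Trading", "Trading", "Comercial", "Comercial")
g3_b <- c("Comercial", "Comisiones", "Trading", "Comisiones")

# setequal() compares the vectors as sets, ignoring order and duplicates.
check_g2 <- setequal(g2_a, g2_b)
check_g3 <- setequal(g3_a, g3_b)

# Sorting and de-duplicating first gives the same answer.
same_without_sort <- identical(
  check_g3,
  setequal(sort(unique(g3_a)), sort(unique(g3_b)))
)

print(c(group2 = check_g2, group3 = check_g3))
```

Here group 2 is TRUE and group 3 is FALSE. Because `setequal` already ignores order and repeated values, the `sort(unique(...))` wrappers do no harm but could be dropped.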