Group tibble rows by unordered combination of columns
Given the following tibble:
dat <- tibble(sample = 1:6,
              string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
              x = c("a", "a", "b", "e", "d", "d"),
              y = c("b", "b", "a", "d", "e", "e"))
dat
# A tibble: 6 × 4
  sample string x     y
   <int> <chr>  <chr> <chr>
1      1 ABC    a     b
2      2 ABC    a     b
3      3 CBA    b     a
4      4 FED    e     d
5      5 DEF    d     e
6      6 DEF    d     e
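I want to group the rows by the unordered combination of columns x and y. Then, in rows where x, y is inverted relative to the first row of the group, I want to flip x ⇔ y and reverse string. Desired output: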
# A tibble: 6 × 5
  sample string x     y     group
   <int> <chr>  <chr> <chr> <dbl>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
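library(dplyr)
# Sort the characters of each string so that "ABC" and "CBA" share one key
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")
dat %>%
  # rleid() starts a new group whenever the sorted key changes between consecutive rows
  group_by(group = data.table::rleid(strSort(string))) %>%
  # Overwrite string, x and y with the values from the first row of each group
  mutate(across(string:y, first))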
# A tibble: 6 x 5
# Groups:   group [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
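To see why the code above produces two groups, the intermediate values can be inspected step by step. The following is a minimal sketch, not part of the original answer: it reuses strSort as written, but swaps data.table::rleid() for a base R stand-in built on rle() so that it runs without extra packages. The variable names key, runs, and group are illustrative.

```r
# Same helper as in the answer: sort the characters within each string
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

string <- c("ABC", "ABC", "CBA", "FED", "DEF", "DEF")

# Order-insensitive key: a string and its reverse collapse to the same value
key <- strSort(string)

# Base R stand-in (an assumption of this sketch) for data.table::rleid():
# number consecutive runs of equal keys 1, 2, 3, ...
runs <- rle(key)
group <- rep(seq_along(runs$lengths), runs$lengths)

print(data.frame(string, key, group))
```

Two properties follow from this construction. The grouping is run-based, so rows that share a key but are not adjacent would receive different group numbers. The key also treats any rearrangement of the same letters as equal, not only an exact reversal, so it relies on string mirroring x and y as it does in the sample data.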
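Previous answer
Here is a way that uses both tidyverse and apply. First, sort the x and y values within each row, then group_by x and y, number the groups with cur_group_id, and reverse string with stri_reverse where necessary.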
library(tidyverse)
library(stringi)
# Sort x and y within each row so that (b, a) becomes (a, b)
dat[, c("x", "y")] <- t(apply(dat[, c("x", "y")], 1, sort))
dat %>%
  group_by(x, y) %>%
  mutate(group = cur_group_id(),
         # Keep string if its first letter matches x; otherwise reverse it
         string = ifelse(str_sub(string, 1, 1) == toupper(x), string, stri_reverse(string)))
# A tibble: 6 x 5
# Groups:   x, y [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 DEF    d     e         2
5      5 DEF    d     e         2
6      6 DEF    d     e         2
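The output above orients every pair alphabetically, so the second group reads DEF d e rather than the FED e d of its first row. One way to group on x and y directly while keeping the first row's orientation is sketched below. It is not taken from either answer: the helper columns lo, hi, flipped, and x_new are hypothetical names, and the approach assumes that pmin() and pmax() on the two character columns give an order-independent key.

```r
library(dplyr)
library(tibble)
library(stringi)

dat <- tibble(sample = 1:6,
              string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
              x = c("a", "a", "b", "e", "d", "d"),
              y = c("b", "b", "a", "d", "e", "e"))

res <- dat %>%
  # pmin()/pmax() put each (x, y) pair in a fixed order, giving an unordered key
  group_by(lo = pmin(x, y), hi = pmax(x, y)) %>%
  mutate(
    group   = cur_group_id(),
    # A row counts as flipped when its x differs from the first row's x
    flipped = x != first(x),
    string  = if_else(flipped, stri_reverse(string), string),
    # Store the swapped x first, because mutate() evaluates sequentially
    x_new   = if_else(flipped, y, x),
    y       = if_else(flipped, x, y),
    x       = x_new
  ) %>%
  ungroup() %>%
  select(sample, string, x, y, group)

print(res)
```

On the sample data this should give ABC a b for group 1 and FED e d for group 2, matching the desired output in the question. Group numbers follow the sorted order of the keys rather than order of first appearance, which happens to coincide here.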