Split string with n repetitive elements into n sub-strings
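I have a string that is a concatenation of elements of m possible types; for simplicity, m = 4 with A, B, C and D.
Whenever a single element occurs more than once, I have to split the string so that no sub-string contains a repeated element. However, I want to generate all possible sub-strings that have no repeats.
To make this clearer, here is an example:
For A B A C D
- String: A B C D
- String: B A C D
This gets more complicated when several different elements occur more than once:
For A B A C B D
- String: A B C D
- String: A C B D
- String: B A C D
- String: A C B D
Is there a clever way to compute this in R?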
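vec <- c("A","B","A","C","B","D")
# positions at which each distinct element occurs
combs <- lapply(setNames(nm = unique(vec)), function(a) which(vec == a))
# every way of keeping exactly one occurrence of each element
eg <- do.call(expand.grid, combs)
# ordering the kept positions within each row recovers the element order
out <- t(apply(eg, 1, function(r) names(eg)[order(r)]))
out
#      [,1] [,2] [,3] [,4]
# [1,] "A"  "B"  "C"  "D"
# [2,] "B"  "A"  "C"  "D"
# [3,] "A"  "C"  "B"  "D"
# [4,] "A"  "C"  "B"  "D"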
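To see why this works, it can help to look at the intermediate objects. The following is only a sketch that reruns the same base R steps on the same vector and inspects one row of the grid; the variable r is introduced here purely for illustration.

```r
vec <- c("A", "B", "A", "C", "B", "D")

# For each distinct element, the positions where it occurs in vec:
# A -> 1 3, B -> 2 5, C -> 4, D -> 6
combs <- lapply(setNames(nm = unique(vec)), function(a) which(vec == a))

# One row per way of keeping exactly one occurrence of every element.
eg <- do.call(expand.grid, combs)
print(eg)
#   A B C D
# 1 1 2 4 6
# 2 3 2 4 6
# 3 1 5 4 6
# 4 3 5 4 6

# Take the second row (illustrative): A kept at position 3, B at position 2.
r <- unlist(eg[2, ])

# Sorting the elements by their kept positions gives the sub-string's order.
names(eg)[order(r)]
# [1] "B" "A" "C" "D"
```

Rows 3 and 4 keep B at position 5, after C, so both produce A C B D regardless of which A is kept. That is why the result contains that sub-string twice.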
With the first vector, the same steps give:
vec <- c("A","B","A","C","D")
# ...
#      [,1] [,2] [,3] [,4]
# [1,] "A"  "B"  "C"  "D"
# [2,] "B"  "A"  "C"  "D"
If you are starting and ending with strings rather than vectors, you can wrap the above with:
strsplit("ABACBD", "")[[1]]
# [1] "A" "B" "A" "C" "B" "D"
apply(out, 1, paste, collapse = "")
# [1] "ABCD" "BACD" "ACBD" "ACBD"
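Putting the pieces together, one possible wrapper might look like the sketch below. The function name split_repeats and its dedupe argument are hypothetical and not part of the answer above; the sketch assumes a non-empty input string in which every element is a single character. Whether identical results should be collapsed depends on how the question is read, so dedupe is off by default.

```r
# Hypothetical helper: name, signature and the dedupe option are illustrative.
# Assumes a non-empty string whose elements are single characters.
split_repeats <- function(s, dedupe = FALSE) {
  vec <- strsplit(s, "")[[1]]
  # positions of each distinct element
  combs <- lapply(setNames(nm = unique(vec)), function(a) which(vec == a))
  # all ways of keeping one occurrence per element
  eg <- do.call(expand.grid, combs)
  # rebuild each sub-string by ordering elements by their kept positions
  res <- apply(eg, 1, function(r) paste(names(eg)[order(r)], collapse = ""))
  res <- unname(res)
  if (dedupe) unique(res) else res
}

split_repeats("ABACD")
# [1] "ABCD" "BACD"
split_repeats("ABACBD")
# [1] "ABCD" "BACD" "ACBD" "ACBD"
split_repeats("ABACBD", dedupe = TRUE)
# [1] "ABCD" "BACD" "ACBD"
```

Note that expand.grid produces one row per combination of kept positions, so the number of rows is the product of the occurrence counts and can grow quickly for long strings with many repeats.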