R: Creating a list of divided vectors by index

With the following example code:

set.seed(11)
x <- sample(letters)
x
 [1] "h" "a" "m" "y" "b" "u" "v" "f" "p" "c" "q" "g" "x" "l" "i" "o" "e" "r" "t" "d" "z" "k" "s" "j" "w" "n"

and given this vector of values:

y <- c(4, 13, 20)

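I want to split the vector x at the 'slicing' indices given in y, so that each group ends at one of those positions, and collect the pieces in a list. Desired output: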

z <- list(c("h", "a", "m", "y"),
          c("b", "u", "v", "f", "p", "c", "q", "g", "x"),
          c("l", "i", "o", "e", "r", "t", "d"),
          c("z", "k", "s", "j", "w", "n"))
z
[[1]]
[1] "h" "a" "m" "y"

[[2]]
[1] "b" "u" "v" "f" "p" "c" "q" "g" "x"

[[3]]
[1] "l" "i" "o" "e" "r" "t" "d"

[[4]]
[1] "z" "k" "s" "j" "w" "n"

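We can create a vector of zeros ('v1') with the same length as 'x', use y + 1 as a numeric index to set those elements of v1 to 1 (each marks the first element of a new group), take the cumsum of the result, and use that as the grouping vector to split 'x':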

v1 <- numeric(length(x))
v1[y+1] <- 1
split(x,cumsum(v1))
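To make the intermediate grouping vector visible, here is a minimal sketch on a smaller input. The names x_small, y_small and grp are made up for illustration and are not part of the question's data.

```r
# Assumed toy data: ten letters, split after positions 3 and 7
x_small <- letters[1:10]
y_small <- c(3, 7)

v1 <- numeric(length(x_small))   # all zeros
v1[y_small + 1] <- 1             # mark the first element of each new group
grp <- cumsum(v1)                # running count of group starts
grp
# [1] 0 0 0 1 1 1 1 2 2 2

res <- split(x_small, grp)       # list named "0", "1", "2"
```

Every element keeps the group number of the most recent marker before or at its position, which is why the markers go at y + 1 rather than at y.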

Or we can get the grouping vector by taking the cumsum of the output of tabulate:

split(x, cumsum(tabulate(y + 1, length(x))))

Or using match:

split(x,cumsum(c(TRUE,!is.na(match(seq_along(x), y)[-length(x)]))))

Or with %in%:

split(x, cumsum(seq_along(x) %in% (y + 1)))
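As a quick sanity check, the sketch below compares the cumsum-based variants on an assumed stand-in vector (plain letters instead of the shuffled sample, so it does not depend on the random seed). The helper sizes and the g_* names are hypothetical.

```r
x <- letters              # assumed stand-in for the shuffled vector
y <- c(4, 13, 20)

v1 <- numeric(length(x)); v1[y + 1] <- 1
g_index    <- cumsum(v1)
g_tabulate <- cumsum(tabulate(y + 1, length(x)))
g_match    <- cumsum(c(TRUE, !is.na(match(seq_along(x), y)[-length(x)])))
g_in       <- cumsum(seq_along(x) %in% (y + 1))

# Hypothetical helper: group sizes, ignoring the list names
sizes <- function(g) unname(lengths(split(x, g)))
sizes(g_index)
# [1] 4 9 7 6

# Same pieces once the names are dropped
same <- identical(unname(split(x, g_index)), unname(split(x, g_match)))
```

The only visible difference is the naming: the match variant starts counting at 1, so its list is named "1" to "4", while the others are named "0" to "3".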

Just for fun, another way to create the splitting vector is to use cut:

split(x, cut(seq_along(x), c(-Inf, y, Inf)))
# $`(-Inf,4]`
# [1] "h" "a" "m" "y"
# 
# $`(4,13]`
# [1] "b" "u" "v" "f" "p" "c" "q" "g" "x"
# 
# $`(13,20]`
# [1] "l" "i" "o" "e" "r" "t" "d"
# 
# $`(20, Inf]`
# [1] "z" "k" "s" "j" "w" "n"

It even tells you which interval each group of data belongs to :-)
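If the interval labels are not wanted, for instance to match the unnamed list in the desired output, one option is to ask cut for integer codes or to drop the names afterwards. A minimal sketch, again using plain letters as an assumed stand-in for x:

```r
x <- letters              # assumed stand-in for the shuffled vector
y <- c(4, 13, 20)

# labels = FALSE makes cut() return integer bin codes instead of a factor
res <- split(x, cut(seq_along(x), c(-Inf, y, Inf), labels = FALSE))
names(res)
# [1] "1" "2" "3" "4"

# unname() removes the names entirely
res_plain <- unname(res)
```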


By extension, this also means that findInterval works:

split(x, findInterval(seq_along(x), y+1))

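In both cases, we are looking at which bin each value from 1 to the length of the input vector "x" falls into, where the bin endpoints are defined by "y".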
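For reuse, the findInterval approach could be wrapped in a small function. This is only a sketch: split_at is a hypothetical name, and sorting and de-duplicating y is an added assumption, since findInterval expects its break points in non-decreasing order.

```r
# Hypothetical helper: split x after each position listed in y
split_at <- function(x, y) {
  breaks <- sort(unique(y)) + 1          # first index of each new group
  unname(split(x, findInterval(seq_along(x), breaks)))
}

z <- split_at(letters, c(4, 13, 20))
lengths(z)
# [1] 4 9 7 6

# The order of y does not matter because of the sort()
z2 <- split_at(letters, c(20, 4, 13))
```

Positions in y that fall outside 1 to length(x) - 1 simply produce no extra group, because split only keeps the bins that actually occur.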