R: Creating a list of divided vectors by index
Using code like the following example:
set.seed(11)
x <- sample(letters)
x
[1] "h" "a" "m" "y" "b" "u" "v" "f" "p" "c" "q" "g" "x" "l" "i" "o" "e" "r" "t" "d" "z" "k" "s" "j" "w" "n"
and given this vector of values
y <- c(4, 13, 20)
I would like to split the x vector using the values in the y vector as 'slicing' indices, and group the results into a list. Desired output:
z <- list(c("h", "a", "m", "y"),c("b", "u", "v", "f", "p", "c", "q", "g", "x"), c("l", "i", "o", "e", "r", "t", "d"), c("z", "k", "s", "j", "w", "n") )
z
[[1]]
[1] "h" "a" "m" "y"
[[2]]
[1] "b" "u" "v" "f" "p" "c" "q" "g" "x"
[[3]]
[1] "l" "i" "o" "e" "r" "t" "d"
[[4]]
[1] "z" "k" "s" "j" "w" "n"
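We can create a vector of 0s (v1) with the same length as 'x', use y + 1 as a numeric index to replace the corresponding elements of v1 with 1, take the cumsum of the result, and use that as the grouping vector to split 'x'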
v1 <- numeric(length(x))
v1[y+1] <- 1
split(x,cumsum(v1))
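To make the intermediate steps visible, here is a minimal sketch of the same approach with the grouping vector stored in a variable. The name grp is only illustrative, and x is typed in literally from the printed output above so that the result does not depend on the random number generator. It assumes every value in y is smaller than length(x); otherwise v1[y + 1] would extend v1 beyond the length of x.

```r
# x copied literally from the printed output in the question
x <- c("h", "a", "m", "y", "b", "u", "v", "f", "p", "c", "q", "g", "x",
       "l", "i", "o", "e", "r", "t", "d", "z", "k", "s", "j", "w", "n")
y <- c(4, 13, 20)

v1 <- numeric(length(x))  # 26 zeros
v1[y + 1] <- 1            # mark the first element of each new group (5, 14, 21)
grp <- cumsum(v1)         # 0 for elements 1-4, then 1, 2, 3 (illustrative name)

table(grp)                # group sizes: 4, 9, 7, 6
res <- split(x, grp)      # list with names "0", "1", "2", "3"
unname(res)               # drop the names to get an unnamed list like z
```

The list returned by split is named after the group values, so unname is only needed if the result should look exactly like the unnamed list z.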
Or we can get the grouping vector by taking the cumsum of the result of tabulate
split(x,cumsum(tabulate(y+1, length(x))))
Or use match
split(x,cumsum(c(TRUE,!is.na(match(seq_along(x), y)[-length(x)]))))
Or %in%
split(x,cumsum(seq_along(x) %in% (y+1)))
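As a quick sanity check, the four variants can be compared against the desired output z. This is only a sketch: the list names (by_cumsum and so on) and the helper variable same are made up for illustration, and x is again typed in literally rather than sampled.

```r
x <- c("h", "a", "m", "y", "b", "u", "v", "f", "p", "c", "q", "g", "x",
       "l", "i", "o", "e", "r", "t", "d", "z", "k", "s", "j", "w", "n")
y <- c(4, 13, 20)
z <- list(c("h", "a", "m", "y"),
          c("b", "u", "v", "f", "p", "c", "q", "g", "x"),
          c("l", "i", "o", "e", "r", "t", "d"),
          c("z", "k", "s", "j", "w", "n"))

v1 <- numeric(length(x))
v1[y + 1] <- 1

# One entry per approach; the entry names are illustrative only
results <- list(
  by_cumsum   = split(x, cumsum(v1)),
  by_tabulate = split(x, cumsum(tabulate(y + 1, length(x)))),
  by_match    = split(x, cumsum(c(TRUE, !is.na(match(seq_along(x), y)[-length(x)])))),
  by_in       = split(x, cumsum(seq_along(x) %in% (y + 1)))
)

# The group labels differ (0-3 versus 1-4), so compare without names
same <- vapply(results, function(r) identical(unname(r), z), logical(1))
same
```

The match variant numbers its groups from 1 because of the leading TRUE, while the others start at 0; the split pieces themselves are the same.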
Just for fun, another way to create the splitting vector is to use cut:
split(x, cut(seq_along(x), c(-Inf, y, Inf)))
# $`(-Inf,4]`
# [1] "h" "a" "m" "y"
#
# $`(4,13]`
# [1] "b" "u" "v" "f" "p" "c" "q" "g" "x"
#
# $`(13,20]`
# [1] "l" "i" "o" "e" "r" "t" "d"
#
# $`(20, Inf]`
# [1] "z" "k" "s" "j" "w" "n"
It even tells you which group the data belong to :-)
By extension, this means findInterval also works:
split(x, findInterval(seq_along(x), y+1))
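In both cases, we are looking at which bin each value from 1 to the length of the input vector "x" falls into, where the bin endpoints are defined by "y". cut uses intervals that are closed on the right by default, so "y" can be used directly; findInterval uses intervals that are closed on the left by default, which is why it is given y+1.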
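The correspondence between the two binning functions can be sketched as follows. The variable names idx, by_cut, by_fi and by_fi_open are illustrative, and x is typed in literally from the question. The last line uses the left.open argument of findInterval as an alternative to shifting the breakpoints by 1; treat it as an option to verify on your R version rather than as part of the original answer.

```r
x <- c("h", "a", "m", "y", "b", "u", "v", "f", "p", "c", "q", "g", "x",
       "l", "i", "o", "e", "r", "t", "d", "z", "k", "s", "j", "w", "n")
y <- c(4, 13, 20)
idx <- seq_along(x)

# cut: right-closed intervals (a, b], so each value of y ends a bin
by_cut <- cut(idx, c(-Inf, y, Inf))

# findInterval: left-closed intervals [a, b), so each bin starts at y + 1
by_fi <- findInterval(idx, y + 1)

# Alternative (assumption: left.open is available in your R version)
by_fi_open <- findInterval(idx, y, left.open = TRUE)

# Cross-tabulate the two groupings; each cut label pairs with one integer
table(by_cut, by_fi)
```

cut returns a factor with interval labels, while findInterval returns plain integers, so the choice mostly depends on whether labeled list names are wanted in the result of split.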