How to get group id of sliding windows but only after completion of earliest started window

Question

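Let me get straight to it, since my explanation may be a bit complicated.

Let us assume I have a vector of (forward-looking) sliding window sizes, i.e. run lengths.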

xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)
xx
[1] 3 2 1 4 4 3 3 1 2 1 2 3 4 3 2 1

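The first element is 3, which means its (forward-looking) window has length 3, and so on.
I want to assign the same group number, say 1, to the first three elements (because the first element is 3).
Now I want to skip the second and third elements, because they are already covered by the first window and have therefore been assigned the same group number, i.e. 1.
Then pick the 4th element, whose window size is 4, and assign the next four elements (including this one) to another group number, say 2.
Now pick the 8th element (3 + 4 elements are already done) and assign the next unique group number, say 3, to as many elements as its size, which is just 1.
Next pick the 9th element, and so on.
It is guaranteed that the last group will exhaust the vector exactly, or else its size is 1.

My desired output is as follows: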

c(1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6)

Answer 1

An option with an ugly while loop:

xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)
#Initialise output vector
yy <- integer(length(xx))
#Assign the 1st group
yy[1:xx[1]] <- 1
#Set the current position
i <- xx[1] + 1
#Initialise the group number
group <- 2

#While some elements have not yet been assigned a group
while(any(yy == 0)) {
  #Assign the next group number
  yy[i:(i+xx[i] - 1)] <- group
  #Increment the group number
  group <- group + 1
  #Increment the current position.
  i <- i+xx[i]
}
yy

#[1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
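
The loop above relies on the question's guarantee that the last window ends exactly at the end of the vector. As a minimal sketch, the same idea can be wrapped in a function that clamps the final window to the vector length. The name window_groups and the clamping behaviour are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper (illustrative name): same walk as the while loop,
# but the last window is clamped so it never extends past the vector.
window_groups <- function(x) {
  stopifnot(all(x >= 1))  # assumption: every window size is at least 1
  n <- length(x)
  ids <- integer(n)
  i <- 1L
  group <- 1L
  while (i <= n) {
    # Last index covered by the window that starts at position i
    end <- min(i + x[i] - 1L, n)
    ids[i:end] <- group
    group <- group + 1L
    # Jump to the first position after the finished window
    i <- end + 1L
  }
  ids
}

xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)
window_groups(xx)
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
```

With an input such as c(2L, 1L, 5L), where the last window would overrun, this version returns 1 1 2 instead of growing the result.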

Answer 2

Here is a possible approach using Rcpp:

Rcpp::cppFunction("
IntegerVector decode_rle(IntegerVector x) {
    const int n = x.size();
    IntegerVector res(n);
    int cnt = 0;
    int rle = x[0];
    int gcnt = 1;
    for(int i = 0; i < n; i ++){
        cnt++;
        if(cnt <= rle){
            res[i] = gcnt;
        }else{
            rle = x[i];
            cnt = 1;
            res[i] = ++gcnt;
        }
    }
    return res;
}")

xx <- c(3, 2, 1, 4, 4, 3, 3, 1, 2, 1, 2, 3, 4, 3, 2, 1)
decode_rle(xx)
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6

Answer 3

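You can use Reduce and return the cumulative index at which you jump to the next window. as.factor and as.integer are then used to turn these indices into the numbers 1, 2, 3, ...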

as.integer(as.factor(Reduce(function(i, j) if(i > j) i else i + xx[i+1],
 seq_len(length(xx)-1), xx[1], accumulate = TRUE)))
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6

Another option is to use a recursive function.

f <- function(i) {
  if(i >= length(xx)) length(xx)
  else c(i, f(i + xx[i + 1]))
}

x <- diff(f(0))
rep(seq_along(x), x)
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
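
Both variants above boil down to finding where each window starts or ends. As a rough sketch of that idea without recursion, the window start positions can be collected in a loop and then mapped to group ids with findInterval. The variable names starts and group_ids are illustrative, and the loop assumes every window size is at least 1.

```r
xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)

# Collect the start index of every window by jumping window by window.
# Assumption: all sizes are >= 1, otherwise this loop would not advance.
starts <- integer(0)
i <- 1L
while (i <= length(xx)) {
  starts <- c(starts, i)
  i <- i + xx[i]
}
starts
# [1]  1  4  8  9 11 13

# Each position belongs to the last window that started at or before it.
group_ids <- findInterval(seq_along(xx), starts)
group_ids
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
```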

Answer 4

Taking a cue from @GKi's wonderful answer, I translated it to purrr::accumulate:

library(purrr)
accumulate(seq_len(length(xx)-1), .init = xx[1], ~ifelse(.x > .y, .x, .x + xx[.x +1]))

[1]  3  3  3  7  7  7  7  8 10 10 12 12 16 16 16 16
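
The vector printed above holds the end index of each element's window rather than the group ids themselves. One possible way to finish, sketched here in base R, is to number the distinct end indices in order of appearance with match and unique. The vector ends below is simply the printed output copied by hand, and the names ends and group_ids are illustrative.

```r
# End index of the window each element falls into (copied from the output above)
ends <- c(3, 3, 3, 7, 7, 7, 7, 8, 10, 10, 12, 12, 16, 16, 16, 16)

# Number the distinct end indices in order of first appearance
group_ids <- match(ends, unique(ends))
group_ids
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
```

Because the end indices never decrease, this gives the same numbering as as.integer(as.factor(ends)).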

Answer 5

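Although it may not be what you had in mind at first, I finally found a way to get the desired output through recursion, which is one of my favorite programming techniques. I also tried to keep my approach as concise as possible. I hope you like it: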

xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)

i <- 1
out <- c()
fn <- function(x) {
  out <<- c(out, rep(i, x[1]))
  x <- x[-(1:x[1])]
  if(length(x) != 0) {
    i <<- i + 1
  } else {
    return(out)
  }
  fn(x)
}

fn(xx)
[1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
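
The function above keeps its state in the global variables i and out, so both have to be reset before it is called again. As a sketch of the same recursion without global state, the group counter can be passed as an argument and the result built from the return values. The name group_rec and the clamping of the last window are assumptions added for illustration.

```r
# Hypothetical variant (illustrative name): recursion without <<-.
# Assumption: x is non-empty and every window size is at least 1.
group_rec <- function(x, i = 1L) {
  # Size of the current window, clamped to what is left of the vector
  n <- min(x[1], length(x))
  ids <- rep(i, n)
  # Drop the elements covered by the current window
  rest <- x[-seq_len(n)]
  if (length(rest) == 0) ids else c(ids, group_rec(rest, i + 1L))
}

xx <- c(3L, 2L, 1L, 4L, 4L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 3L, 2L, 1L)
group_rec(xx)
# [1] 1 1 1 2 2 2 2 3 4 4 5 5 6 6 6 6
```

For very long vectors with many small windows, R's recursion depth limit could become a concern, in which case a loop-based version may be the safer choice.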

r

rolling-computation