How to match third (or whatever) from the bottom in a rolling fashion in R?

Here is my sample data frame, with the expected output.

df <- data.frame(index=c("3435pear","3435grape","3435apple","3435avocado","3435orange","3435kiwi","3436grapefruit","3436apple","3436banana","3436grape","3437apple","3437grape","3437avocado","3437orange","3438apple","3439apple","3440apple"),output=c("na","na","na","na","na","na","na","na","na","na","na","na","na","na","3435apple","3436apple","3437apple"))

                index    output
1        3435pear        na
2       3435grape        na
3       3435apple        na
4     3435avocado        na
5      3435orange        na
6        3435kiwi        na
7  3436grapefruit        na
8       3436apple        na
9      3436banana        na
10      3436grape        na
11      3437apple        na
12      3437grape        na
13    3437avocado        na
14     3437orange        na
15      3438apple 3435apple
16      3439apple 3436apple
17      3440apple 3437apple

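I want to match the third-from-last occurrence of each fruit. If there are not three earlier occurrences of that fruit, it should return NA. Once the 4th apple appears, it matches the apple three occurrences before it; when the 5th apple appears, it matches the apple three before that one, and so on.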

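I tried to do this with rollapply, match, and tail, but I couldn't work out how to refer to the current row when matching. In Excel I would use the LARGE, IF, and ROW functions. Excel made my computer grind for hours to calculate everything, and I know R can do it in minutes (seconds?).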

You could do something like this:

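library(dplyr)

df %>%
  # strip the digits to leave just the fruit name
  mutate(fruit = gsub("[0-9]", "", index)) %>%
  group_by(fruit) %>%
  # within each fruit, take the index from three occurrences earlier
  mutate(new_output = lag(index, 3)) %>%
  # ungroup first: select() on a grouped data frame keeps the grouping column
  ungroup() %>%
  select(-fruit)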

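Within each fruit group, new_output gives you the index value lagged by 3 (NA when fewer than three earlier rows exist for that fruit). I kept the output column and stored my result in new_output so that you can compare the two.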
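
If you would rather not load dplyr, the same grouped lag can be sketched in base R with ave(). This is only an illustration under a few assumptions: lag_n is a hypothetical helper written for this example (it is not a base R function), and the data frame is rebuilt here with only the index column so the snippet runs on its own.

```r
# Rebuild the example data (index column only, taken from the question).
df <- data.frame(
  index = c("3435pear", "3435grape", "3435apple", "3435avocado", "3435orange",
            "3435kiwi", "3436grapefruit", "3436apple", "3436banana", "3436grape",
            "3437apple", "3437grape", "3437avocado", "3437orange",
            "3438apple", "3439apple", "3440apple"),
  stringsAsFactors = FALSE
)

# Strip the digits to get the fruit name used for grouping.
fruit <- gsub("[0-9]", "", df$index)

# Hypothetical helper: shift a vector down by n positions, padding with NA.
lag_n <- function(x, n = 3) c(rep(NA, n), x)[seq_along(x)]

# ave() applies lag_n within each fruit group and keeps the original row order.
df$new_output <- ave(df$index, fruit, FUN = lag_n)

df
```

Because ave() writes the per-group results back into their original positions, no sorting or merging is needed afterwards. To match a different offset (say, the fifth from the bottom), wrap the helper, for example FUN = function(x) lag_n(x, 5).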