How to match the third (or whatever) from the bottom in a rolling fashion in R?
Here is my sample data frame with the expected output.
df <- data.frame(
  index = c("3435pear", "3435grape", "3435apple", "3435avocado", "3435orange", "3435kiwi",
            "3436grapefruit", "3436apple", "3436banana", "3436grape",
            "3437apple", "3437grape", "3437avocado", "3437orange",
            "3438apple", "3439apple", "3440apple"),
  output = c("na", "na", "na", "na", "na", "na", "na", "na", "na", "na", "na", "na", "na", "na",
             "3435apple", "3436apple", "3437apple")
)
index output
1 3435pear na
2 3435grape na
3 3435apple na
4 3435avocado na
5 3435orange na
6 3435kiwi na
7 3436grapefruit na
8 3436apple na
9 3436banana na
10 3436grape na
11 3437apple na
12 3437grape na
13 3437avocado na
14 3437orange na
15 3438apple 3435apple
16 3439apple 3436apple
17 3440apple 3437apple
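I want to match the third fruit from the bottom. If there aren't three earlier occurrences of the fruit, it should return NA. Once the 4th apple appears, it is matched with the apple 3 before it; then the 5th apple appears and is matched with the apple 3 before it, and so on.
I tried to do this with rollapply, match, and tail, but I couldn't figure out how to reference the current row for the match. In Excel I would use the LARGE, IF, and ROW functions. Excel had my computer grinding for hours to calculate everything, and I know R can do it in minutes (seconds?).
You can do it like this: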
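library(dplyr)

df %>%
  # strip the digits to get the fruit name, e.g. "3435apple" -> "apple"
  mutate(fruit = gsub("[0-9]", "", index)) %>%
  group_by(fruit) %>%
  # within each fruit, take the index value from 3 rows earlier (NA if none)
  mutate(new_output = lag(index, 3)) %>%
  # ungroup before dropping the helper column; select() keeps grouping columns
  ungroup() %>%
  select(-fruit)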
Within each fruit group, new_output gives you the index value lagged by 3 rows. I kept the output column and saved my result in new_output so you can compare them.
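For comparison, here is a minimal base R sketch of the same grouped lag, in case dplyr is not available. It assumes the fruit name is whatever remains after stripping the digits from index, as in the pipeline above. The helper lag_n is a hypothetical name introduced only for this illustration, not a function from any package.

```r
# Example data from the question (only the index column is needed here)
df <- data.frame(
  index = c("3435pear", "3435grape", "3435apple", "3435avocado", "3435orange", "3435kiwi",
            "3436grapefruit", "3436apple", "3436banana", "3436grape",
            "3437apple", "3437grape", "3437avocado", "3437orange",
            "3438apple", "3439apple", "3440apple"),
  stringsAsFactors = FALSE
)

# Strip the digits to get the fruit name, e.g. "3435apple" -> "apple"
fruit <- gsub("[0-9]", "", df$index)

# Hypothetical helper: shift a character vector down by n positions, padding with NA
lag_n <- function(x, n) {
  if (length(x) <= n) return(rep(NA_character_, length(x)))
  c(rep(NA_character_, n), head(x, -n))
}

# ave() applies the function within each fruit group and keeps the original row order
df$new_output <- ave(df$index, fruit, FUN = function(x) lag_n(x, 3))

df$new_output[15:17]
# [1] "3435apple" "3436apple" "3437apple"
```

Rows 1 to 14 come back as NA because no fruit has three earlier occurrences at those points. Changing the 3 in the lag_n call gives the second, fourth, or any other earlier match.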