Removing group of rows with sequence in dplyr chain

Question

I am trying to remove some rows from a data.frame. The rows in question make up every 3rd group of my data.frame. Here is an example df:

set.seed(1)

AC <- rep(rep(c(78,110),each=1),times=18)
state <- rep(rep(c("Group 1","Group 2"),3),each=12)
V <- rep(seq(100,400,100),times=9)
R = sort(replicate(9, sample(5000:6000,4)))
df <- data.frame(AC,V,R,state)

head(df)

   AC   V    R   state
1  78 100 5001 Group 1
2 110 200 5054 Group 1
3  78 300 5064 Group 1
4 110 400 5069 Group 1
5  78 100 5117 Group 1
6 110 200 5123 Group 1

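The rows are ordered by the changes in column V, which cycles through the sequence 100:400. So there are 3 groups of rows within each run of state, and I want to remove the 3rd group in each run.
I thought adding a No column might also be useful for removing this 3rd group. Since the example I provide here is already grouped, I only need to add the new No column and remove the 3rd, 6th, 9th, ... groups of the data.frame.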

I would like to do this inside a dplyr chain, because I process my real data in a dplyr chain. But any other function that does the job is welcome.

The output I am looking for:

   No  AC   V    R   state
    1  78 100 5001 Group 1
    1 110 200 5008 Group 1
    1  78 300 5022 Group 1
    1 110 400 5055 Group 1
    2  78 100 5133 Group 1
    2 110 200 5163 Group 1
    2  78 300 5187 Group 1
    2 110 400 5189 Group 1
    4  78 100 5459 Group 2
    4 110 200 5467 Group 2
    4  78 300 5471 Group 2
    4 110 400 5501 Group 2
    5  78 100 5515 Group 2
    5 110 200 5531 Group 2
    5  78 300 5540 Group 2
    5 110 400 5553 Group 2
    7  78 100 5686 Group 1
    7 110 200 5717 Group 1
    7  78 300 5726 Group 1
    7 110 400 5755 Group 1
   ***********************

Answer 1

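It is a bit hard to follow your question, because when I copy your reproducible example, my data frame is not equal to yours. But as far as I understand, you simply want to number each group of 4 rows (according to V) and remove every third group.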

In that case, try:

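library(dplyr)

df %>% 
  mutate(No = cumsum(V == 100)) %>%  # a new block starts wherever V is 100
  filter(No %% 3 != 0)               # drop every third block

The mutate line uses V == 100 to mark the "start of a new block of Vs", so that the cumulative sum assigns a number to each block.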
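To see the numbering on its own, here is a minimal sketch in base R on a hypothetical toy vector (not your data). Comparing V to 100 is TRUE exactly at the first row of each block, so the running sum of those TRUEs is the block number.

```r
# Hypothetical toy vector: three blocks of V = 100, 200, 300, 400
V <- rep(seq(100, 400, 100), times = 3)

# TRUE at the first row of each block, so the running sum numbers the blocks
No <- cumsum(V == 100)
print(No)
# [1] 1 1 1 1 2 2 2 2 3 3 3 3

# Keep only the rows whose block number is not a multiple of 3
keep <- No %% 3 != 0
print(V[keep])
# [1] 100 200 300 400 100 200 300 400
```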

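In response to your comment on the question: I did not use the state column here (in your reproducible example there are 9 rows for each (state, V) pair, not the 3 stated in the question...).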

Note that this assumes df is already sorted as in your question above (V in the order (100, 200, 300, 400), and state alternating every 12 rows).
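If V did not reliably restart at 100, the block number could instead be derived from the row position. The following is only a sketch under the assumption that every block has exactly 4 rows and the rows are already in order; the toy data frame and its name are made up for illustration and are not your data.

```r
library(dplyr)

# Hypothetical toy data: 6 blocks of 4 rows, state alternating every 12 rows
toy <- data.frame(
  V     = rep(seq(100, 400, 100), times = 6),
  state = rep(c("Group 1", "Group 2"), each = 12)
)

result <- toy %>%
  # Assumption: each block is exactly 4 consecutive rows
  mutate(No = (row_number() - 1) %/% 4 + 1) %>%
  # Drop blocks 3, 6, 9, ...
  filter(No %% 3 != 0)

print(unique(result$No))
# [1] 1 2 4 5
```

On data sorted as described above, this gives the same block numbers as the cumsum approach; it just swaps the dependence on the value 100 for a dependence on the fixed block length.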


r

dataframe

window-functions

dplyr