Mutate according to conditions on multiple rows in R

Question

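How can I evaluate a condition across multiple rows in dplyr? I have a dataset, and I want to mutate it based on a condition (a transition) that occurs over several time periods.

As in the example below, if a person has passed through a bad status at any point, they must be considered bad overall. I tried mutate_if but it did not work, or perhaps I did not understand the syntax.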

df <-
  data.frame(ID = c(1,1,1,2,2,2,3,3,3),
             Date= c(1,2,3,1,2,3,1,2,3),
             Money = c(500,400,500,100,100,100,200,300,300), 
             Status = c("Good", "Bad", "Good", "Good","Good","Good", "Bad","Good","Good"))

Could you give me a solution that produces the following result? If possible I would prefer to stay within dplyr, although I know data.table can do some nice things.

result <- 
  data.frame(ID = c(1,1,1,2,2,2,3,3,3), 
                     Date= c(1,2,3,1,2,3,1,2,3),
                     Money = c(500,400,500,100,100,100,200,300,300),
                     Status = c("Good", "Bad", "Good", "Good","Good","Good", "Bad","Good","Good"),
                     Status_overall = c("Bad", "Bad", "Bad", "Good","Good","Good", "Bad","Bad","Bad"))

Answer 1

You can return 'Bad' if any Status is 'Bad' within an ID.

library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(Status_overall = if(any(Status == 'Bad')) 'Bad' else 'Good')
  #Without if/else
  #mutate(Status_overall = c('Good', 'Bad')[any(Status == 'Bad') + 1])

#    ID  Date Money Status Status_overall
#  <dbl> <dbl> <dbl> <chr>  <chr>         
#1     1     1   500 Good   Bad           
#2     1     2   400 Bad    Bad           
#3     1     3   500 Good   Bad           
#4     2     1   100 Good   Good          
#5     2     2   100 Good   Good          
#6     2     3   100 Good   Good          
#7     3     1   200 Bad    Bad           
#8     3     2   300 Good   Bad           
#9     3     3   300 Good   Bad
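
One caveat with the any() form is worth sketching. The example data has no missing values, so the following assumes, purely for illustration, that Status could contain NA. In that case Status == 'Bad' yields NA for those rows, any() may return NA, and if (NA) stops with an error. Passing na.rm = TRUE, or using %in%, avoids this. The vector `status` below is hypothetical and not part of the question's data.

```r
# Hypothetical group with a missing status (not in the original data)
status <- c("Good", NA, "Good")

# The comparison propagates NA, so any() cannot decide between TRUE and FALSE
has_bad_raw <- any(status == "Bad")

# Dropping NAs before reducing gives a definite answer
has_bad_narm <- any(status == "Bad", na.rm = TRUE)

# %in% never returns NA, so it is safe to use directly inside if()
has_bad_in <- "Bad" %in% status

overall <- if (has_bad_narm) "Bad" else "Good"
```

Whether a missing status should count as 'Good' is a judgment call that depends on the data; the sketch only shows how to keep the if() from failing.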

This can be written in base R and data.table as:

df$Status_overall <- with(df, ifelse(ave(Status == 'Bad', ID, FUN = any), 'Bad', 'Good'))

library(data.table)
setDT(df)[, Status_overall := if(any(Status == 'Bad')) 'Bad' else 'Good', ID]
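
As a quick sanity check, the base R version can be compared against the Status_overall column the question asks for. This is a minimal sketch using only base R; the names `expected` and `has_bad` are introduced here for illustration.

```r
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Date = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Money = c(500, 400, 500, 100, 100, 100, 200, 300, 300),
  Status = c("Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Good", "Good")
)

# Status_overall as given in the question's desired result
expected <- c("Bad", "Bad", "Bad", "Good", "Good", "Good", "Bad", "Bad", "Bad")

# ave() applies any() within each ID and returns one value per row
has_bad <- ave(df$Status == "Bad", df$ID, FUN = any)
df$Status_overall <- ifelse(has_bad, "Bad", "Good")

# Stops with an error if the computed column differs from the desired one
stopifnot(identical(df$Status_overall, expected))
```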

Answer 2

Does this work:

library(dplyr)
df %>% group_by(ID) %>% 
       mutate(Status_overall = case_when('Bad' %in% Status ~ 'Bad', TRUE ~ 'Good'))
# A tibble: 9 x 5
# Groups:   ID [3]
     ID  Date Money Status Status_overall
  <dbl> <dbl> <dbl> <chr>  <chr>         
1     1     1   500 Good   Bad           
2     1     2   400 Bad    Bad           
3     1     3   500 Good   Bad           
4     2     1   100 Good   Good          
5     2     2   100 Good   Good          
6     2     3   100 Good   Good          
7     3     1   200 Bad    Bad           
8     3     2   300 Good   Bad           
9     3     3   300 Good   Bad

Answer 3

We can use if and %in%:

library(dplyr)
df %>%
      group_by(ID) %>%
      mutate(Status_overall = if('Bad' %in% Status) 'Bad' else 'Good')
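
Note that a grouped mutate() returns a tibble that is still grouped by ID (the "# Groups: ID [3]" line in the printed output of Answer 2 shows this). If later steps should operate on the whole table, it may help to drop the grouping afterwards. A minimal sketch, assuming dplyr is installed; the name `out` is introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Date = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Money = c(500, 400, 500, 100, 100, 100, 200, 300, 300),
  Status = c("Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Good", "Good")
)

out <- df %>%
  group_by(ID) %>%
  # One scalar per group, recycled to every row of that group
  mutate(Status_overall = if ("Bad" %in% Status) "Bad" else "Good") %>%
  # Drop the grouping so later verbs operate on the whole table
  ungroup()
```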
