Use group_by with mutate, case_when, any() and all() in R

I have a data frame, status_df, with an ID and a status for each stage:

id stage status
15 1 Pending
15 2 Not Sent
16 1 Approved
16 2 Rejected
16 3 Not Sent
16 4 Not Sent
20 1 Approved
20 2 Approved
20 3 Approved

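I am trying to group by id and apply the following logic: if any status for an id is "Pending", the final status is "Pending"; otherwise, if any status is "Rejected", it is "Rejected"; and if all statuses are "Approved", it is "Approved".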

This is what I am trying (it does not work):

final_status_df = status_df %>% select(id, status) %>% group_by(id) %>%
mutate(final_status = case_when(any(status)=="Pending" ~ "Pending",
any(status)=="Rejected" ~ "Rejected", 
all(status)=="Approved" ~ "Approved"))

Expected output (final_status_df):

id final_status
15 Pending
16 Rejected
20 Approved

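We can use summarise instead of mutate, because mutate returns an output column of the same length as the input column and is meant for creating or modifying columns rather than summarising.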
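
To make that difference concrete, here is a minimal sketch that runs both verbs on the question's data and compares row counts. The data frame is rebuilt inline with data.frame() so the snippet runs on its own, and the names with_mutate, with_summarise and has_pending are illustrative only.

```r
library(dplyr)

# Rebuild the question's data inline so the snippet is self-contained
status_df <- data.frame(
  id = c(15L, 15L, 16L, 16L, 16L, 16L, 20L, 20L, 20L),
  stage = c(1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L),
  status = c("Pending", "Not Sent", "Approved", "Rejected", "Not Sent",
             "Not Sent", "Approved", "Approved", "Approved")
)

# mutate() keeps every input row and repeats the per-group value
with_mutate <- status_df %>%
  group_by(id) %>%
  mutate(has_pending = any(status == "Pending")) %>%
  ungroup()

# summarise() collapses each group to a single row
with_summarise <- status_df %>%
  group_by(id) %>%
  summarise(has_pending = any(status == "Pending"), .groups = "drop")

nrow(with_mutate)     # 9, one row per original row
nrow(with_summarise)  # 3, one row per id
```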

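Also, a simpler option is to convert status to a factor with the levels specified in a custom order, drop the unused levels (droplevels), and select the first of the remaining levels after grouping by 'id'.
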
library(dplyr)
status_df %>%
    group_by(id) %>%
    summarise(final_status = first(levels(droplevels(factor(status, 
          levels = c("Pending", "Rejected", "Approved"))))), .groups = 'drop')

-output

# A tibble: 3 x 2
#     id final_status
#  <int> <chr>       
#1    15 Pending     
#2    16 Rejected    
#3    20 Approved    
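
To see why the factor approach works, the steps can be traced in base R for a single id. This is only a sketch: the vector s holds the four statuses of id 16 from the question, and the variable names are illustrative.

```r
# Statuses for id 16, copied from the question's data
s <- c("Approved", "Rejected", "Not Sent", "Not Sent")

# Levels are listed in priority order; values not listed ("Not Sent") become NA
f <- factor(s, levels = c("Pending", "Rejected", "Approved"))

# droplevels() removes levels that do not occur in this group ("Pending")
remaining <- levels(droplevels(f))
remaining      # "Rejected" "Approved"

# The first remaining level is the highest-priority status present
remaining[1]   # "Rejected"
```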

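In the OP's code, any(status) returns NA, because any() coerces the character vector to logical; it should instead be wrapped around a logical vector, i.e. any(status == "Pending"). Also, as mentioned above, it should be summarise rather than mutate.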
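
A quick base R check illustrates this. The vector below is the status column for id 15 from the question; suppressWarnings() is used only to hide the coercion warning that any() emits for a character argument, and the names bad and good are illustrative.

```r
status <- c("Pending", "Not Sent")

# any() on a character vector coerces to logical first, giving NA
bad <- suppressWarnings(any(status))
bad                 # NA

# NA == "Pending" is still NA, so the case_when() condition never matches
bad == "Pending"    # NA

# Compare first, then reduce the resulting logical vector
good <- any(status == "Pending")
good                # TRUE
```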

data

status_df <- structure(list(id = c(15L, 15L, 16L, 16L, 16L, 16L, 20L, 20L, 
20L), stage = c(1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L), status = c("Pending", 
"Not Sent", "Approved", "Rejected", "Not Sent", "Not Sent", "Approved", 
"Approved", "Approved")), class = "data.frame", row.names = c(NA, 
-9L))

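Your attempt is on the right track; however, you closed the any/all parentheses too early, before the comparison (==). Also, since you need only one row per id, you can use summarise instead of mutate, which also removes the need for select.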

library(dplyr)

status_df %>% 
  group_by(id) %>%
  summarise(final_status = case_when(any(status == "Pending") ~ "Pending",
                                     any(status == "Rejected") ~ "Rejected", 
                                     all(status == "Approved") ~ "Approved"))

#    id final_status
#* <int> <chr>       
#1    15 Pending     
#2    16 Rejected    
#3    20 Approved
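
One thing that may be worth checking on your own data is what happens when none of the case_when() conditions holds. The sketch below uses a hypothetical id 21 (not part of the question's data) whose statuses are "Approved" and "Not Sent", and compares the case_when() approach with the factor-based one; the object names are illustrative.

```r
library(dplyr)

# Hypothetical group: not part of the question's data
extra_df <- data.frame(
  id = c(21L, 21L),
  status = c("Approved", "Not Sent")
)

# case_when(): no condition is TRUE for this group, so the result is NA
by_case_when <- extra_df %>%
  group_by(id) %>%
  summarise(final_status = case_when(
    any(status == "Pending") ~ "Pending",
    any(status == "Rejected") ~ "Rejected",
    all(status == "Approved") ~ "Approved"
  ), .groups = "drop")

# Factor approach: "Not Sent" is not a level, so "Approved" is returned
by_factor <- extra_df %>%
  group_by(id) %>%
  summarise(final_status = first(levels(droplevels(factor(
    status, levels = c("Pending", "Rejected", "Approved")
  )))), .groups = "drop")

by_case_when$final_status  # NA
by_factor$final_status     # "Approved"
```

If NA is not the result you want for such groups, a final fallback branch such as TRUE ~ "Other" (the label is just an example) could be added to case_when().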