Use case_when across columns to create a new column
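I have a large dataset with many status columns. I want to create a new column that holds each participant's current status. I am trying to use case_when in dplyr, but I am not sure how to apply it across columns. The dataset has too many columns for me to type each one out. Here is a sample of the data: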
library(dplyr)
problem <- tibble(name = c("sally", "jane", "austin", "mike"),
status1 = c("registered", "completed", "registered", "no action"),
status2 = c("completed", "completed", "registered", "no action"),
status3 = c("completed", "completed", "withdrawn", "no action"),
status4 = c("withdrawn", "completed", "no action", "registered"))
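For the code, I want a new column that gives each participant's final status; however, if their status was ever completed, I want it to say completed, regardless of what their final status is. For this data, the answer would look like this: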
answer <- tibble(name = c("sally", "jane", "austin", "mike"),
status1 = c("registered", "completed", "registered", "no action"),
status2 = c("completed", "completed", "registered", "no action"),
status3 = c("completed", "completed", "withdrawn", "no action"),
status4 = c("withdrawn", "completed", "no action", "registered"),
finalstatus = c("completed", "completed", "no action", "registered"))
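Also, I would appreciate any explanation of your code! It would be especially helpful if your solution could also use contains("status"), because in my real dataset the status columns are very messy (e.g. summary_status_5292019, sum_status_07012018, etc.).
Thanks!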
One option is pmap:
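library(tidyverse)
problem %>%
  # pmap_chr() walks the selected status columns one row at a time;
  # inside the lambda, c(...) is that row's statuses as a character vector
  mutate(finalstatus = pmap_chr(select(., starts_with('status')), ~
    case_when(
      any(c(...) == "completed") ~ "completed",  # ever completed wins
      any(c(...) == "withdrawn") ~ "no action",
      TRUE ~ "registered")))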
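On the contains("status") part of the question: the same pmap idea should carry over to messy column names, since the selection helper only decides which columns get passed in. Below is a minimal sketch, not a definitive implementation. The column names are hypothetical (two are modelled on the ones mentioned in the question, status_final_check is invented for illustration), and classify_row is a helper name made up here, not part of any package.

```r
library(dplyr)
library(purrr)

# Hypothetical data with irregular status column names (assumption)
messy <- tibble(
  name = c("sally", "jane", "austin", "mike"),
  sum_status_07012018    = c("registered", "completed", "registered", "no action"),
  summary_status_5292019 = c("completed", "completed", "withdrawn", "no action"),
  status_final_check     = c("withdrawn", "completed", "no action", "registered")
)

# Hypothetical helper: receives one row's statuses as separate arguments
classify_row <- function(...) {
  statuses <- c(...)  # collect the row into a single character vector
  case_when(
    any(statuses == "completed") ~ "completed",  # first matching rule wins
    any(statuses == "withdrawn") ~ "no action",
    TRUE ~ "registered"
  )
}

# contains("status") picks every column whose name includes "status"
status_cols <- select(messy, contains("status"))

result <- messy %>%
  mutate(finalstatus = pmap_chr(status_cols, classify_row))
```

Writing the lambda as a named function instead of `~` makes it easier to test one row by hand, e.g. classify_row("registered", "withdrawn").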
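Below is a function that performs this kind of "row matching". As with case_when, you can put the checks vector in a specific order, so that once a match is found in the data for one element, e.g. 'completed', matches for later elements are not considered.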
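row_match <- function(data, checks, labels){
  # position of each cell's value in `checks` (NA if the value is not listed)
  matches <- match(unlist(data), checks)
  # unlist() goes column by column, so this restores the rows x columns shape
  dim(matches) <- dim(data)
  # the smallest position in each row is the highest-priority match
  labels[apply(matches, 1, min, na.rm = TRUE)]
}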
problem %>%
  mutate(final.stat = row_match(
    data = select(., starts_with('status')),
    checks = c('completed', 'withdrawn', 'registered'),
    labels = c('completed', 'no action', 'registered')))
# # A tibble: 4 x 6
# name status1 status2 status3 status4 final.stat
# <chr> <chr> <chr> <chr> <chr> <chr>
# 1 sally registered completed completed withdrawn completed
# 2 jane completed completed completed completed completed
# 3 austin registered registered withdrawn no action no action
# 4 mike no action no action no action registered registered
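If the rule in the question is taken literally (completed if the participant was ever completed, otherwise whatever sits in the last status column), a vectorised version without row-wise iteration may be enough. This is only a sketch. It assumes the status columns already appear in chronological order in the data frame, which contains("status") does not guarantee for names like summary_status_5292019, so the columns may need reordering first.

```r
library(dplyr)

problem <- tibble(
  name = c("sally", "jane", "austin", "mike"),
  status1 = c("registered", "completed", "registered", "no action"),
  status2 = c("completed", "completed", "registered", "no action"),
  status3 = c("completed", "completed", "withdrawn", "no action"),
  status4 = c("withdrawn", "completed", "no action", "registered")
)

# Every column whose name contains "status", in data-frame order
status_cols <- select(problem, contains("status"))

# TRUE for rows where at least one status column equals "completed"
ever_completed <- rowSums(status_cols == "completed", na.rm = TRUE) > 0

# Assumption: the right-most status column is the most recent one
last_status <- status_cols[[ncol(status_cols)]]

result <- problem %>%
  mutate(finalstatus = if_else(ever_completed, "completed", last_status))
```

On the sample data this gives the same finalstatus column as the answer tibble in the question.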