dplyr mutate stringr str_detect with multiple conditional arguments and corresponding output

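I want to mutate a string in different ways depending on its format. This example has 2 formats, distinguished by the punctuation they contain. Each element of the vector contains a specific word that is uniquely associated with one format.

I have tried several approaches with ifelse and case_when, but none gives the result I want, which is to keep the last part of the string (the "Keep" part).

I am trying to stick to simple verbs and am not proficient with regex. I am open to any suggestions for an efficient, general approach.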

library(dplyr)
library(stringr)
df <- data.frame(KPI = c("xxxxx.x...Alpha...Keep.1",
                         "xxxxx.x...Alpha..Keep.2",
                         "Bravo...Keep3",
                         "Bravo...Keep4",
                         "xxxxx...Charlie...Keep.5",
                         "xxxxx...Charlie...Keep.6"))

dot3dot3split <- function(x) strsplit(x, "...", fixed = TRUE)[[1]][3]
dot3dot3split("xxxxx.x...Alpha...Keep.1") # returns as expected
# [1] "Keep.1"

dot3split <- function(x) strsplit(x, "...", fixed = TRUE)[[1]][2]
dot3split("Bravo...Keep3") # returns as expected
# [1] "Keep3"

df1 <- df %>% mutate_if(is.factor, as.character) %>%
        mutate(KPI.v2 = ifelse(str_detect(KPI, paste(c("Alpha", "Charlie"), collapse = '|')), dot3dot3split(KPI), 
                               ifelse(str_detect(KPI, "Bravo"), dot3split(KPI), KPI))) # not working as expected

df1$KPI.v2
# [1] "Keep.1" "Keep.1" "Alpha"  "Alpha"  "Keep.1" "Keep.1"

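The functions you wrote (dot3dot3split and dot3split) are not vectorized. Because they index the strsplit result with [[1]], they only ever look at the first element: if you pass in several strings, you get back a single value computed from the first one. Inside mutate, ifelse then recycles that single value across every row, which is why your output repeats "Keep.1" and "Alpha".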

dot3dot3split(c("xxxxx.x...Alpha...Keep.1", "xxxxx.x...Alpha..Keep.2"))
# [1] "Keep.1" 
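
If you would rather keep the split-based approach, here is a minimal sketch of a vectorized version. It indexes every element of the list that strsplit returns instead of only [[1]]. The helper name nth_piece is hypothetical and not from the question.

```r
# Hypothetical helper: split each string on a literal "..." and return the
# n-th piece of every element (NA where an element has fewer than n pieces).
nth_piece <- function(x, n) {
  pieces <- strsplit(x, "...", fixed = TRUE)
  vapply(pieces, function(p) p[n], character(1))
}

nth_piece(c("xxxxx.x...Alpha...Keep.1", "Bravo...Keep3"), 3)
# [1] "Keep.1" NA
nth_piece(c("xxxxx.x...Alpha...Keep.1", "Bravo...Keep3"), 2)
# [1] "Alpha" "Keep3"
```

A helper like this returns one value per row, so it can be used inside ifelse or case_when. Note that an element such as "xxxxx.x...Alpha..Keep.2", which has only two dots before the last part, has no third piece and would come back as NA.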

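Since you are already using stringr, I suggest using str_extract to pull out the part you want. It is vectorized, so you do not need ifelse or your own vectorized helper functions. The pattern "[A-Za-z]*$" below matches the run of letters at the end of each string.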

df <- data.frame(KPI = c("xxxxx.x...Alpha...apples",
                         "xxxxx.x...Alpha..bananas",
                         "Bravo...oranges",
                         "Bravo...grapes",
                         "xxxxx...Charlie...cherries",
                         "xxxxx...Charlie...guavas"))

library(dplyr)
library(stringr)

df1 <- df %>%
  mutate_if(is.factor, as.character) %>%
  mutate(KPI.v2 = str_extract(KPI, "[A-Za-z]*$"))
df1
#                          KPI   KPI.v2
# 1   xxxxx.x...Alpha...apples   apples
# 2   xxxxx.x...Alpha..bananas  bananas
# 3            Bravo...oranges  oranges
# 4             Bravo...grapes   grapes
# 5 xxxxx...Charlie...cherries cherries
# 6   xxxxx...Charlie...guavas   guavas
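
The pattern above only matches trailing letters, so for labels like "Keep.1" or "Keep3" from the question, which end in a digit, it would return an empty string. As a sketch for that case, and assuming the part to keep always follows the last run of two or more dots, you could remove everything up to that separator with str_remove instead:

```r
library(stringr)

KPI <- c("xxxxx.x...Alpha...Keep.1",
         "xxxxx.x...Alpha..Keep.2",
         "Bravo...Keep3",
         "xxxxx...Charlie...Keep.5")

# The greedy ".*" runs to the last place where at least two dots still
# follow, so everything through that final separator is removed.
kept <- str_remove(KPI, ".*\\.{2,}")
kept
# [1] "Keep.1" "Keep.2" "Keep3"  "Keep.5"
```

This handles both the two-dot and three-dot separators without needing to detect "Alpha", "Bravo", or "Charlie" first.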