Conditional Mutate Based on Results of Coalesce in dplyr

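I've been teaching myself R from scratch, mostly by doing things, then reading posts here, and then working by trial and error from there. Sometimes I hit a wall and reach out.

I've hit a wall. I have dplyr 0.7 installed. I have a tibble with a column, call it contract_key, that I added by applying mutate(coalesce()) to three other columns in the tibble. Here is sample data: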

library(dplyr)

# The eighth product is a true missing value (NA), not the string "NA"
product <- c("655393265191","655393265191","168145850127","168145850127","350468621217","350468621217","977939797847",NA,"928893912852")
supplier <- c("person5","person3","person10","person5","person11","person5","person11","person14","person5")
vendor <- c("org2","org3","org3","org2","org1","org2","org1","org5","org2")
quantity <- c(7,5,6,1,2,1,18,2,2)
gross <- c(0.0419,0.0193,0.0439,0.0069,0.0027,0.0055,0.0233,NA,0.0004)

df <- data_frame(product,supplier,vendor,quantity,gross)

This is how I generated contract_key:

df <- df %>% 
  mutate(contract_key = coalesce(product,supplier,vendor))

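I now want to add another column that classifies the contents of contract_key according to which of the three columns supplied the value (via coalesce()). So if contract_key = "person5", for example, the new column contract_level would be "supplier". contract_key = "org2" would map to contract_level = "vendor", and so on.

Ultimately, I'll use contract_level as the join variable to another tibble.

I'm stumped. I tried if_else, and I found out that I shouldn't try case_when (because it's inside mutate()). I also tried nested if_else, to no avail.

This is probably basic R syntax that I don't know, something to do with dot notation. If someone offers an answer, I'll work backwards until I figure out what you did. (And I'll have learned a new lesson in R!)

Thanks!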

How about this:

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = case_when(contract_key %in% product ~ "product",
                                         contract_key %in% supplier ~ "supplier",
                                         contract_key %in% vendor ~ "vendor",
                                         TRUE ~ "none"))

       product supplier vendor quantity  gross contract_key contract_level
1 655393265191  person5   org2        7 0.0419 655393265191        product
2 655393265191  person3   org3        5 0.0193 655393265191        product
3 168145850127 person10   org3        6 0.0439 168145850127        product
4 168145850127  person5   org2        1 0.0069 168145850127        product
5 350468621217 person11   org1        2 0.0027 350468621217        product
6 350468621217  person5   org2        1 0.0055 350468621217        product
7 977939797847 person11   org1       18 0.0233 977939797847        product
8         <NA> person14   org5        2     NA     person14       supplier
9 928893912852  person5   org2        2 0.0004 928893912852        product
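One caveat with the `%in%` approach: it tests membership against the whole column rather than the value in the same row, so a key that occurs in more than one column can be labelled with the wrong source. The sketch below uses hypothetical toy data (the `toy` tibble and the `by_membership` / `by_row` column names are made up for illustration) to compare it with a row-wise `is.na()` check that follows the same order `coalesce()` uses.

```r
library(dplyr)

# Hypothetical toy data: "A1" is a product in row 1 but a supplier in row 2.
toy <- tibble(
  product  = c("A1", NA),
  supplier = c("s1", "A1"),
  vendor   = c("v1", "v2")
)

res <- toy %>%
  mutate(
    contract_key = coalesce(product, supplier, vendor),
    # Whole-column membership: row 2 matches because "A1" occurs in row 1.
    by_membership = case_when(
      contract_key %in% product  ~ "product",
      contract_key %in% supplier ~ "supplier",
      contract_key %in% vendor   ~ "vendor",
      TRUE                       ~ "none"
    ),
    # Row-wise check, in the same order that coalesce() fills the key.
    by_row = case_when(
      !is.na(product)  ~ "product",
      !is.na(supplier) ~ "supplier",
      !is.na(vendor)   ~ "vendor",
      TRUE             ~ "none"
    )
  )

res$by_membership  # "product" "product"
res$by_row         # "product" "supplier"
```

With the sample data from the question the two versions agree, since no value appears in more than one column there.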

Other options that need less code:

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = if_else(!is.na(product), 'product', 
                                       if_else(!is.na(supplier), 'supplier', 'vendor')))

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = apply(., 1, function(x) names(.)[min(which(!is.na(x)))]))
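Note that the `apply(., 1, ...)` version scans every column of the piped data, so it relies on product, supplier and vendor coming first, and it warns when a row has no non-missing value. A possible variant, sketched here under the assumption that only those three columns should be considered (`key_cols`, `first_non_na` and `toy` are illustrative names, not part of dplyr), restricts the scan to the key columns:

```r
library(dplyr)

# Columns that feed coalesce(), in priority order (assumed from the question).
key_cols <- c("product", "supplier", "vendor")

# Hypothetical toy data with one extra non-key column.
toy <- tibble(
  product  = c("655393265191", NA, NA),
  supplier = c("person5", "person14", NA),
  vendor   = c("org2", "org5", "org1"),
  quantity = c(7, 2, 1)
)

# Name of the first key column that is not NA in a row (NA if all are missing).
first_non_na <- function(x) key_cols[which(!is.na(x))[1]]

res <- toy %>%
  mutate(
    contract_key   = coalesce(product, supplier, vendor),
    contract_level = apply(toy[key_cols], 1, first_non_na)
  )

res$contract_level  # "product" "supplier" "vendor"
```

Because the helper indexes `key_cols` directly, reordering or adding unrelated columns in the tibble does not change the result.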