Mutate a new character column based on another character column in an Oracle table using R dbplyr

I have an Oracle table with a column COMPLAINT_REASON:

complaints_tbl %>% head() %>% select(COMPLAINT_REASON)

# Source:   lazy query [?? x 1]
# Database: Oracle 12.01.0020[user@user_db/]
  COMPLAINT_REASON         
  <chr>                    
1 Payment Related          
2 Bill Related          
3 Order Management          
4 Repair/Connection related
5 Broadband
6 Product fault   

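I am trying to create a new column called primary_reason with different values, i.e. if COMPLAINT_REASON = Payment Related then primary_reason should contain Payments. If none of the mappings match, primary_reason should keep the COMPLAINT_REASON value as is.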

Normally I would do something like this with data.table:

complaints_tbl <- complaints_tbl[,primary_reason := forcats::fct_recode(COMPLAINT_REASON,
    "Payments"    = "Payment Related",
    "Billing"    = "Bill Related",
    "Orders"    = "Order Management",
    "Billing"    = "Billing/Payment Enquiry")]

As you can see, values that are not listed are carried over to primary_reason unchanged (Product fault, Broadband, Repair/Connection related), while Payment Related becomes Payments in primary_reason, and so on.

I tried:

complaints_tbl %>% mutate(primary_reason = forcats::fct_recode(COMPLAINT_REASON,
    "Payments"    = "Payment Related",
    "Billing"    = "Bill Related",
    "Orders"    = "Order Management",
    "Billing"    = "Billing/Payment Enquiry"))

But I get this error:

Error in check_factor(.f) : object 'COMPLAINT_REASON' not found

Finally, it would be best to push the new column back to the existing table in Oracle for future use.

Any pointers? Cheers

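I work with SQL Server rather than Oracle, but I think the challenge here is making sure dbplyr can translate your commands into the database's language, whichever database you use.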

In general, dbplyr struggles to translate commands from outside the dplyr/tidyverse set. That is why forcats::fct_recode does not work for you.
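One way to see what dbplyr does with an expression, without touching the database, is to ask it for the SQL directly. The sketch below assumes the dbplyr package is installed and uses its simulate_oracle() backend as a stand-in for a real connection. The exact SQL text may differ between dbplyr versions.

```r
library(dbplyr)

# Simulated Oracle backend: generates SQL only, no real connection required.
con <- simulate_oracle()

# Base R functions that dbplyr knows about are rewritten as SQL.
upper_sql <- translate_sql(toupper(COMPLAINT_REASON), con = con)
equal_sql <- translate_sql(COMPLAINT_REASON == "Payment Related", con = con)

print(upper_sql)
print(equal_sql)
```

Running the same kind of check on an expression before using it inside mutate() shows whether it ends up as SQL that the database can run.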

Here is an example solution using ifelse, which is translated correctly in my environment:

complaints_tbl %>%
  # create column for ease of changes
  mutate(primary_reason = NA) %>%
  # one mutate per match/rename
  mutate(primary_reason = ifelse(COMPLAINT_REASON == "Payment Related",
                                 yes = "Payments", no = primary_reason)) %>%
  mutate(primary_reason = ifelse(COMPLAINT_REASON == "Bill Related",
                                 yes = "Billing", no = primary_reason)) %>%
  # if none are matched
  mutate(primary_reason = ifelse(is.na(primary_reason),
                                 yes = COMPLAINT_REASON, no = primary_reason))

Instead of mutating several times with ifelse, you can use case_when.
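A minimal sketch of the case_when approach, using the mappings from the question. It assumes dplyr and dbplyr are installed; complaints_sim is a hypothetical stand-in built with lazy_frame() and simulate_oracle() so the generated SQL can be inspected without a live connection. With a real connection, the same mutate() call would be applied to complaints_tbl.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the remote table: it only generates SQL.
complaints_sim <- lazy_frame(
  COMPLAINT_REASON = character(),
  con = simulate_oracle()
)

recoded <- complaints_sim %>%
  mutate(primary_reason = case_when(
    COMPLAINT_REASON == "Payment Related" ~ "Payments",
    COMPLAINT_REASON %in% c("Bill Related", "Billing/Payment Enquiry") ~ "Billing",
    COMPLAINT_REASON == "Order Management" ~ "Orders",
    # anything not listed keeps its original value
    TRUE ~ COMPLAINT_REASON
  ))

# Print the SQL that would be sent to the database.
show_query(recoded)
```

The final TRUE branch plays the role of the "if none are matched" step in the ifelse version, so unlisted reasons such as Broadband pass through unchanged.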