Mutate a new character column based on another character column in an Oracle table using R dbplyr
I have an Oracle table with a column COMPLAINT_REASON:
complaints_tbl %>% head() %>% select(COMPLAINT_REASON)
# Source: lazy query [?? x 1]
# Database: Oracle 12.01.0020[user@user_db/]
COMPLAINT_REASON
<chr>
1 Payment Related
2 Bill Related
3 Order Management
4 Repair/Connection related
5 Broadband
6 Product fault
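I am trying to create a new column named primary_reason with different values: if COMPLAINT_REASON = Payment Related, then primary_reason should be Payments. If nothing matches, primary_reason should keep the value from the COMPLAINT_REASON column.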
Normally, with data.table, I would do something like this:
complaints_tbl <- complaints_tbl[,primary_reason := forcats::fct_recode(COMPLAINT_REASON,
"Payments" = "Payment Related",
"Billing" = "Bill Related",
"Orders" = "Order Management",
"Billing" = "Billing/Payment Enquiry")]
As you can see, values with no mapping (Product fault, Broadband, Repair/Connection related) go into primary_reason as they are, while Payment Related becomes Payments in primary_reason, and so on.
I tried:
complaints_tbl %>% mutate(primary_reason = forcats::fct_recode(COMPLAINT_REASON,
"Payments" = "Payment Related",
"Billing" = "Bill Related",
"Orders" = "Order Management",
"Billing" = "Billing/Payment Enquiry"))
But I get an error:
Error in check_factor(.f) : object 'COMPLAINT_REASON' not found
Finally, it would be good to push the new column back to the existing table in Oracle for future use.
Any pointers?
Cheers
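I work with SQL Server rather than Oracle, but I think the challenge here is making sure dbplyr can translate your commands into the database's language, whichever database you use.
In general, dbplyr can only translate functions for which it knows a SQL equivalent, and it struggles with most commands from outside dplyr. That is why forcats::fct_recode does not work for you.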
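One way to check whether an expression translates is to preview the generated SQL without touching the database. The sketch below assumes a simulated Oracle backend built with dbplyr's lazy_frame() and simulate_oracle(); the name complaints_sim is made up for illustration and stands in for the real complaints_tbl.

```r
library(dplyr)
library(dbplyr)

# Simulated Oracle table: no connection is needed, it only exists to preview SQL.
# complaints_sim is a hypothetical stand-in for the real complaints_tbl.
complaints_sim <- lazy_frame(COMPLAINT_REASON = character(), con = simulate_oracle())

# ifelse() has a known SQL translation, so dbplyr can render it as a CASE expression.
translated <- complaints_sim %>%
  mutate(primary_reason = ifelse(COMPLAINT_REASON == "Payment Related",
                                 "Payments", COMPLAINT_REASON))

# Print the SQL that would be sent to the database.
show_query(translated)
```

Running show_query() on the real complaints_tbl pipeline works the same way and shows what Oracle would actually receive.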
An example solution using ifelse, which translated correctly in my environment:
complaints_tbl %>%
  # create the column up front so later steps can fall back to it
  mutate(primary_reason = NA) %>%
  # one mutate per match/rename
  mutate(primary_reason = ifelse(COMPLAINT_REASON == "Payment Related",
                                 yes = "Payments", no = primary_reason)) %>%
  mutate(primary_reason = ifelse(COMPLAINT_REASON == "Bill Related",
                                 yes = "Billing", no = primary_reason)) %>%
  # if none are matched, keep the original value
  mutate(primary_reason = ifelse(is.na(primary_reason),
                                 yes = COMPLAINT_REASON, no = primary_reason))
Instead of multiple mutates with ifelse, you can use case_when.
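A minimal sketch of what that could look like, using the mappings from the question. It again runs against a simulated Oracle backend (complaints_sim is a hypothetical name, not the real table), so the generated SQL can be inspected before trying it on complaints_tbl. The final TRUE branch keeps the original value when nothing matches.

```r
library(dplyr)
library(dbplyr)

# Simulated Oracle table (an assumption for illustration; swap in the real complaints_tbl).
complaints_sim <- lazy_frame(COMPLAINT_REASON = character(), con = simulate_oracle())

recoded <- complaints_sim %>%
  mutate(primary_reason = case_when(
    COMPLAINT_REASON == "Payment Related" ~ "Payments",
    COMPLAINT_REASON %in% c("Bill Related", "Billing/Payment Enquiry") ~ "Billing",
    COMPLAINT_REASON == "Order Management" ~ "Orders",
    # fallback: keep the original reason when no mapping applies
    TRUE ~ COMPLAINT_REASON
  ))

# Print the SQL that dbplyr generates for the recode.
show_query(recoded)
```

Because everything sits in one case_when, adding another mapping is a single extra line rather than another mutate.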