Pass column names to dplyr::coalesce() when writing a custom function
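I am trying to write a function that wraps dplyr::coalesce() and takes a data object plus the names of the columns to coalesce. So far, all of my attempts have failed.
Example data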
library(dplyr)
df <-
  data.frame(col_a = c("bob", NA, "bob", NA, "bob"),
             col_b = c(NA, "danny", NA, NA, NA),
             col_c = c("paul", NA, NA, "paul", NA))
##   col_a col_b col_c
## 1   bob  <NA>  paul
## 2  <NA> danny  <NA>
## 3   bob  <NA>  <NA>
## 4  <NA>  <NA>  paul
## 5   bob  <NA>  <NA>
My attempts at writing the custom function
coalesce_plus_1 <- function(data, vars) {
  data %>%
    mutate(coalesced_col = coalesce(!!! rlang::syms(tidyselect::vars_select(names(.), vars))))
}
coalesce_plus_2 <- function(data, vars) {
  data %>%
    mutate(coalesced_col = coalesce(!!! rlang::syms(vars)))
}
coalesce_plus_3 <- function(data, vars) {
  data %>%
    mutate(coalesced_col = coalesce({{ vars }}))
}
Results...
coalesce_plus_1()
df %>%
coalesce_plus_1(data = ., vars = c(col_a, col_b, col_c))
Error: object 'col_a' not found.
But:
df %>%
coalesce_plus_1(data = ., vars = all_of(starts_with("col")))
##   col_a col_b col_c coalesced_col
## 1   bob  <NA>  paul           bob
## 2  <NA> danny  <NA>         danny
## 3   bob  <NA>  <NA>           bob
## 4  <NA>  <NA>  paul          paul
## 5   bob  <NA>  <NA>           bob
coalesce_plus_2()
df %>%
coalesce_plus_2(data = ., vars = c(col_a, col_b, col_c))
Error in lapply(.x, .f, ...) : object 'col_a' not found
And also:
df %>%
coalesce_plus_2(data = ., vars = all_of(starts_with("col")))
Error: starts_with() must be used within a selecting function.
i See https://tidyselect.r-lib.org/reference/faq-selection-context.html.
Run rlang::last_error() to see where the error occurred.
coalesce_plus_3()
df %>%
coalesce_plus_3(data = ., vars = c(col_a, col_b, col_c))
Error: Problem with mutate() input coalesced_col.
x Input coalesced_col can't be recycled to size 5.
i Input coalesced_col is coalesce(c(col_a, col_b, col_c)).
i Input coalesced_col must be size 5 or 1, not 15.
And also:
df %>%
coalesce_plus_3(data = ., vars = all_of(starts_with("col")))
Error: Problem with mutate() input coalesced_col.
x starts_with() must be used within a selecting function.
i See https://tidyselect.r-lib.org/reference/faq-selection-context.html.
i Input coalesced_col is coalesce(all_of(starts_with("col"))).
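For reference, the "size 5 or 1, not 15" part of the first coalesce_plus_3() error can be reproduced outside the function. The sketch below assumes the example data from above (rebuilt here so the block runs on its own) and uses with() as a stand-in for the way mutate() evaluates column names; stacked is a name introduced only for illustration.

```r
library(dplyr)

df <- data.frame(
  col_a = c("bob", NA, "bob", NA, "bob"),
  col_b = c(NA, "danny", NA, NA, NA),
  col_c = c("paul", NA, NA, "paul", NA),
  stringsAsFactors = FALSE
)

# In a data-masking context such as mutate(), c() is plain concatenation,
# not a column selection: three columns of 5 values become one vector.
stacked <- with(df, c(col_a, col_b, col_c))
length(stacked)  # 15

# coalesce() receives that single vector and has nothing to fill it from,
# so the result keeps all 15 elements and cannot be recycled to 5 rows.
n_out <- length(coalesce(stacked))
```

This suggests the {{ vars }} expression has to be resolved by a selecting function such as select() before the columns are handed to coalesce().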
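Bottom line
How can I write a custom function for coalesce() that takes a data object and the columns to coalesce, and that accepts both explicit names such as c(col_a, col_b, col_c) and helper functions such as starts_with("col") in the function's vars argument?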
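Here is a simple implementation. As first written it returned only the selected columns, but it is fairly easy to extend it to keep all columns (I bind_cols them back on at the end).
It is simple because we rely on select to do the work for us, as suggested at the beginning of the Implementing tidyselect vignette.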
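# edited to keep all columns
coalesce_df = function(data, ...) {
  data %>%
    select(...) %>%
    # do.call() replaces the original invoke(), which comes from purrr (not
    # loaded here) and is deprecated in recent purrr releases
    transmute(result = do.call(coalesce, .)) %>%
    bind_cols(data, .)
}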
df %>%
  coalesce_df(everything())
#   col_a col_b col_c result
# 1   bob  <NA>  paul    bob
# 2  <NA> danny  <NA>  danny
# 3   bob  <NA>  <NA>    bob
# 4  <NA>  <NA>  paul   paul
# 5   bob  <NA>  <NA>    bob
df %>% coalesce_df(col_a, col_b)
#   col_a col_b col_c result
# 1   bob  <NA>  paul    bob
# 2  <NA> danny  <NA>  danny
# 3   bob  <NA>  <NA>    bob
# 4  <NA>  <NA>  paul   <NA>
# 5   bob  <NA>  <NA>    bob
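If a single vars argument is preferred over ..., the same idea of letting select() resolve the columns might be sketched as follows. This is only an illustration under stated assumptions: coalesce_cols and picked are hypothetical names, the output column name coalesced_col is borrowed from the question, and the example data is rebuilt so the block runs on its own.

```r
library(dplyr)

df <- data.frame(
  col_a = c("bob", NA, "bob", NA, "bob"),
  col_b = c(NA, "danny", NA, NA, NA),
  col_c = c("paul", NA, NA, "paul", NA),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper: vars may be bare names wrapped in c() or a
# tidyselect helper, because {{ vars }} forwards the caller's expression
# to select(), which is a selecting function.
coalesce_cols <- function(data, vars) {
  picked <- select(data, {{ vars }})
  # Pass the selected columns, in selection order, to coalesce()
  data$coalesced_col <- do.call(coalesce, unname(as.list(picked)))
  data
}

by_name   <- coalesce_cols(df, c(col_a, col_b, col_c))
by_helper <- coalesce_cols(df, starts_with("col"))
```

Both calls should give the same coalesced_col here, since starts_with("col") selects the three columns in the same order as the explicit c(col_a, col_b, col_c).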
Actually, your first function works as long as vars is passed as a character vector. See:
df %>% coalesce_plus_1(data = ., vars = c("col_a","col_b","col_c"))
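To see why a character vector works, it may help to look at what rlang::syms() and !!! produce on their own. The sketch below is illustrative only: vars, built and result are names chosen for this example, and the data is the example df from the question.

```r
library(dplyr)
library(rlang)

df <- data.frame(
  col_a = c("bob", NA, "bob", NA, "bob"),
  col_b = c(NA, "danny", NA, NA, NA),
  col_c = c("paul", NA, NA, "paul", NA),
  stringsAsFactors = FALSE
)

vars <- c("col_a", "col_b", "col_c")

# syms() turns each string into a symbol; !!! splices those symbols in
# as separate arguments, building the call coalesce(col_a, col_b, col_c).
built <- expr(coalesce(!!!syms(vars)))

# Evaluating that call with df as the data mask looks the symbols up
# as columns, which is what happens inside mutate().
result <- eval_tidy(built, data = df)
```

With bare names such as c(col_a, col_b, col_c), R tries to evaluate col_a as an ordinary object before syms() ever sees it, which matches the "object 'col_a' not found" errors in the question.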
Here is another nice option:
library(dplyr)
df <- data.frame(col_a = c("bob", NA, "bob", NA, "bob"),
                 col_b = c(NA, "danny", NA, NA, NA),
                 col_c = c("paul", NA, NA, "paul", NA))
coalesce_plus <- function(data, vars) {
  # all_of() marks vars as a character vector of column names
  x <- as.list(select(data, all_of(vars)))
  data.frame(data, coalesced_col = coalesce(!!!x))
}
df %>% coalesce_plus(data = ., vars = c("col_a", "col_b", "col_c"))
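When column names arrive as a character vector, tidyselect provides all_of() and any_of() to pass them to select(). The short sketch below shows how the two differ when a name is missing; wanted, lenient and strict are hypothetical names, and "col_z" is deliberately not a column of the example data.

```r
library(dplyr)

df <- data.frame(
  col_a = c("bob", NA, "bob", NA, "bob"),
  col_b = c(NA, "danny", NA, NA, NA),
  col_c = c("paul", NA, NA, "paul", NA),
  stringsAsFactors = FALSE
)

# Hypothetical request in which "col_z" does not exist in df
wanted <- c("col_a", "col_z")

# any_of() silently skips names that are not present
lenient <- names(select(df, any_of(wanted)))

# all_of() raises an error as soon as one name is missing
strict <- tryCatch(
  names(select(df, all_of(wanted))),
  error = function(e) "error: missing column"
)
```

For a coalescing wrapper, all_of() is probably the safer default, because a misspelled column name then fails loudly instead of being dropped from the result.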