How to use bracket notation (or an alternative) while programming with dplyr
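I am trying to write a function that calculates toplines (commonly used for polling data). It needs to include both a "percent" column and a "valid percent" column.
Here is an example: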
library(tidyverse)

# prepare some data
d <- gss_cat %>%
  mutate(tvhours2 = tvhours,
         tvhours2 = replace(tvhours2, tvhours > 5, "6-8"),
         tvhours2 = replace(tvhours2, tvhours > 8, "9+"),
         tvhours2 = fct_explicit_na(tvhours2),
         # make a weight variable
         fakeweight = rnorm(n(), mean = 1, sd = .25))
The following function works as far as it goes:
make.topline <- function(variable, data, weight){
  variable <- enquo(variable)
  weight <- enquo(weight)
  table <- data %>%
    # calculate denominator
    mutate(total = sum(!!weight)) %>%
    # calculate proportions
    group_by(!!variable) %>%
    summarise(pct = (sum(!!weight)/first(total))*100,
              n = sum(!!weight))
  table
}

make.topline(variable = tvhours2, data = d, weight = fakeweight)
I am struggling to implement the valid percent column. Here is the syntax I tried:
make.topline2 <- function(variable, data, weight){
  variable <- enquo(variable)
  weight <- enquo(weight)
  table <- data %>%
    # calculate denominator
    mutate(total = sum(!!weight),
           valid.total = sum(!!weight[!!variable != "(Missing)"])) %>%
    # calculate proportions
    group_by(!!variable) %>%
    summarise(pct = (sum(!!weight)/first(total))*100,
              valid.pct = (sum(!!weight)/first(valid.total))*100,
              n = sum(!!weight))
  table
}

make.topline2(variable = tvhours2, data = d, weight = fakeweight)
This produces the following error:
Error: Base operators are not defined for quosures.
Do you need to unquote the quosure?
# Bad:
myquosure != rhs
# Good:
!!myquosure != rhs
Call `rlang::last_error()` to see a backtrace
I know the problem is in this line, but I don't know how to fix it:
mutate(valid.total = sum(!!weight[!!variable != "(Missing)"]))
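You can put parentheses around !!weight. I believe this ensures that the extraction bracket is applied only after weight has been unquoted (so it is an order-of-operations issue).
That line would then look like: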
valid.total = sum((!!weight)[!!variable != "(Missing)"])
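The order-of-operations point can be checked without dplyr by looking at how R parses the two spellings. The following is a minimal sketch using only base R's quote(); weight and cond are placeholder symbols for illustration, not objects from the question.

```r
# quote() captures each expression unevaluated, so `weight` and `cond`
# do not need to exist; we only inspect the parse tree.
bad  <- quote(!!weight[cond])
good <- quote((!!weight)[cond])

# Outermost call of each expression.
bad_outer  <- as.character(bad[[1]])   # "!" : the `!`s wrap the whole subset
good_outer <- as.character(good[[1]])  # "[" : the subset wraps the parenthesised `!!weight`

# What the innermost `!` in the unparenthesised version is applied to.
bad_inner <- deparse(bad[[2]][[2]])    # "weight[cond]"

print(c(bad_outer, good_outer, bad_inner))
```

In the unparenthesised form the bracket binds to the bare weight first, so the unquoting applies to the result of subsetting rather than to weight itself; the parentheses reverse that.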
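Alternatively, you can use the new curly-curly operator ({{), which replaces the enquo()/!! combination in relatively simple cases like yours. Your function would then look like: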
make.topline <- function(variable, data, weight){
  table <- data %>%
    # calculate denominator
    mutate(total = sum({{ weight }}),
           valid.total = sum({{ weight }}[{{ variable }} != "(Missing)"])) %>%
    # calculate proportions
    group_by({{ variable }}) %>%
    summarise(pct = (sum({{ weight }})/first(total))*100,
              valid.pct = (sum({{ weight }})/first(valid.total))*100,
              n = sum({{ weight }}))
  table
}
Like the parentheses solution, this runs without error.
make.topline(variable = tvhours2, data = d, weight = fakeweight)
# A tibble: 9 x 4
  tvhours2    pct valid.pct      n
  <fct>     <dbl>     <dbl>  <dbl>
1 0          3.16      5.98   679.
2 1         10.9      20.6   2342.
3 2         14.1      26.6   3022.
4 3          9.10     17.2   1957.
5 4          6.67     12.6   1432.
6 5          3.24      6.13   696.
7 6-8        4.02      7.61   864.
8 9+         1.67      3.16   358.
9 (Missing) 47.2      89.3  10140.
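Because fakeweight is random, the numbers above are hard to verify by eye. As a rough check that the two approaches agree, here is a small sketch on toy data whose denominators can be computed by hand. The data frame toy, its columns answer and wt, and the helper names valid_total_bang and valid_total_curly are all made up for illustration; it assumes a dplyr/rlang version recent enough to support {{ }}.

```r
library(dplyr)

# Hypothetical toy data: four respondents, one with a missing answer.
toy <- data.frame(
  answer = c("a", "b", "(Missing)", "a"),
  wt     = c(1, 2, 3, 4)
)

# enquo() plus !!, with parentheses so `[` is applied after unquoting.
valid_total_bang <- function(data, variable, weight) {
  variable <- enquo(variable)
  weight <- enquo(weight)
  data %>%
    summarise(total = sum(!!weight),
              valid.total = sum((!!weight)[!!variable != "(Missing)"]))
}

# The same computation written with curly-curly.
valid_total_curly <- function(data, variable, weight) {
  data %>%
    summarise(total = sum({{ weight }}),
              valid.total = sum({{ weight }}[{{ variable }} != "(Missing)"]))
}

res_bang  <- valid_total_bang(toy, answer, wt)
res_curly <- valid_total_curly(toy, answer, wt)
print(res_bang)
print(res_curly)
```

By hand, total is 1 + 2 + 3 + 4 = 10 and valid.total is 1 + 2 + 4 = 7, and both helpers should report exactly those values.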