Adding new columns to a data.table within a function in R
As part of a larger function, I need to create two new columns in a data.table (they are used later to build a plot).
These are my column names:
names(freqSevDataAge)
[1] "ag5" "claims" "exposure"
[7] "severity" "frequency"
I am trying to get this part of the function to work:
testDT <- function(data, xvar, yvar, yvarsec, groupvar, ...){
freqSevDataAge2 <- freqSevDataAge[!claims == 0][, ':=' (scaled = "yvarsec" * max("yvar")/max("yvarsec"),
param = max("yvar")/max("yvarsec"))]
}
testDT(freqSevDataAge, xvar = "ag5", yvar = "severity", yvarsec = "frequency", groupvar = "gender")
The error I get is:
Error in "yvarsec" * max("yvar") : non-numeric argument to binary operator
Edit:
The suggested solution using get() works, but now I have a problem using the newly created columns in ggplot. I get an error:
Error in f(...) : object 'param' not found
I stepped through the function and know that the column param is created; the problem is in calling it in ggplot. How can I ...
getSecPlot <- function(data, xvar, yvar, yvarsec, groupvar, ...){
if ("agegroup" %in% xvar) xvar <- get("agegroup")
data <- data[!claims == 0][, ':=' (scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
param = max(get(yvar))/max(get(yvarsec)))]
param <- unique(param)
sec_plot <- ggplot(data, aes_string (x = xvar, group = groupvar)) +
geom_col(aes_string(y = yvar, fill = groupvar, alpha = 0.5), position = "dodge") +
geom_line(aes(y = scaled, color = gender)) +
scale_y_continuous(sec.axis = sec_axis(~./(param),
name = paste0("average ", yvarsec), labels = function(x) format(x, big.mark = " ", scientific = FALSE))) +
labs(y = paste0("total ", yvar)) +
theme_pubclean()
}
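We can change the function by removing the quotes around the variables inside the function and using get to retrieve the value of the object each string names: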
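library(data.table)
# Column names arrive as strings; get() looks up the column with that name.
testDT <- function(data, xvar, yvar, yvarsec, groupvar, ...){
  # Use the `data` argument instead of the global table.
  # Subsetting returns a new table, so := adds the columns to that copy.
  data[!claims == 0][, `:=`(
    scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
    param = max(get(yvar))/max(get(yvarsec)))][]
}
testDT(freqSevDataAge, xvar = "ag5", yvar = "severity",
       yvarsec = "frequency", groupvar = "gender")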
# ag5 severity frequency gender claims scaled param
#1: 3 8 9 M 1 3.7500000 0.4166667
#2: 6 3 24 F 1 10.0000000 0.4166667
#3: 7 10 17 F 1 7.0833333 0.4166667
#4: 8 8 8 M 1 3.3333333 0.4166667
#5: 10 10 1 M 1 0.4166667 0.4166667
Another option is to convert the strings to symbols with as.symbol and evaluate them with eval.
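The as.symbol/eval variant is only named above, so here is a minimal sketch of what it might look like. The helper name `addScaledSym`, the local names `ySym` and `ySecSym`, and the small `demo` table are assumptions made for illustration; they are not from the original answer.

```r
library(data.table)

# Hypothetical helper: same calculation as the get() version,
# but the column names are turned into symbols and evaluated.
addScaledSym <- function(data, yvar, yvarsec) {
  ySym    <- as.symbol(yvar)      # e.g. the symbol `severity`
  ySecSym <- as.symbol(yvarsec)   # e.g. the symbol `frequency`
  out <- data[!claims == 0]       # subsetting returns a new table
  out[, `:=`(
    scaled = eval(ySecSym) * max(eval(ySym)) / max(eval(ySecSym)),
    param  = max(eval(ySym)) / max(eval(ySecSym)))]
  out[]
}

# Made-up demo data, small enough to check by hand.
demo <- data.table(ag5 = 1:4,
                   severity = c(8, 3, 10, 5),
                   frequency = c(9, 24, 17, 2),
                   claims = c(1, 1, 0, 1))
res <- addScaledSym(demo, yvar = "severity", yvarsec = "frequency")
print(res)
```

Both variants resolve the column at evaluation time; get() is shorter, while symbols can be convenient if the expression is built up programmatically.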
Data
set.seed(24)
freqSevDataAge <- data.table(
  ag5 = 1:10,
  severity = sample(1:10, 10, replace = TRUE),
  frequency = sample(1:24, 10, replace = TRUE),
  gender = sample(c("M", "F"), 10, replace = TRUE),
  claims = sample(0:1, 10, replace = TRUE))
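On the follow-up error from the edit: the thread as captured here does not include an answer to it, so the following is only a sketch under an assumption. In the function from the edit, `param` exists only as a column of `data`, so the bare name in `unique(param)` does not refer to anything defined in the function. Reading the value out of the table first gives an ordinary length-one variable. `getScaleParam` is a hypothetical helper name and the `demo` table is made up; whether this alone resolves the `sec_axis` error may depend on the ggplot2 version.

```r
library(data.table)

# Hypothetical helper: adds the two columns and also returns the
# scaling ratio as a plain scalar for later use outside the table.
getScaleParam <- function(data, yvar, yvarsec) {
  out <- data[!claims == 0]       # subsetting returns a new table
  out[, `:=`(
    scaled = get(yvarsec) * max(get(yvar)) / max(get(yvarsec)),
    param  = max(get(yvar)) / max(get(yvarsec)))]
  # `param` is a column, so read it from the table explicitly.
  param <- unique(out[["param"]])
  list(data = out, param = param)
}

# Made-up demo data.
demo <- data.table(ag5 = 1:4,
                   severity = c(8, 3, 10, 5),
                   frequency = c(9, 24, 17, 2),
                   claims = c(1, 1, 0, 1))
res <- getScaleParam(demo, yvar = "severity", yvarsec = "frequency")
print(res$param)
```

Inside a plotting function, the scalar would then be available as a local variable before the ggplot call is built.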