Adding new columns to a data.table within a function in R

As part of a larger function, I need to create two new columns in a data.table (they are used later to create a plot).

These are my column names:

names(freqSevDataAge)
 [1] "ag5"                           "claims"              "exposure"               
 [7] "severity"             "frequency" 

I am trying to get this part of the function to work:

  testDT <- function(data, xvar, yvar, yvarsec, groupvar, ...){

  freqSevDataAge2 <- freqSevDataAge[!claims == 0][, ':=' (scaled = "yvarsec" * max("yvar")/max("yvarsec"),
                                                          param  = max("yvar")/max("yvarsec"))]
  }

  testDT(freqSevDataAge, xvar = "ag5", yvar = "severity", yvarsec = "frequency", groupvar = "gender")

The error I get is:

Error in "yvarsec" * max("yvar") : non-numeric argument to binary operator

Edit:

The suggested solution using get() works, but now I have a problem using the newly created column in ggplot. I get this error:

Error in f(...) : object 'param' not found

I stepped through the function and I know the column param is created; the problem is referring to it in ggplot. How can I fix this?

getSecPlot <- function(data, xvar, yvar, yvarsec, groupvar, ...){

  if ("agegroup" %in% xvar) xvar <- get("agegroup")

  data <- data[!claims == 0][, ':=' (scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
                                     param  = max(get(yvar))/max(get(yvarsec)))]

param <- unique(param)

  sec_plot <- ggplot(data, aes_string (x = xvar, group = groupvar)) +
      geom_col(aes_string(y = yvar, fill = groupvar, alpha = 0.5), position = "dodge") +
      geom_line(aes(y = scaled,  color = gender)) +
      scale_y_continuous(sec.axis = sec_axis(~./(param),
                                             name = paste0("average ", yvarsec), labels = function(x) format(x, big.mark = " ", scientific = FALSE))) +
      labs(y = paste0("total ", yvar)) +
      theme_pubclean()
  }

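The column names are passed in as character strings, so inside the function "yvarsec" is just a literal string, not the column. We can change the function by removing the quotes around the arguments and using get() to fetch the value of the column that each argument names: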

library(data.table)
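testDT <- function(data, xvar, yvar, yvarsec, groupvar, ...){

  # Use the data argument rather than the global freqSevDataAge.
  # get() looks up the column named by each character argument.
  data[!claims == 0][, ':=' (scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
                             param  = max(get(yvar))/max(get(yvarsec)))][]
}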


testDT(freqSevDataAge, xvar = "ag5", yvar = "severity", 
             yvarsec = "frequency", groupvar = "gender")
#    ag5 severity frequency gender claims     scaled     param
#1:   3        8         9      M      1  3.7500000 0.4166667
#2:   6        3        24      F      1 10.0000000 0.4166667
#3:   7       10        17      F      1  7.0833333 0.4166667
#4:   8        8         8      M      1  3.3333333 0.4166667
#5:  10       10         1      M      1  0.4166667 0.4166667

Another option is to convert the string to a symbol with as.symbol and evaluate it with eval.
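The code for the as.symbol/eval option is not shown here, so the following is only a minimal sketch of what it could look like. The function name testDTSym and the small demoDT table are hypothetical, made up for illustration; the column names follow the question.

```r
library(data.table)

# Hypothetical demo data (not the asker's real data)
demoDT <- data.table(ag5       = 1:4,
                     severity  = c(8, 3, 10, 5),
                     frequency = c(9, 24, 17, 12),
                     gender    = c("M", "F", "F", "M"),
                     claims    = c(1, 1, 0, 1))

# Hypothetical variant of testDT: turn each string into a symbol,
# then evaluate that symbol so it resolves to the column of the same name
testDTSym <- function(data, xvar, yvar, yvarsec, groupvar, ...) {
  data[claims != 0][, ':=' (
    scaled = eval(as.symbol(yvarsec)) * max(eval(as.symbol(yvar))) /
               max(eval(as.symbol(yvarsec))),
    param  = max(eval(as.symbol(yvar))) / max(eval(as.symbol(yvarsec)))
  )][]
}

res <- testDTSym(demoDT, xvar = "ag5", yvar = "severity",
                 yvarsec = "frequency", groupvar = "gender")
print(res)
```

Because the subset data[claims != 0] is a new table, the := call adds the columns to that copy and leaves the input table unchanged.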

Data

set.seed(24)
freqSevDataAge <- data.table(ag5 = 1:10, severity = sample(1:10, 10,
   replace = TRUE), frequency = sample(1:24, 10, replace = TRUE),
   gender = sample(c("M", "F"), 10, replace = TRUE), 
   claims = sample(0:1, 10, replace = TRUE))
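Regarding the follow-up error in the edit (object 'param' not found): param is a column of data, not a variable in the function's environment, so a bare param cannot be found there. One possible fix, sketched below under assumptions, is to compute the ratio as a plain local scalar, which the sec_axis formula can then see. The function name getSecPlotSketch and the demoDT table are hypothetical, the .data pronoun assumes ggplot2 3.0.0 or later, and theme_pubclean() is left out to avoid the extra ggpubr dependency.

```r
library(data.table)
library(ggplot2)

# Hypothetical demo data (not the asker's real data)
demoDT <- data.table(ag5       = rep(1:3, each = 2),
                     gender    = rep(c("M", "F"), times = 3),
                     severity  = c(8, 3, 10, 5, 6, 9),
                     frequency = c(9, 24, 17, 12, 20, 4),
                     claims    = c(1, 1, 1, 1, 0, 1))

getSecPlotSketch <- function(data, xvar, yvar, yvarsec, groupvar, ...) {
  data <- data[claims != 0]

  # Local scalar: visible to the formula passed to sec_axis() below
  param <- max(data[[yvar]]) / max(data[[yvarsec]])
  data[, scaled := get(yvarsec) * param]

  ggplot(data, aes(x = .data[[xvar]], group = .data[[groupvar]])) +
    geom_col(aes(y = .data[[yvar]], fill = .data[[groupvar]]),
             alpha = 0.5, position = "dodge") +
    geom_line(aes(y = scaled, color = .data[[groupvar]])) +
    scale_y_continuous(sec.axis = sec_axis(~ . / param,
                                           name = paste0("average ", yvarsec))) +
    labs(y = paste0("total ", yvar))
}

p <- getSecPlotSketch(demoDT, xvar = "ag5", yvar = "severity",
                      yvarsec = "frequency", groupvar = "gender")
```

If the param column itself is needed later, unique(data$param) would give the same scalar. The point is only that the value has to exist as an ordinary variable where the formula is created.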