Error with outlier removal using DataTables

I am trying to remove outliers using the following code:

dat_outlier = dat
setDT(dat_outlier)

for (j in col_names){
  
  dat_outlier[, (j):= ifelse(!dat_outlier[[j]] %in% boxplot.stats(dat_outlier[[j]])$out,dat_outlier[[j]],NA), by=Comparison]
  
}

However, I get the following error. Could someone help me work out what is causing it and how to fix it?

Error in `[.data.table`(dat_outlier, , `:=`((noquote(j)), ifelse(!dat_outlier[[j]] %in%  : 
  Supplied 62 items to be assigned to group 1 of size 9 in column 'CRP'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

The code was adapted from code mentioned in another question thread.

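The problem is that dat_outlier[[j]] gives you all the values of the column, not just those of the current group (Comparison). That is why the error reports 62 items being assigned to a group of size 9. You can try using lapply over .SD instead, so that each column is processed one group at a time: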

library(data.table)

dat_outlier = dat
setDT(dat_outlier)

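# .SD holds only the rows of the current Comparison group, so x is per-group.
# Values flagged by boxplot.stats() within the group are replaced with NA.
dat_outlier[, (col_names) := lapply(.SD, function(x) ifelse(x %in% boxplot.stats(x)$out, NA, x)),
            by = Comparison, .SDcols = col_names]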
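
To check the approach on something small, here is a minimal sketch using made-up data. The data frame, its values, and the single-column col_names are assumptions for illustration only; only the Comparison and CRP names come from the question. as.data.table() is used here simply to build the toy table.

```r
library(data.table)

# Hypothetical toy data: two groups of 9 rows, each with one obvious outlier.
dat <- data.frame(
  Comparison = rep(c("A", "B"), each = 9),
  CRP = c(1, 2, 3, 4, 5, 6, 7, 8, 100,
          10, 11, 12, 13, 14, 15, 16, 17, 500)
)
col_names <- "CRP"  # assumed: the columns to clean

dat_outlier <- as.data.table(dat)

# Within by =, each element of .SD has the length of the current group,
# so the result of ifelse() matches the group size and := accepts it.
dat_outlier[, (col_names) := lapply(.SD, function(x) {
  ifelse(x %in% boxplot.stats(x)$out, NA, x)
}), by = Comparison, .SDcols = col_names]

# Show the rows whose CRP value was flagged as an outlier within its group.
print(dat_outlier[is.na(CRP)])
```

In this toy data the values 100 (group A) and 500 (group B) lie beyond 1.5 times the hinge spread of their own group, so those two rows end up with CRP set to NA and the rest are unchanged.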