R caret: Error in { : task 1 failed - "No importance values available"

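I get this error when running RFE feature selection in R with the 'ranger' method, and with other similar methods. I have tried removing highly correlated features, nzv filtering, changing the method, and using a weight matrix, but I always get a similar error. The RFE runs for a few iterations and then stops.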

variable.sizes <- c(2,5,50,500)
control <- rfeControl(functions = caretFuncs, method = "cv",
                        verbose = TRUE, returnResamp = "all",
                        number = num.iters)
results.rfe <- rfe(x = featureVars, y = classVars,
                     sizes = variable.sizes,
                     rfeControl = control, trControl = trainControl(method = "cv"),
                     preProcess=c("scale","center"), method="ranger")

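featureVars is a data frame (I have also tried a matrix) with 334 rows, and classVars is a factor with 3 levels and 334 items. The rfe call gets past the parsing stage, runs a few times, and then stops with this output.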

+(rfe) fit Fold1 size: 992 
-(rfe) fit Fold1 size: 992 
+(rfe) imp Fold1 
+(rfe) fit Fold2 size: 992 
Error in { : task 1 failed - "No importance values available" 

Here is the sessionInfo. I have updated all dependencies of the imported packages.

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ranger_0.12.1   dplyr_1.0.5     e1071_1.7-6     caret_6.0-87    ggplot2_3.3.3   lattice_0.20-41

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6           pillar_1.5.1         compiler_4.0.3       gower_0.2.2          plyr_1.8.6          
 [6] iterators_1.0.13     class_7.3-18         tools_4.0.3          rpart_4.1-15         ipred_0.9-11        
[11] lubridate_1.7.10     lifecycle_1.0.0      tibble_3.1.0         gtable_0.3.0         nlme_3.1-151        
[16] pkgconfig_2.0.3      rlang_0.4.10         Matrix_1.3-2         foreach_1.5.1        DBI_1.1.1           
[21] prodlim_2019.11.13   stringr_1.4.0        withr_2.4.1          pROC_1.17.0.1        generics_0.1.0      
[26] vctrs_0.3.7          recipes_0.1.15       stats4_4.0.3         nnet_7.3-15          grid_4.0.3          
[31] tidyselect_1.1.0     data.table_1.14.0    glue_1.4.2           R6_2.5.0             fansi_0.4.2         
[36] survival_3.2-7       lava_1.6.9           reshape2_1.4.4       purrr_0.3.4          magrittr_2.0.1      
[41] ModelMetrics_1.2.2.2 splines_4.0.3        MASS_7.3-53          scales_1.1.1         codetools_0.2-18    
[46] ellipsis_0.3.1       assertthat_0.2.1     timeDate_3043.102    colorspace_2.0-0     utf8_1.2.1          
[51] proxy_0.4-25         stringi_1.5.3        munsell_0.5.0        crayon_1.4.1        

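With ranger, you need to specify an importance measure so that importance values are actually computed, for example importance = "impurity". By default ranger uses importance = "none" and stores no importance values, which is why the ranking step of rfe fails. See the ranger help page for more information.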
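To see the cause in isolation, here is a minimal sketch that calls ranger directly on iris, outside of caret. The object names fit_default and fit_imp are only illustrative, and num.trees = 50 is an arbitrary choice to keep it quick. As far as I can tell, caret's ranger wrapper raises "No importance values available" when the fitted object carries no variable.importance values, which is the first case below.

```r
library(ranger)

set.seed(1)

# Default fit: importance = "none", so no importance values are stored.
fit_default <- ranger(Species ~ ., data = iris, num.trees = 50)

# Same model, but asking ranger to record impurity (Gini) importance.
fit_imp <- ranger(Species ~ ., data = iris, num.trees = 50,
                  importance = "impurity")

# Number of stored importance values: nothing for the default fit,
# one value per predictor once an importance mode is requested.
length(fit_default$variable.importance)
length(fit_imp$variable.importance)
names(fit_imp$variable.importance)
```

importance = "permutation" is the other commonly used mode. It is slower to compute, but it also makes importance values available.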

Below I use an example dataset with the importance argument specified, and you can see that it works:

# Four real predictors plus six columns of random noise (150 x 6 values)
x = cbind(iris[,1:4],matrix(rnorm(nrow(iris)*6),ncol=6))
y = iris$Species

variable.sizes <- c(2,4,6)
control <- rfeControl(functions = caretFuncs, method = "cv",
                        verbose = TRUE, returnResamp = "all",
                        number = 5)
results.rfe <- rfe(x = x, y = y,
                   sizes = variable.sizes,
                   rfeControl = control,
                   trControl = trainControl(method = "cv"),
                    preProcess=c("scale","center"), method="ranger",
                    importance = "impurity")

Output:

+(rfe) fit Fold1 size: 10 
-(rfe) fit Fold1 size: 10 
+(rfe) imp Fold1 
-(rfe) imp Fold1 
+(rfe) fit Fold1 size:  6 
-(rfe) fit Fold1 size:  6 
+(rfe) fit Fold1 size:  4 
-(rfe) fit Fold1 size:  4 
+(rfe) fit Fold1 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .

-(rfe) fit Fold1 size:  2 
+(rfe) fit Fold2 size: 10 
-(rfe) fit Fold2 size: 10 
+(rfe) imp Fold2 
-(rfe) imp Fold2 
+(rfe) fit Fold2 size:  6 
-(rfe) fit Fold2 size:  6 
+(rfe) fit Fold2 size:  4 
-(rfe) fit Fold2 size:  4 
+(rfe) fit Fold2 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .

-(rfe) fit Fold2 size:  2 
+(rfe) fit Fold3 size: 10 
-(rfe) fit Fold3 size: 10 
+(rfe) imp Fold3 
-(rfe) imp Fold3 
+(rfe) fit Fold3 size:  6 
-(rfe) fit Fold3 size:  6 
+(rfe) fit Fold3 size:  4 
-(rfe) fit Fold3 size:  4 
+(rfe) fit Fold3 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .

-(rfe) fit Fold3 size:  2 
+(rfe) fit Fold4 size: 10 
-(rfe) fit Fold4 size: 10 
+(rfe) imp Fold4 
-(rfe) imp Fold4 
+(rfe) fit Fold4 size:  6 
-(rfe) fit Fold4 size:  6 
+(rfe) fit Fold4 size:  4 
-(rfe) fit Fold4 size:  4 
+(rfe) fit Fold4 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .

-(rfe) fit Fold4 size:  2 
+(rfe) fit Fold5 size: 10 
-(rfe) fit Fold5 size: 10 
+(rfe) imp Fold5 
-(rfe) imp Fold5 
+(rfe) fit Fold5 size:  6 
-(rfe) fit Fold5 size:  6 
+(rfe) fit Fold5 size:  4 
-(rfe) fit Fold5 size:  4 
+(rfe) fit Fold5 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .

-(rfe) fit Fold5 size:  2 
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
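
Once the run completes, the selected variables can be read off the returned object. The following is a self-contained sketch under a few assumptions of mine: the noise columns are given explicit names (noise1 to noise6), the number of folds is reduced to 3, and verbose is switched off, all just to keep the run short and the output readable. It is not the exact call from the question.

```r
library(caret)
library(ranger)

set.seed(42)

# Four real predictors plus six named noise columns (names are illustrative).
noise <- matrix(rnorm(nrow(iris) * 6), ncol = 6,
                dimnames = list(NULL, paste0("noise", 1:6)))
x <- cbind(iris[, 1:4], noise)
y <- iris$Species

control <- rfeControl(functions = caretFuncs, method = "cv",
                      number = 3, verbose = FALSE)

# importance = "impurity" is passed through train() to ranger(),
# so the ranking step has importance values to work with.
results.rfe <- rfe(x = x, y = y,
                   sizes = c(2, 4, 6),
                   rfeControl = control,
                   trControl = trainControl(method = "cv", number = 3),
                   preProcess = c("scale", "center"),
                   method = "ranger",
                   importance = "impurity")

# Resampled performance per subset size, and the variables finally kept.
print(results.rfe$results)
print(predictors(results.rfe))
```

The same pattern should apply to the original call: add importance = "impurity" (or "permutation") to rfe() and leave everything else unchanged.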