H2O XGBoost on Windows: Error: java.lang.UnsatisfiedLinkError: ml.dmlc.xgboost4j.java.XGBoostJNI.XGDMatrixCreateFromCSREx([J[I[FI[J)I

When I try to run XGBoost via h2o.xgboost() with H2O 3.12.0.1 on Windows 7 and Windows Server 2008 R2, I get the following error:

Error: java.lang.UnsatisfiedLinkError: ml.dmlc.xgboost4j.java.XGBoostJNI.XGDMatrixCreateFromCSREx([J[I[FI[J)I

Here is a reproducible example:

library(h2o)
h2o.init(nthreads = -1)
h2o.no_progress() # Don't show progress bars in RMarkdown output

# Import a sample binary outcome train/test set into H2O
train <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")

# Identify predictors and response
y <- "response"
x <- setdiff(names(train), y)

# For binary classification, response should be a factor
train[,y] <- as.factor(train[,y])
test[,y] <- as.factor(test[,y])

# Number of CV folds (to generate level-one data for stacking)
nfolds <- 5

# Train & Cross-validate a (shallow) XGB-GBM
my_xgb1 <- h2o.xgboost(x = x,
                       y = y,
                       training_frame = train,
                       distribution = "bernoulli",
                       ntrees = 50,
                       max_depth = 3,
                       min_rows = 2,
                       learn_rate = 0.2,
                       nfolds = nfolds,
                       fold_assignment = "Modulo",
                       keep_cross_validation_predictions = TRUE,
                       seed = 1)

sessionInfo() output:

R version 3.4.0 Patched (2017-05-19 r72713)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] h2o_3.12.0.1

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0    RCurl_1.95-4.8 jsonlite_1.5   bitops_1.0-6

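3.12.0.1 is the latest development version linked from the h2o.ai home page; I upgraded to it after I could not find this function in 3.10. However, a comment from @MarcoSandri indicated that a newer development version (3.13) is available on their Amazon AWS, so I downloaded it and upgraded both the cluster and the R package accordingly.

The upgrade from 3.12 to 3.13 seemed to go smoothly until I tried to use the h2o.xgboost() function. It then threw a different error: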

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 
ERROR MESSAGE:

-1

Error in fetch(key) : 
  lazy-load database 'E:/Program Files/R/R-3.4.0patched/library/h2o/help/h2o.rdb' is corrupt

H2O-3 XGBoost is not supported on Windows. For reference, the operating systems that H2O-3 XGBoost supports are listed here:

http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/xgboost.html#limitations
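
If the same script has to run on both supported and unsupported platforms, one option is to ask the cluster whether the XGBoost backend can be used and fall back to H2O's own GBM when it cannot. The following is only a minimal sketch: `pick_learner()` and `train_boosted()` are hypothetical helper names, not part of h2o, and it assumes the installed h2o R package provides `h2o.xgboost.available()` and that `h2o.init()` has already been called.

```r
# Hypothetical helper (not part of h2o): choose a learner name from the
# availability flag. Anything other than TRUE (FALSE, NA, NULL) means "gbm".
pick_learner <- function(xgb_available) {
  if (isTRUE(xgb_available)) "xgboost" else "gbm"
}

# Hypothetical wrapper: train with XGBoost when the running cluster reports
# that the native backend is usable, otherwise train with h2o.gbm().
# Assumes a running H2O cluster and an h2o version that ships
# h2o.xgboost.available(). Extra arguments in ... are passed to whichever
# function is chosen, so they must be valid for both.
train_boosted <- function(x, y, training_frame, ...) {
  learner <- pick_learner(h2o::h2o.xgboost.available())
  message("Training with: ", learner)
  if (learner == "xgboost") {
    h2o::h2o.xgboost(x = x, y = y, training_frame = training_frame, ...)
  } else {
    h2o::h2o.gbm(x = x, y = y, training_frame = training_frame, ...)
  }
}
```

With the objects from the question, a call might look like `train_boosted(x = x, y = y, training_frame = train, distribution = "bernoulli", ntrees = 50, max_depth = 3, min_rows = 2, learn_rate = 0.2, nfolds = nfolds, seed = 1)`. The GBM fallback is a different algorithm, so its results will not match an XGBoost model trained with the same settings.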