R error with h2o-ensemble: 400 Bad Request (Temp ID RTMP_5 already exists) when creating an h2o ensemble model, even though it worked before

Hi, I started using the h2o ensemble package (here: https://github.com/h2oai/h2o-3/tree/master/h2o-r/ensemble) for some data analysis and tried the demo code.

The code worked fine before:

## ## setting up h2o
library(h2oEnsemble)
nodes <- 2 ## number of processes
localH2O <-  h2o.init(nthreads=nodes)

## ## simulated data set
dat <- matrix(rnorm(6e3), ncol=3, dimnames=list(NULL, c("W", "X", "Y")))
dat <- as.data.frame(dat)
Z <- as.factor(rbinom(nrow(dat), size=1, prob=plogis(.2+.1*dat$W-.2*dat$X)))
dat <- cbind(dat, Z=Z)
## W,X,Y: Input
## Z: output
dat.app <- dat[1:1e3, ]
dat.val <- dat[1e3+(1:1e3), ]

## ## h2o procedure
dat.h2o.app <- as.h2o(dat.app) ## learning
dat.h2o.val <- as.h2o(dat.val) ## validation

library.h2o <- c("h2o.deeplearning.Tanh",
                 "h2o.randomForest.1000x100")

h2o.randomForest.1000x100 <- function(...,ntrees=1000,nbins=100) {
    h2oEnsemble::h2o.randomForest.wrapper(..., ntrees=ntrees, nbins=nbins,seed=1)
}
h2o.deeplearning.Tanh <- function(...,hidden=c(200, 200,200),activation="Tanh" ) {
    h2oEnsemble::h2o.deeplearning.wrapper(..., hidden=hidden,    activation=activation,seed=1)
}
h2o.model <- h2o.ensemble(y="Z", x=c("W", "X", "Y"),
                          training_frame=dat.h2o.app,
                          family="binomial",
                          learner=library.h2o,
                          cvControl=list(V=10, shuffle=TRUE),
                          metalearner="h2o.glm.wrapper") # getting the 400 bad request

h2o.pred.val <- predict(h2o.model, newdat=dat.h2o.val)$pred
table((h2o.pred.val>0.5)+0, dat.val$Z)

Then it suddenly threw a 400 Bad Request error at me (RTMP_5 already exists):

R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ## ## setting up h2o
> library(h2oEnsemble)
> nodes <- 2 ## number of processes
> localH2O <-  h2o.init(nthreads=nodes)
Successfully connected to http://127.0.0.1:54321/ 

R is connected to the H2O cluster: 
    H2O cluster uptime:         9 days 19 hours 
    H2O cluster version:        3.6.0.8 
    H2O cluster name:           H2O_started_from_R_root_afl027 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   6.98 GB 
    H2O cluster total cores:    6 
    H2O cluster allowed cores:  2 
    H2O cluster healthy:        TRUE 

> 
> ## ## simulated data set
> dat <- matrix(rnorm(6e3), ncol=3, dimnames=list(NULL, c("W", "X", "Y")))
> dat <- as.data.frame(dat)
> Z <- as.factor(rbinom(nrow(dat), size=1, prob=plogis(.2+.1*dat$W-.2*dat$X)))
> dat <- cbind(dat, Z=Z)
> ## W,X,Y: input
> ## Z: output
> dat.app <- dat[1:1e3, ]
> dat.val <- dat[1e3+(1:1e3), ]
> 
> ## ## h2o procedure
> dat.h2o.app <- as.h2o(dat.app) ## learning

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%
> dat.h2o.val <- as.h2o(dat.val) ## validation

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%
> 
> library.h2o <- c("h2o.deeplearning.Tanh",
+                  "h2o.randomForest.1000x100")
> 
> h2o.model <- h2o.ensemble(y="Z", x=c("W", "X", "Y"),
+                           training_frame=dat.h2o.app,
+                           family="binomial",
+                           learner=library.h2o,
+                           cvControl=list(V=10, shuffle=TRUE),
+                           metalearner="h2o.glm.wrapper")

ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://127.0.0.1:54321/99/Rapids)

java.lang.IllegalArgumentException
 [1] "water.rapids.ASTTmpAssign.apply(ASTAssign.java:254)"                                  
 [2] "water.rapids.ASTTmpAssign.apply(ASTAssign.java:248)"                                  
 [3] "water.rapids.ASTExec.exec(ASTExec.java:46)"                                           
 [4] "water.rapids.Session.exec(Session.java:56)"                                           
 [5] "water.rapids.Exec.exec(Exec.java:63)"                                                 
 [6] "water.api.RapidsHandler.exec(RapidsHandler.java:23)"                                  
 [7] "sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)"                          
 [8] "sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)"
 [9] "java.lang.reflect.Method.invoke(Method.java:622)"                                     
[10] "water.api.Handler.handle(Handler.java:64)"                                            
[11] "water.api.RequestServer.handle(RequestServer.java:644)"                               
[12] "water.api.RequestServer.serve(RequestServer.java:585)"                                
[13] "water.JettyHTTPD$H2oDefaultServlet.doGeneric(JettyHTTPD.java:617)"                    
[14] "water.JettyHTTPD$H2oDefaultServlet.doPost(JettyHTTPD.java:565)"                       
[15] "javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"                         
[16] "javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"                         
[17] "org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"               

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
    Temp ID RTMP_5 already exists
Calls : h2o.ensemble ... .eval.driver -> .h2o.__remoteSend -> .h2o.doSafeREST
Execution halted

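I am a bit lost and do not understand why it no longer works; the training set should be in the correct format. Has anyone run into this problem? If so, how did you get past this error?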

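This is actually a bug in a recent version of the h2o R package, and it has already been fixed. The fix will ship in the next stable release of the h2o R package, or you can download the nightly build here: http://h2o-release.s3.amazonaws.com/h2o/master/latest.html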

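The problem stems from multiple calls to h2o.init. For now, you can work around this error by shutting down all h2o instances and then trying again, keeping in mind that repeated h2o.init calls are what trigger it.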
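
As a rough sketch of that workaround (not an official recipe), the helper below shuts down the cluster the current R session is connected to and then calls h2o.init exactly once. The name restart_h2o and the wait_secs delay are assumptions made for illustration; neither comes from h2o or h2oEnsemble. It also only stops the instance this session can reach, so any other H2O processes on the machine would still need to be stopped separately.

```r
# Workaround sketch: stop the running H2O instance, then call h2o.init() once.
# restart_h2o is a hypothetical helper name, not part of h2o or h2oEnsemble.
restart_h2o <- function(nthreads = 2, wait_secs = 5) {
    # Check whether a cluster is reachable from this R session;
    # treat "no connection yet" (an error) as "not up".
    up <- tryCatch(h2o::h2o.clusterIsUp(), error = function(e) FALSE)
    if (isTRUE(up)) {
        # Stop the cluster without the interactive confirmation prompt.
        h2o::h2o.shutdown(prompt = FALSE)
        # Assumed delay so the old JVM has time to exit before restarting.
        Sys.sleep(wait_secs)
    }
    # The single h2o.init call for the fresh session.
    h2o::h2o.init(nthreads = nthreads)
}
```

In the question's script, this would replace the h2o.init line, for example localH2O <- restart_h2o(nthreads = nodes), after which the as.h2o and h2o.ensemble calls are rerun against the fresh cluster.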

More information here: https://groups.google.com/forum/#!topic/h2ostream/E6u9YbWmD6k