R H2O object not found: H2OKeyNotFoundArgumentException

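R version: 3.5.1 (2018-07-02)

H2O cluster version: 3.20.0.2

The dataset used here is available on Kaggle (Home Credit risk). Before running h2o AutoML, I handled the missing values and selected the relevant categorical variables. Can you help me figure out the root cause of this error? Thanks.

Code: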

library(h2o)
library(dplyr)  # provides bind_cols()

h2o.init()
h2o.no_progress()

# y_train_processed_tbl is the target variable;
# x_train_processed_tbl is the remaining data after handling missing values.
data_h2o <- as.h2o(bind_cols(y_train_processed_tbl, x_train_processed_tbl))
splits_h2o <- h2o.splitFrame(data_h2o, ratios = c(0.7, 0.15), seed = 1234)
train_h2o <- splits_h2o[[1]]
valid_h2o <- splits_h2o[[2]]
test_h2o  <- splits_h2o[[3]]

y <- "TARGET"
x <- setdiff(names(train_h2o), y)

automl_models_h2o <- h2o.automl(
  x = x,
  y = y,
  training_frame    = train_h2o,
  validation_frame  = valid_h2o,
  leaderboard_frame = test_h2o,
  max_runtime_secs  = 90
)

automl_leader <- automl_models_h2o@leader
# The error is raised by this call:
performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)


ERROR: Unexpected HTTP Status code: 404 Not Found

water.exceptions.H2OKeyNotFoundArgumentException
 [1] "water.exceptions.H2OKeyNotFoundArgumentException: Object 'dummy' not 
 found in function: predict for argument: model"
 [2] "    water.api.ModelMetricsHandler.score(ModelMetricsHandler.java:235)"  
 [3] "    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"                                                    
 [4] "    sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)"                                                    
 [5] "    sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)"                                                
 [6] "    java.lang.reflect.Method.invoke(Unknown Source)"                                                                
 [7] "    water.api.Handler.handle(Handler.java:63)"                                                                      
 [8] "    water.api.RequestServer.serve(RequestServer.java:451)"                                                          
 [9] "    water.api.RequestServer.doGeneric(RequestServer.java:296)"                                                      
[10] "    water.api.RequestServer.doPost(RequestServer.java:222)"                                                         
[11] "    javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"                                                   
[12] "    javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"                                                   
[13] "    org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"                                         
[14] "    org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)"                                     
[15] "    org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)"                             
[16] "    org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)"                                      
[17] "    org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)"                              
[18] "    org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)"                                  
[19] "    org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"                          
[20] "    org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"                                
[21] "    water.JettyHTTPD$LoginHandler.handle(JettyHTTPD.java:197)"                                                      
[22] "    org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"                          
[23] "    org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"                                
[24] "    org.eclipse.jetty.server.Server.handle(Server.java:370)"                                                        
[25] "    org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)"                 
[26] "    org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)"                  
[27] "    org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)"                       
[28] "    org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)"       
[29] "    org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)"                                               
[30] "    org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)"                                          
[31] "    org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)"                         
[32] "    org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)"                   
[33] "    org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)"                               
[34] "    org.eclipse.jetty.util.thread.QueuedThreadPool.run(QueuedThreadPool.java:543)"                                
[35] "    java.lang.Thread.run(Unknown Source)"                                                                           



Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = 
page,  : 

ERROR MESSAGE:

Object 'dummy' not found in function: predict for argument: model

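The problem here is that you only gave AutoML 90 seconds to run, so it did not have time to train even a single model. In the next stable release of H2O this error message will go away, and you will instead just get a leaderboard with no rows (we are fixing this so that the case is handled more gracefully).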
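As a rough sketch of how to guard against this case, you could check that the leaderboard has at least one row before scoring the leader. This assumes a release where an empty AutoML run returns a zero-row leaderboard, as described above; `leaderboard_has_models` is a hypothetical helper name, not part of the h2o package.

```r
# Hypothetical helper (not part of h2o): TRUE if the leaderboard lists at
# least one trained model. Works on anything that supports nrow(), which
# includes a data.frame and, with h2o loaded, an H2OFrame.
leaderboard_has_models <- function(leaderboard) {
  !is.null(leaderboard) && nrow(leaderboard) > 0
}

# Usage with the objects from the question (requires a running H2O cluster):
# if (leaderboard_has_models(automl_models_h2o@leaderboard)) {
#   performance_h2o <- h2o.performance(automl_models_h2o@leader,
#                                      newdata = test_h2o)
# } else {
#   message("AutoML trained no models; increase max_runtime_secs.")
# }
```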

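Instead of using max_runtime_secs = 90, you can increase it to a larger value (the default is 3600 seconds, or 1 hour). Alternatively, you can specify how many models you want, for example by setting max_models = 20.

If you use max_models, I recommend also setting max_runtime_secs to a very large value (e.g. 999999999) so that you don't run out of time. The AutoML process stops as soon as it reaches either max_models or max_runtime_secs, whichever comes first.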
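A minimal sketch of that setup, reusing the argument names from the question. The wrapper `run_automl_by_model_count` is a hypothetical name introduced here for illustration, and the value 20 is just the example figure mentioned above.

```r
# Hypothetical wrapper (not part of h2o): limit AutoML by number of models
# and set the time budget high enough that it should not be the stopping rule.
run_automl_by_model_count <- function(x, y, train, valid, test, n_models = 20) {
  h2o::h2o.automl(
    x = x,
    y = y,
    training_frame    = train,
    validation_frame  = valid,
    leaderboard_frame = test,
    max_models        = n_models,
    max_runtime_secs  = 999999999  # large value so max_models is hit first
  )
}

# Usage with the frames from the question (requires a running H2O cluster):
# automl_models_h2o <- run_automl_by_model_count(x, y, train_h2o,
#                                                valid_h2o, test_h2o)
```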

I posted a similar answer here.

My code was working fine, then I tweaked it and got the same error.

To fix this, fetch the leader with h2o.getModel() rather than automl_models_h2o@leader when computing predictions or performance.

Change your automl_leader initialization:

...

# get model name from list
automl_models_h2o@leaderboard 

# change MODEL_NAME_HERE to a model name from your leaderboard list.
automl_leader <- h2o.getModel("MODEL_NAME_HERE") 

performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)

...
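If you would rather not copy the model name by hand, one possible approach is to read the first model_id from the leaderboard. This is only a sketch: `top_model_id` is a hypothetical helper, and it assumes the leaderboard has a model_id column sorted with the best model first.

```r
# Hypothetical helper (not part of h2o): return the model_id in the first
# leaderboard row. as.data.frame() handles a plain data.frame and, with h2o
# loaded, an H2OFrame.
top_model_id <- function(leaderboard) {
  ids <- as.data.frame(leaderboard)$model_id
  if (length(ids) == 0) {
    stop("The leaderboard is empty: no model was trained.")
  }
  as.character(ids[1])  # as.character() also covers factor columns
}

# Usage with the objects from the question (requires a running H2O cluster):
# automl_leader   <- h2o.getModel(top_model_id(automl_models_h2o@leaderboard))
# performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)
```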