H2O anomaly per_feature = TRUE java.lang.OutOfMemoryError: Java heap space
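I am running H2O anomaly detection with per_feature = TRUE, and it fails with a Java heap space error. In other posts about this error message, I have seen people suggest calling h2o.remove(df) to free used memory. In my case, however, there are no loops, and there does not seem to be anything I could remove to free up memory.
Here is my code: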
library(h2o)
h2o.init(min_mem_size = "10G", max_mem_size = "15G")
data.hex <- as.h2o(data)
x <- names(data.hex)
random_seed <- 42
# Deeplearning Model
print("Deep learning model begins ...")
model.dl = h2o.deeplearning(x = x,
                            training_frame = data.hex,
                            autoencoder = TRUE,
                            activation = "Tanh",
                            hidden = c(5, 5, 5, 5, 5),
                            mini_batch_size = 64,
                            epochs = 100,
                            stopping_rounds = 15,
                            variable_importances = TRUE,
                            seed = random_seed)
# Calculating anomaly per feature
print('Calculating anomaly per feature ...')
errors_per_feature <- h2o.anomaly(model.dl, data.hex, per_feature = TRUE) # Anomaly Detection Algorithm
print('Converting from H2O frame to dataframe ...')
errors1_per_feature <- as.data.frame(errors_per_feature) # Convert back to data frame
Here is the detailed error message:
[1] "Deep learning model begins ..."
|======================================================================| 100%
[1] "Calculating anomaly per feature ..."
ERROR: Unexpected HTTP Status code: 500 Server Error (url = http://localhost:54321/3/Predictions/models/DeepLearning_model_R_1594826474037_2/frames/Accesses_sid_a71f_1)
water.util.DistributedException
[1] "DistributedException from localhost/127.0.0.1:54321: 'Java heap space', caused by java.lang.OutOfMemoryError: Java heap space"
[2] " water.MRTask.getResult(MRTask.java:494)"
[3] " water.MRTask.getResult(MRTask.java:502)"
[4] " water.MRTask.doAll(MRTask.java:397)"
[5] " water.MRTask.doAll(MRTask.java:403)"
[6] " hex.deeplearning.DeepLearningModel.scoreAutoEncoder(DeepLearningModel.java:761)"
[7] " water.api.ModelMetricsHandler.predict(ModelMetricsHandler.java:469)"
[8] " java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"
[9] " java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)"
[10] " java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)"
[11] " java.base/java.lang.reflect.Method.invoke(Method.java:567)"
[12] " water.api.Handler.handle(Handler.java:60)"
[13] " water.api.RequestServer.serve(RequestServer.java:470)"
[14] " water.api.RequestServer.doGeneric(RequestServer.java:301)"
[15] " water.api.RequestServer.doPost(RequestServer.java:227)"
[16] " javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"
[17] " javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"
[18] " org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"
[19] " org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)"
[20] " org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)"
[21] " org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:427)"
[22] " org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)"
[23] " org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)"
[24] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[25] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[26] " water.webserver.jetty8.Jetty8ServerAdapter$LoginHandler.handle(Jetty8ServerAdapter.java:119)"
[27] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[28] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[29] " org.eclipse.jetty.server.Server.handle(Server.java:370)"
[30] " org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)"
[31] " org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)"
[32] " org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:984)"
[33] " org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1045)"
[34] " org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)"
[35] " org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:236)"
[36] " org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)"
[37] " org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)"
[38] " org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)"
[39] " org.eclipse.jetty.util.thread.QueuedThreadPool.run(QueuedThreadPool.java:543)"
[40] " java.base/java.lang.Thread.run(Thread.java:830)"
[41] "Caused by:java.lang.OutOfMemoryError: Java heap space"
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
DistributedException from localhost/127.0.0.1:54321: 'Java heap space'
Calls: h2o.anomaly -> .h2o.__remoteSend -> .h2o.doSafeREST
Execution halted
R and H2O versions:
H2O cluster version: 3.30.0.6
H2O cluster total nodes: 1
H2O cluster total memory: 13.43 GB
H2O cluster total cores: 16
H2O cluster allowed cores: 16
H2O cluster healthy: TRUE
R Version: R version 3.6.3 (2020-02-29)
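My macOS machine has 16 GB of RAM.
The data has 6 variables (columns): 5 categorical and 1 numeric. The 5 categorical variables have 17, 49, 52, 85, and 5032 unique values, respectively. There are about 500k rows. The data file is 44 MB (before encoding in H2O).
What can I do to solve this problem? Please let me know if I can provide any other information. Thanks for your help!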
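[Also cutting and pasting my reply from the h2ostream mailing list here...]
I suspect the large number of categorical levels is causing the memory blow-up.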
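A rough estimate is consistent with this suspicion. The sketch below assumes the per-feature output holds one 8-byte value per row for every one-hot-expanded input column. That is an assumption about how H2O lays out the result, not something confirmed in this thread, and chunk compression or extra missing-value levels could shift the real figure. The counts are taken from the question.

```r
# Back-of-envelope size of the per_feature = TRUE output.
# ASSUMPTION: one 8-byte double per row per one-hot-expanded input column.
n_rows     <- 500000                    # approximate row count from the question
cat_levels <- c(17, 49, 52, 85, 5032)   # unique values per categorical column
n_numeric  <- 1                         # the single numeric column

n_expanded  <- sum(cat_levels) + n_numeric   # expanded input width
bytes_total <- n_rows * n_expanded * 8       # total bytes under the assumption
gb_total    <- bytes_total / 1e9             # decimal gigabytes
share_big   <- max(cat_levels) / n_expanded  # share due to the largest column

print(n_expanded)
print(gb_total)
print(share_big)
```

Under that assumption the output would be about 21 GB, well above the 13.43 GB the cluster reports, and about 96% of the expanded columns come from the 5032-level variable.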
Try removing that variable and see whether the call at least completes.
If it does, try regrouping that variable into fewer levels in some way.
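One possible way to regroup into fewer levels is to keep only the most frequent levels and merge the rest into a single catch-all level before the data is sent to H2O. The sketch below uses base R only; the helper name `lump_rare_levels`, the `top_n` cutoff, the "Other" label, and the column name `big_col` are all hypothetical, since the question does not name its columns.

```r
# Keep the `top_n` most frequent levels of a categorical vector and merge
# every other non-missing value into a single `other` level.
# Hypothetical helper for illustration; not part of the h2o package.
lump_rare_levels <- function(x, top_n = 50, other = "Other") {
  x <- as.character(x)
  counts <- sort(table(x), decreasing = TRUE)   # NA values are not counted
  keep <- names(counts)[seq_len(min(top_n, length(counts)))]
  x[!is.na(x) & !(x %in% keep)] <- other        # NA values stay NA
  factor(x)
}

# Small self-contained demonstration.
demo <- c("a", "a", "a", "b", "b", "c", "d", NA)
lumped <- lump_rare_levels(demo, top_n = 2)
print(table(lumped, useNA = "ifany"))

# Hypothetical use with the objects from the question ("big_col" is assumed):
# x_small  <- setdiff(names(data.hex), "big_col")   # first test: drop it entirely
# data$big_col <- lump_rare_levels(data$big_col, top_n = 100)
# data.hex <- as.h2o(data)                          # then retrain and rescore
```

A frequency cutoff is only one choice; grouping levels by domain knowledge (for example, by a shared prefix or category) may preserve more signal for the autoencoder.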