H2O deep learning: different results per run
I am using H2O deep learning in Python on data with 2 balanced classes, "0" and "1", with the parameters tuned as follows:
prostate_dl = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    distribution="multinomial",
    score_interval=10,
    epochs=1000,
    input_dropout_ratio=0.2,
    adaptive_rate=True,
    rho=0.998,
    epsilon=1e-8)

prostate_dl.train(
    x=x,
    y=y,
    training_frame=train,
    validation_frame=test)
Every run of the program produces a different confusion matrix and different accuracy results. Can anyone explain this? How can the results be relied upon?
Also, every run predicts the majority of cases as class "1" rather than "0". Are there any suggestions?
This question has already been answered, but note that you need to set reproducible=True when initializing H2ODeepLearningEstimator in Python (or h2o.deeplearning() in R).
Even after setting reproducible=True, H2O Deep Learning results are only reproducible when a single core is used; in other words, when H2O is started with h2o.init(nthreads = 1). The reasons behind this are outlined here.
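As a minimal sketch, the reproducible setup described above might look like the following (this assumes a local H2O installation and a running cluster; the frame and column variables are placeholders from the question, and the seed value is an arbitrary example):

```python
import h2o
from h2o.estimators import H2ODeepLearningEstimator

# Start H2O on a single core: with multiple threads, several
# mappers update one shared model asynchronously, so results vary.
h2o.init(nthreads=1)

model = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    distribution="multinomial",
    epochs=1000,
    reproducible=True,  # force deterministic, single-threaded training
    seed=1234)          # fix the RNG seed as well

# model.train(x=x, y=y, training_frame=train, validation_frame=test)
```

Keep in mind that reproducible=True effectively serializes training, so it can be considerably slower on larger datasets.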
In addition, according to the H2O Deep Learning user guide:
Does each Mapper task work on a separate neural-net model that is combined during reduction, or is each Mapper manipulating a shared object that’s persistent across nodes?
Neither; there’s one model per compute node, so multiple
Mappers/threads share one model, which is why H2O is not reproducible
unless a small dataset is used and force_load_balance=F or
reproducible=T, which effectively rebalances to a single chunk and
leads to only one thread to launch a map(). The current behavior is
simple model averaging; between-node model averaging via “Elastic
Averaging” is currently in progress.