H2O deep learning different results per run

I am using H2O deep learning with Python on data with 2 balanced classes "0" and "1", and tuned the parameters as follows:

from h2o.estimators.deeplearning import H2ODeepLearningEstimator

prostate_dl = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    distribution="multinomial",
    score_interval=10,
    epochs=1000,
    input_dropout_ratio=0.2,
    adaptive_rate=True,
    rho=0.998,
    epsilon=1e-8
)

prostate_dl.train(
    x=x,
    y=y,
    training_frame=train,
    validation_frame=test
)

Each run of the program gives a different confusion matrix and different accuracy results. Can anyone explain why? How can the results be relied on?

Also, every run predicts the majority of cases as class "1" rather than "0". Are there any suggestions?

This question has already been answered, but note that you need to set reproducible=True when initializing the H2ODeepLearningEstimator in Python (or reproducible=TRUE in h2o.deeplearning() in R).
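For example, here is a minimal sketch of the estimator from the question with reproducibility switched on; the seed value is arbitrary and only illustrative, and a fixed seed is needed in addition to reproducible=True for repeated runs to match:

from h2o.estimators.deeplearning import H2ODeepLearningEstimator

prostate_dl = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    distribution="multinomial",
    epochs=1000,
    input_dropout_ratio=0.2,
    reproducible=True,  # force deterministic, single-threaded training
    seed=1234           # arbitrary fixed seed so runs can be identical
)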

Even after setting reproducible=TRUE, H2O deep learning results are only reproducible when a single core is used; in other words, when running with h2o.init(nthreads = 1). The reasons behind this are outlined here.
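A minimal sketch of starting the cluster on one core before training:

import h2o

# Restrict the local H2O cluster to a single thread so that, together with
# reproducible=True and a fixed seed, repeated runs give identical results.
h2o.init(nthreads=1)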

In addition, according to the H2O Deep Learning user guide:

Does each Mapper task work on a separate neural-net model that is combined during reduction, or is each Mapper manipulating a shared object that’s persistent across nodes?

Neither; there’s one model per compute node, so multiple Mappers/threads share one model, which is why H2O is not reproducible unless a small dataset is used and force_load_balance=F or reproducible=T, which effectively rebalances to a single chunk and leads to only one thread to launch a map(). The current behavior is simple model averaging; between-node model averaging via “Elastic Averaging” is currently in progress.
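The force_load_balance flag mentioned in the quote is also exposed on the Python estimator; a minimal sketch, with illustrative parameter values only:

# Per the quote above, on a small dataset force_load_balance=False (like
# reproducible=True) leaves the data in a single chunk, so only one thread
# launches a map() and training becomes reproducible.
prostate_dl = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    force_load_balance=False,
    seed=1234  # arbitrary fixed seed
)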