h2o deeplearning: what is the input variable impact/coefficient?

Question

I am trying to predict taxi-out times at US airports using an h2o deep learning model:

#Deep learning neural network

  deep<-h2o.deeplearning(
    training_frame = train,
    validation_frame = valid,
    x=predictors,
    y=target,
    #distribution = "gaussian",
    #loss = "Automatic",
    hidden=c(200,200,200),
    epochs = 50,
    #activation="Rectifier",
    stopping_metric="deviance",
    stopping_tolerance=1e-4       # stops when deviance does not improve by
                                  # >=0.0001 for 5 scoring events
  )

  summary(deep)

Here is the truncated variable importance list:

Variable Importances:

         variable relative_importance scaled_importance percentage
1     Event_1.Fog            1.000000          1.000000   0.024205
2    Event_2.Rain            0.983211          0.983211   0.023799
3      CARRIER.NK            0.946493          0.946493   0.022910
4 Event_1.noevent            0.936131          0.936131   0.022659
5     cos_deptime            0.934558          0.934558   0.022621

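I understand that "importance" is computed from each variable's relative influence, but how do I know whether a variable increases or decreases taxi-out time? Does h2o report a signed coefficient for each variable? I have read the documentation at http://h2o-release.s3.amazonaws.com/h2o/latest_stable_doc.html, but it does not explain whether a variable such as fog or rain increases or decreases taxi-out time, or by how much.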

Answer 1

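Variable importance for H2O Deep Learning (or RF or GBM, for that matter) has a different interpretation from the coefficients in a GLM, which can be positive or negative and are what you are describing. Variable importance answers "how important is this variable in predicting the outcome", and the measure is relative to the other variables in the model. It is unsigned, so it does not tell you whether a variable pushes the prediction up or down.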
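If you need a direction, one common workaround is to look at how the model's average prediction changes as a single predictor varies (a partial dependence table). The sketch below is only an illustration: `pd_direction` is a hypothetical helper, not part of h2o, and it assumes a table whose first column holds the predictor values and which has a `mean_response` column, which is the shape I would expect from `h2o.partialPlot`.

```r
# Hypothetical helper (not part of h2o): summarise the net direction of a
# numeric predictor's effect from a partial dependence table.
# Assumes: first column = predictor values, plus a `mean_response` column.
pd_direction <- function(pd) {
  # If a list of tables was passed in, use the first one
  if (!is.data.frame(pd)) pd <- pd[[1]]
  pd <- as.data.frame(pd)
  x <- pd[[1]]
  y <- pd[["mean_response"]]
  # Net change in the mean prediction between the smallest and largest value
  change <- y[which.max(x)] - y[which.min(x)]
  direction <- if (change > 0) "increases" else if (change < 0) "decreases" else "flat"
  list(change = change, direction = direction)
}

# Usage with the model from the question (needs a running H2O cluster):
# library(h2o)
# pd <- h2o.partialPlot(deep, valid, cols = "cos_deptime", plot = FALSE)
# pd_direction(pd)
```

A single net change can hide a non-monotonic effect, so it is worth reading the whole table (or the plot) rather than relying on the sign alone. For a categorical predictor such as the event column, compare `mean_response` across its levels instead.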

As described in the H2O Deep Learning documentation, we use a technique called the Gedeon method to compute this measure. (RF and GBM use a different method.)
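To see why a measure of this kind carries no sign, here is a rough sketch of a Gedeon-style calculation on a toy network. It follows the general idea of the method as I understand it (shares of absolute weight, multiplied through two layers); the function name, the weight values, and the input names are all made up for illustration, and H2O's actual implementation may differ in its details.

```r
# Gedeon-style input importance for a toy network with two hidden layers.
# w1: inputs x hidden1 weights, w2: hidden1 x hidden2 weights (made-up values).
gedeon_importance <- function(w1, w2) {
  # Share of each incoming connection in the total absolute weight of a unit
  p1 <- sweep(abs(w1), 2, colSums(abs(w1)), "/")
  p2 <- sweep(abs(w2), 2, colSums(abs(w2)), "/")
  q <- p1 %*% p2        # contribution of input i to second-layer unit k
  imp <- rowSums(q)     # total contribution of each input
  imp / max(imp)        # scale so the top variable is 1
}

w1 <- matrix(c( 2.0, -1.0,
               -0.5,  0.5,
                0.1,  3.0), nrow = 3, byrow = TRUE,
             dimnames = list(c("fog", "rain", "carrier"), NULL))
w2 <- matrix(c(1.0, -2.0,
               0.5,  0.5), nrow = 2, byrow = TRUE)

imp <- gedeon_importance(w1, w2)
flipped <- gedeon_importance(-w1, w2)   # same network with first-layer signs reversed

print(imp)
print(all.equal(imp, flipped))
```

Because only absolute weights enter the calculation, reversing the sign of every first-layer weight leaves the importances unchanged, which is why the table in the question cannot say whether fog or rain raises or lowers the prediction.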
