Misaligned predictions and response column after cross-validation in H2O (Deep Learning)

I've been having a problem with a Deep Learning model. I have a model trained on the rrc data frame, and if I do:

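    rrc['preds'] = dp.cross_validation_holdout_predictions().as_data_frame().predict

the response column and the predictions always end up misaligned. They are aligned at the top of the data frame, but at some point they drift apart, and the correlation I compute between them is very poor because of this misalignment. I have been trying to fix this for more than 3 days, but I don't know what to do.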

I'm using H2O 3.10.4.5. The model itself:

    from h2o.estimators.deeplearning import H2ODeepLearningEstimator

    # rrch is the H2OFrame used for training; "cv" holds the fold assignment
    dp = H2ODeepLearningEstimator(
        activation="Tanh",
        hidden=[10, 10, 10],
        epochs=10000,
        keep_cross_validation_predictions=True,
        ignored_columns=['fn', 'pdb_id', 'pdb_id_chain', 'pdb_id_chain_source', 'source'],
    )
    dp.train(
        x=list(set(rrch.col_names) - set(['rmsd_all'])),
        y="rmsd_all",
        training_frame=rrch,
        fold_column="cv",
    )

Edit: I think I found the problem (cell #58): https://github.com/mmagnus/mmagnus.github.io/blob/master/mq-test.ipynb

If I do rrc3 = rrc3[rrc3.rmsd_all < 10] to remove the rows whose rmsd_all (the response column) value is higher than 10, and then do rrc3h = h2o.H2OFrame(rrc3), that causes the problem. I'm not sure why, though.

The dataset (40 MB): https://www.dropbox.com/s/1et38o3xx47jw1m/rasp_rnakb_cv2.csv?dl=0

Solved: rrc3.reset_index(inplace=True) does the job!
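
A likely explanation, sketched with pandas only: filtering rows leaves gaps in the data frame's index, while the predictions converted with as_data_frame() come back numbered 0..n-1. Assigning a Series to a column aligns by index label, not by position, so the values shift and some rows get NaN. The toy data below and the names preds, misaligned and fixed are made up for illustration; preds only stands in for the H2O predictions and no H2O call is made.

```python
import pandas as pd

# Hypothetical toy frame standing in for rrc3 (values are invented).
rrc3 = pd.DataFrame({"rmsd_all": [1.0, 12.0, 3.0, 15.0, 5.0]})

# Same kind of filter as in the question: the surviving index labels are 0, 2, 4.
rrc3 = rrc3[rrc3.rmsd_all < 10]

# Stand-in for the holdout predictions: one value per remaining row,
# carrying a fresh 0..n-1 index.
preds = pd.Series([1.1, 3.1, 5.1], name="predict")

# Assignment aligns on index labels, so label 2 receives preds[2] (not preds[1])
# and label 4 has no match and becomes NaN.
misaligned = rrc3.copy()
misaligned["preds"] = preds

# After resetting the index, labels and positions agree again.
fixed = rrc3.reset_index(drop=True)
fixed["preds"] = preds

print(misaligned)
print(fixed)
```

Note that reset_index(inplace=True) without drop=True keeps the old index as an extra column named index, which would then also end up in the H2OFrame; drop=True avoids that. Assigning preds.values instead of the Series is another way to get a purely positional assignment.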