如何使用在 Python API 中计算和保存的 H2OModel
How to consume a H2OModel computed and saved in Python API
我已经阅读了一段时间的 H2O 文档,但我还没有找到一个明确的示例来说明如何使用 Python API 加载训练和保存的 model
.我正在关注下一个示例。
import h2o
from h2o.estimators.naive_bayes import H2ONaiveBayesEstimator
model = H2ONaiveBayesEstimator()
h2o_df = h2o.import_file("http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip")
model.train(y = "IsDepDelayed", x = ["Year", "Origin"],
training_frame = h2o_df,
family = "binomial",
lambda_search = True,
max_active_predictors = 10)
h2o.save_model(model, path=models)
但是,如果您查看官方 documentation,它表明您必须从流程 UI 中将模型下载为 POJO
。 这是唯一的方法吗? 或者,我可以通过 python 获得相同的结果吗? 仅供参考,我展示了文档的示例以下。我需要一些指导。
import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
public class main {
private static String modelClassName = "gbm_pojo_test";
public static void main(String[] args) throws Exception {
hex.genmodel.GenModel rawModel;
rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).newInstance();
EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);
//
// By default, unknown categorical levels throw PredictUnknownCategoricalLevelException.
// Optionally configure the wrapper to treat unknown categorical levels as N/A instead:
//
// EasyPredictModelWrapper model = new EasyPredictModelWrapper(
// new EasyPredictModelWrapper.Config()
// .setModel(rawModel)
// .setConvertUnknownCategoricalLevelsToNa(true));
RowData row = new RowData();
row.put("Year", "1987");
row.put("Month", "10");
row.put("DayofMonth", "14");
row.put("DayOfWeek", "3");
row.put("CRSDepTime", "730");
row.put("UniqueCarrier", "PS");
row.put("Origin", "SAN");
row.put("Dest", "SFO");
BinomialModelPrediction p = model.predictBinomial(row);
System.out.println("Label (aka prediction) is flight departure delayed: " + p.label);
System.out.print("Class probabilities: ");
for (int i = 0; i < p.classProbabilities.length; i++) {
if (i > 0) {
System.out.print(",");
}
System.out.print(p.classProbabilities[i]);
}
System.out.println("");
}
}
h2o.save_model 会将二进制模型保存到提供的文件系统中,但是,查看上面的 Java 应用程序似乎您想将模型用于基于 Java 的评分应用程序.
因此,您应该使用 h2o.download_pojo API 将模型与 genmodel jar 文件一起保存到本地文件系统。 API 记录如下:
download_pojo(model, path=u'', get_jar=True)
Download the POJO for this model to the directory specified by the path; if the path is "", then dump to screen.
:param model: the model whose scoring POJO should be retrieved.
:param path: an absolute path to the directory where POJO should be saved.
:param get_jar: retrieve the h2o-genmodel.jar also.
下载 POJO 后,您可以使用上面的示例应用程序执行评分并确保 POJO class 名称和 "modelClassName" 与模型类型相同。
我已经阅读了一段时间的 H2O 文档,但我还没有找到一个明确的示例来说明如何使用 Python API 加载训练和保存的 model
.我正在关注下一个示例。
import h2o
from h2o.estimators.naive_bayes import H2ONaiveBayesEstimator
model = H2ONaiveBayesEstimator()
h2o_df = h2o.import_file("http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip")
model.train(y = "IsDepDelayed", x = ["Year", "Origin"],
training_frame = h2o_df,
family = "binomial",
lambda_search = True,
max_active_predictors = 10)
h2o.save_model(model, path=models)
但是,如果您查看官方 documentation,它表明您必须从流程 UI 中将模型下载为 POJO
。 这是唯一的方法吗? 或者,我可以通过 python 获得相同的结果吗? 仅供参考,我展示了文档的示例以下。我需要一些指导。
import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
public class main {
private static String modelClassName = "gbm_pojo_test";
public static void main(String[] args) throws Exception {
hex.genmodel.GenModel rawModel;
rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).newInstance();
EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);
//
// By default, unknown categorical levels throw PredictUnknownCategoricalLevelException.
// Optionally configure the wrapper to treat unknown categorical levels as N/A instead:
//
// EasyPredictModelWrapper model = new EasyPredictModelWrapper(
// new EasyPredictModelWrapper.Config()
// .setModel(rawModel)
// .setConvertUnknownCategoricalLevelsToNa(true));
RowData row = new RowData();
row.put("Year", "1987");
row.put("Month", "10");
row.put("DayofMonth", "14");
row.put("DayOfWeek", "3");
row.put("CRSDepTime", "730");
row.put("UniqueCarrier", "PS");
row.put("Origin", "SAN");
row.put("Dest", "SFO");
BinomialModelPrediction p = model.predictBinomial(row);
System.out.println("Label (aka prediction) is flight departure delayed: " + p.label);
System.out.print("Class probabilities: ");
for (int i = 0; i < p.classProbabilities.length; i++) {
if (i > 0) {
System.out.print(",");
}
System.out.print(p.classProbabilities[i]);
}
System.out.println("");
}
}
h2o.save_model 会将二进制模型保存到提供的文件系统中,但是,查看上面的 Java 应用程序似乎您想将模型用于基于 Java 的评分应用程序.
因此,您应该使用 h2o.download_pojo API 将模型与 genmodel jar 文件一起保存到本地文件系统。 API 记录如下:
download_pojo(model, path=u'', get_jar=True)
Download the POJO for this model to the directory specified by the path; if the path is "", then dump to screen.
:param model: the model whose scoring POJO should be retrieved.
:param path: an absolute path to the directory where POJO should be saved.
:param get_jar: retrieve the h2o-genmodel.jar also.
下载 POJO 后,您可以使用上面的示例应用程序执行评分并确保 POJO class 名称和 "modelClassName" 与模型类型相同。