Port an XGBoost model trained in Python to another system written in C/C++

Question

Suppose I have successfully trained an XGBoost machine learning model in Python.

from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=7)
model = XGBClassifier()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

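I want to port this model to another system that will be written in C/C++. To do that, I need to understand the internal logic of the trained XGBoost model and, if I'm not mistaken, translate it into a series of decision-tree-like if-then-else statements.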

How can this be done? How do I find out the internal logic of the trained XGBoost model so that I can implement it on another system?

I am using Python 3.7.

Answer 1

Someone has written a script that does this. See https://github.com/popcorn/xgb2cpp
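
I have not checked how that script works internally, but the general idea of turning a tree into if-then-else statements can be sketched in a few lines. XGBoost can dump each trained tree as JSON via model.get_booster().get_dump(dump_format="json"), which returns one JSON string per tree. The sketch below walks one such tree and emits a C function. The SAMPLE_TREE values and the helper names (tree_to_c, tree_to_c_function) are made up for illustration, and it assumes the default feature names f0, f1, ... and no missing values.

```python
import json

# Hypothetical single tree, shaped like one entry of
# model.get_booster().get_dump(dump_format="json"). The numbers are invented.
SAMPLE_TREE = json.loads("""
{"nodeid": 0, "split": "f2", "split_condition": 1.5, "yes": 1, "no": 2, "missing": 1,
 "children": [
   {"nodeid": 1, "leaf": -0.4},
   {"nodeid": 2, "split": "f0", "split_condition": 3.0, "yes": 3, "no": 4, "missing": 3,
    "children": [{"nodeid": 3, "leaf": 0.1}, {"nodeid": 4, "leaf": 0.6}]}
 ]}
""")


def tree_to_c(node, indent=1):
    """Recursively emit C if/else statements for one dumped tree."""
    pad = "    " * indent
    if "leaf" in node:
        return f"{pad}return {float(node['leaf'])!r}f;\n"
    children = {child["nodeid"]: child for child in node["children"]}
    feature = int(node["split"][1:])  # "f2" -> 2; assumes default feature names
    threshold = float(node["split_condition"])
    # XGBoost follows the "yes" child when feature < threshold.
    # Simplification: the "missing" direction is ignored here.
    return (
        f"{pad}if (x[{feature}] < {threshold!r}f) {{\n"
        + tree_to_c(children[node["yes"]], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_c(children[node["no"]], indent + 1)
        + f"{pad}}}\n"
    )


def tree_to_c_function(tree, name):
    """Wrap the generated if/else body in a C function definition."""
    return f"float {name}(const float *x) {{\n{tree_to_c(tree)}}}\n"


c_source = tree_to_c_function(SAMPLE_TREE, "tree_0")
print(c_source)
```

For a full model you would generate one such function per tree and add their return values together. For a binary classifier the sum is a raw margin that still has to go through the sigmoid, so treat the output of a single tree as a partial score, not a prediction.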

Answer 2

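m2cgen is a great package that converts scikit-learn-compatible models into native code. If you are using XGBoost's sklearn wrapper (which it looks like you are), you can simply call something like this: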

model = XGBClassifier()
model.fit(x_train, y_train)
...
import m2cgen as m2c

with open('./model.c','w') as f:
    code = m2c.export_to_c(model)
    f.write(code)

What is really nice about this package is that it supports many different types of models, such as

Linear
SVM
Tree
Random forest
Boosting

One more thing: m2cgen also supports many target languages, such as

C
C#
Dart
Go
Haskell
Java
JavaScript
PHP
PowerShell
Python
R
Visual Basic

Hope this helps!

Answer 3

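The recommended way to use any ML/DL model from another language is to wrap it in a simple RESTful API built with Flask or Bottle (both are lightweight Python frameworks). Any client, written in any language, can then call it from anywhere.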

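You can also containerize the RESTful API with Docker if you are working on a large project with many models. Containerized RESTful APIs are also commonly used to deploy models in the cloud, for example on AWS.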
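
To make the REST idea concrete, here is a minimal sketch of a prediction endpoint. To keep it free of dependencies it uses Python's standard http.server module instead of Flask or Bottle, so the structure will look a little different in those frameworks. The /predict path, the {"features": [...]} request shape, and the predict placeholder are all assumptions for illustration. In a real service, predict would call the trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Placeholder for the real model call (e.g. model.predict)."""
    return [sum(features)]  # dummy output, NOT a real model


def handle_predict(body):
    """Turn a JSON request body into (status_code, JSON response bytes)."""
    try:
        features = json.loads(body)["features"]
        result = predict([float(value) for value in features])
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"prediction": result}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(host="127.0.0.1", port=8000):
    """Serve until interrupted; call run() to start the API."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

After calling run(), the C/C++ system would send an HTTP POST to /predict (for example with libcurl) and parse the JSON reply. The trade-off is a network round trip per prediction, which may or may not be acceptable for your use case.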

If you want to understand the logic behind any ML model, you can always read its source code (on GitHub).
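
For a boosted-tree binary classifier, the prediction logic itself is small, so it can be sketched without reading the whole code base: walk every tree to a leaf, add up the leaf values, and pass the sum through the sigmoid. The sketch below assumes the binary:logistic objective and trees in the nested shape of XGBoost's JSON dump. The TREES data is invented, and base_margin is a stand-in for whatever offset your model's base_score implies, which depends on the XGBoost version and settings.

```python
import math

# Hypothetical two-tree ensemble in the shape of XGBoost's JSON dump.
TREES = [
    {"nodeid": 0, "split": "f0", "split_condition": 2.0,
     "yes": 1, "no": 2, "missing": 1,
     "children": [{"nodeid": 1, "leaf": -0.5}, {"nodeid": 2, "leaf": 0.5}]},
    {"nodeid": 0, "split": "f1", "split_condition": 0.0,
     "yes": 1, "no": 2, "missing": 2,
     "children": [{"nodeid": 1, "leaf": -0.25}, {"nodeid": 2, "leaf": 0.25}]},
]


def eval_tree(node, x):
    """Follow one tree from the root down to a leaf and return the leaf value."""
    while "leaf" not in node:
        value = x[int(node["split"][1:])]  # "f0" -> x[0]; default names assumed
        if value is None or math.isnan(value):
            target = node["missing"]   # missing values take the learned default
        elif value < node["split_condition"]:
            target = node["yes"]
        else:
            target = node["no"]
        node = next(c for c in node["children"] if c["nodeid"] == target)
    return node["leaf"]


def predict_proba(trees, x, base_margin=0.0):
    """Sum the leaf values of all trees, then apply the logistic function."""
    margin = base_margin + sum(eval_tree(tree, x) for tree in trees)
    return 1.0 / (1.0 + math.exp(-margin))


print(predict_proba(TREES, [3.0, 1.0]))
```

Before relying on a port like this, compare its output with model.predict_proba on a handful of rows. XGBoost works with 32-bit floats internally, so values that sit very close to a split threshold may behave differently in a double-precision reimplementation.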
