Auto-Machine-Learning python equivalent code
Is there any way to extract the automatically generated machine learning pipeline from auto-sklearn as a standalone Python script?
Here is sample code using auto-sklearn:
import autosklearn.classification
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection  # train_test_split lives here in current scikit-learn

# Load the digits dataset and hold out a test split
digits = sklearn.datasets.load_digits()
X = digits.data
y = digits.target
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, random_state=1)

# Let auto-sklearn search for a model, then evaluate it on the held-out data
automl = autosklearn.classification.AutoSklearnClassifier()
automl.fit(X_train, y_train)
y_hat = automl.predict(X_test)
print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_hat))
It would be nice to have the equivalent Python code generated automatically somehow.
By comparison, when using TPOT we can get a standalone pipeline as follows:
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, train_size=0.75, test_size=0.25)
tpot = TPOTClassifier(generations=5, population_size=20, verbosity=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
tpot.export('tpot-mnist-pipeline.py')
Inspecting tpot-mnist-pipeline.py then shows the entire ML pipeline:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
# NOTE: Make sure that the class is labeled 'class' in the data file
tpot_data = np.recfromcsv('PATH/TO/DATA/FILE', delimiter='COLUMN_SEPARATOR')
features = tpot_data.view((np.float64, len(tpot_data.dtype.names)))
features = np.delete(features, tpot_data.dtype.names.index('class'), axis=1)
training_features, testing_features, training_classes, testing_classes = train_test_split(features, tpot_data['class'], random_state=42)
exported_pipeline = make_pipeline(
    KNeighborsClassifier(n_neighbors=3, weights="uniform")
)
exported_pipeline.fit(training_features, training_classes)
results = exported_pipeline.predict(testing_features)
The example above relates to an existing post about automating some shallow machine learning, found here.
There is no automated way.
You can store the object in pickle format and load it later.
import pickle
# Persist the fitted auto-sklearn estimator to disk
with open('automl.pkl', 'wb') as output:
    pickle.dump(automl, output)
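The snippet above only covers the saving half. A minimal sketch of the full save-and-reload round trip follows. The helper names `save_model` and `load_model` are hypothetical, and a plain dictionary stands in for the fitted `automl` object so the example runs without auto-sklearn installed; in practice you would pass `automl` instead.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize any picklable object (e.g. a fitted estimator) to disk."""
    with open(path, "wb") as output:
        pickle.dump(model, output)


def load_model(path):
    """Load a previously pickled object. Only unpickle files you trust."""
    with open(path, "rb") as source:
        return pickle.load(source)


# Stand-in for the fitted `automl` object (assumption, for illustration only).
stand_in_model = {"name": "automl placeholder", "n_neighbors": 3}

# Write to a temporary directory so the sketch leaves no files behind in the cwd.
model_path = os.path.join(tempfile.mkdtemp(), "automl.pkl")
save_model(stand_in_model, model_path)
restored_model = load_model(model_path)
```

Note that unpickling needs the same libraries importable (ideally at the same versions) as when the object was saved, so this preserves the fitted model rather than producing readable pipeline code.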
You can also debug the fit or predict methods to see what is happening.
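Short of stepping through fit in a debugger, a lighter way to see what was built is to ask the fitted estimator to describe itself. The sketch below assumes the object exposes `sprint_statistics()` and `show_models()`, which AutoSklearnClassifier does as far as I know, though the return format of `show_models()` varies between releases. The wrapper name `summarize_automl` is hypothetical.

```python
def summarize_automl(automl):
    """Collect descriptions from a fitted auto-sklearn estimator.

    Assumes `automl` provides sprint_statistics() and show_models();
    check this against the auto-sklearn version you have installed.
    """
    return {
        "statistics": automl.sprint_statistics(),  # summary of the search run
        "models": automl.show_models(),  # description of the final ensemble
    }


# Usage with the fitted estimator from the question (requires auto-sklearn):
# for section, description in summarize_automl(automl).items():
#     print(section, description, sep="\n")
```

The output describes the chosen models and their hyperparameters, which you could then translate by hand into a scikit-learn pipeline; it is not an exported script like the one TPOT writes.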