Best parameters found by Hyperopt are unsuitable
I am using hyperopt to search for the best parameters of an SVM classifier, but Hyperopt says the best 'kernel' is '0'. {'kernel': '0'} is obviously unsuitable.
Does anyone know whether this is my mistake or a bug in hyperopt?
The code is as follows.
from hyperopt import fmin, tpe, hp, rand
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn import svm
from sklearn.cross_validation import StratifiedKFold

parameter_space_svc = {
    'C': hp.loguniform("C", np.log(1), np.log(100)),
    'kernel': hp.choice('kernel', ['rbf', 'poly']),
    'gamma': hp.loguniform("gamma", np.log(0.001), np.log(0.1)),
}

from sklearn import datasets
iris = datasets.load_digits()
train_data = iris.data
train_target = iris.target

count = 0

def function(args):
    print(args)
    score_avg = 0
    skf = StratifiedKFold(train_target, n_folds=3, shuffle=True, random_state=1)
    for train_idx, test_idx in skf:
        train_X = iris.data[train_idx]
        train_y = iris.target[train_idx]
        test_X = iris.data[test_idx]
        test_y = iris.target[test_idx]
        clf = svm.SVC(**args)
        clf.fit(train_X, train_y)
        prediction = clf.predict(test_X)
        score = accuracy_score(test_y, prediction)
        score_avg += score
    score_avg /= len(skf)
    global count
    count = count + 1
    print("round %s" % str(count), score_avg)
    return -score_avg

best = fmin(function, parameter_space_svc, algo=tpe.suggest, max_evals=100)
print("best estimate parameters", best)
The output is as follows.
best estimate parameters {'C': 13.271912841932233, 'gamma': 0.0017394328334592358, 'kernel': 0}
First, you are using sklearn.cross_validation, which has been deprecated since version 0.18, so please update it to sklearn.model_selection.
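As a minimal sketch of that migration (using the same digits data as your code): in sklearn.model_selection, StratifiedKFold takes the number of folds in its constructor, the labels move to the split() call, and len(skf) becomes get_n_splits().

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold

digits = datasets.load_digits()
X, y = digits.data, digits.target

# New API: n_splits in the constructor, data/labels passed to split()
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
for train_idx, test_idx in skf.split(X, y):
    train_X, test_X = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]

# Replaces len(skf) from the old API
n_folds = skf.get_n_splits()
```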
Now for the main issue: the best returned by fmin always contains the index of any parameter defined with hp.choice, not the value itself. So in your case, 'kernel': 0 means the first value ('rbf') was selected as the best kernel.
See this question, which confirms this behaviour:
To get the original values from best, use the space_eval() function like this:
from hyperopt import space_eval
space_eval(parameter_space_svc, best)
Output:
{'C': 13.271912841932233, 'gamma': 0.0017394328334592358, 'kernel': 'rbf'}