Choosing the best model via k-fold cross-validation
I want to load the Iris data and select the best logistic regression model using the GridSearchCV function.
My work so far:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target
# Logistic regression
reg_log = LogisticRegression()
# Penalties
pen = ['l1', 'l2','none']
# Regularization strengths: 100 values from 1e-10 to 1e10
C = np.logspace(-10, 10, 100)
# Possibilities for those parameters
parameters= dict(C=C, penalty=pen)
# choosing best model based on 5-fold cross validation
Model = GridSearchCV(reg_log, parameters, cv=5)
# Fitting best model
Best_model = Model.fit(X, y)
And I get a lot of errors. Do you know what I'm doing wrong?
Since you are selecting different regularizations, you can see on the help page:
The ‘newton-cg’, ‘sag’, and ‘lbfgs’ solvers support only L2
regularization with primal formulation, or no regularization. The
‘liblinear’ solver supports both L1 and L2 regularization, with a dual
formulation only for the L2 penalty. The Elastic-Net regularization is
only supported by the ‘saga’ solver.
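One way to honor these solver/penalty constraints while still searching everything is to pass GridSearchCV a *list* of parameter grids, one per compatible combination. This is a sketch, not part of the original answer; the solver names and grid values are illustrative:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target

# Each dict is a separate sub-grid, so a solver only ever sees
# penalties it actually supports (saga: l1/l2, lbfgs: l2 only).
param_grid = [
    {"solver": ["saga"], "penalty": ["l1", "l2"], "C": np.logspace(-3, 3, 7)},
    {"solver": ["lbfgs"], "penalty": ["l2"], "C": np.logspace(-3, 3, 7)},
]
model = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
model.fit(X, y)
print(model.best_params_)
```

With a list of grids, no invalid solver/penalty pair is ever instantiated, so the fit no longer raises errors for unsupported combinations.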
I'm not sure you really want to grid-search over penalty = 'none' alongside penalized fits. So if you use saga and increase the iterations:
import pandas as pd

reg_log = LogisticRegression(solver="saga", max_iter=1000)
pen = ['l1', 'l2']
C = [0.1, 0.001]
parameters = dict(C=C, penalty=pen)
Model = GridSearchCV(reg_log, parameters, cv=5)
Best_model = Model.fit(X, y)
res = pd.DataFrame(Best_model.cv_results_)
res[['param_C', 'param_penalty', 'mean_test_score']]
  param_C param_penalty  mean_test_score
0     0.1            l1         0.753333
1     0.1            l2         0.833333
2   0.001            l1         0.333333
3   0.001            l2         0.700000
It works fine. If you get more errors from your penalty values, inspect them and make sure they aren't some extreme values.
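Once the search has run, the winning configuration and the refitted estimator can be read off directly from the standard GridSearchCV attributes. A self-contained sketch of the same search as above:

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target

model = GridSearchCV(
    LogisticRegression(solver="saga", max_iter=1000),
    {"C": [0.1, 0.001], "penalty": ["l1", "l2"]},
    cv=5,
)
best = model.fit(X, y)

# best_params_ / best_score_ identify the winner from the table above;
# best_estimator_ is already refitted on all of X and ready to predict.
print(best.best_params_)                     # {'C': 0.1, 'penalty': 'l2'}
print(round(best.best_score_, 3))            # 0.833
print(best.best_estimator_.predict(X[:2]))
```

This matches the cv_results_ table: C=0.1 with the l2 penalty gives the highest mean test score (0.833).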