Scikit-learn GridSearchCV fails to fit EM model with silhouette_score due to cryptic TypeError
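The following code produces: TypeError: __call__() takes at least 4 arguments (3 given).
I have instantiated a clustering model and created a scoring method suited to clustering. I supply a simple data set to fit and a dictionary of parameters for the grid search. I struggle to see where the error is, and the traceback is no help.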
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import silhouette_score, make_scorer
parameters = {'n_components': range(1, 6), 'covariance_type': ['full', 'tied', 'diag', 'spherical']}
silhouette_scorer = make_scorer(silhouette_score)
gm = GaussianMixture()
clusterer = GridSearchCV(gm, parameters, scoring=silhouette_scorer)
clusterer.fit(data)
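The traceback is cryptic, and as far as I can tell I am following exactly the syntax and workflow described in the sklearn documentation for GridSearchCV. What am I doing wrong here that causes this error?
The data looks like this: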
     Dimension 1  Dimension 2
0      -0.837489    -1.076500
1       1.746697     0.193893
2      -0.141929    -2.772168
3      -2.809583    -3.645926
4      -2.070939    -2.485348
..           ...          ...
401    -0.477716    -0.347241
402     0.742407     0.005890
403    -2.152810     5.385891
404    -0.074108    -1.691082
405     0.555363    -0.002872
416    -1.597249    -0.804744
Here are the last few lines of the traceback:
/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
132
133 def __len__(self):
/usr/local/lib/python2.7/site-packages/sklearn/model_selection/_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, error_score)
258 else:
259 fit_time = time.time() - start_time
--> 260 test_score = _score(estimator, X_test, y_test, scorer)
261 score_time = time.time() - start_time - fit_time
262 if return_train_score:
/usr/local/lib/python2.7/site-packages/sklearn/model_selection/_validation.pyc in _score(estimator, X_test, y_test, scorer)
284 """Compute the score of an estimator on a given test set."""
285 if y_test is None:
--> 286 score = scorer(estimator, X_test)
287 else:
288 score = scorer(estimator, X_test, y_test)
TypeError: __call__() takes at least 4 arguments (3 given)
Well, the problem is that you are passing the wrong kind of function to make_scorer. The documentation for make_scorer says:
score_func - Score function (or loss function) with signature score_func(y_true, y_pred, **kwargs)
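You are passing it silhouette_score, whose signature is (X, labels, metric='euclidean' ...). That does not match what make_scorer requires. In addition, fit is called without y, so GridSearchCV calls scorer(estimator, X_test), as the traceback shows, while a scorer built by make_scorer also expects the true labels as a further argument. Hence "takes at least 4 arguments (3 given)".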
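The mismatch can be reproduced without GridSearchCV. The following is a minimal sketch, assuming a recent scikit-learn on Python 3; the random array `X` is a synthetic stand-in for the question's data, and the exact wording of the TypeError differs from the Python 2.7 message in the traceback.

```python
import numpy as np
from sklearn.metrics import make_scorer, silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the question's data (an assumption of this sketch).
X = np.random.RandomState(0).normal(size=(50, 2))
gm = GaussianMixture(n_components=2, random_state=0).fit(X)

# A scorer built by make_scorer must be called as scorer(estimator, X, y_true).
scorer = make_scorer(silhouette_score)

raised = False
try:
    # This mirrors what GridSearchCV does when fit() receives no y.
    scorer(gm, X)
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

The call fails before silhouette_score is ever reached, which is why the traceback says nothing about the metric itself.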
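To resolve the error, try replacing the make_scorer wrapper with a scorer that GridSearchCV can call without true labels, that is, a callable with the signature scorer(estimator, X, y=None).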
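One possible way to keep silhouette_score is to skip make_scorer and hand GridSearchCV a plain callable, which it accepts for `scoring` as long as the signature is (estimator, X, y). This is a sketch under stated assumptions, not the only fix: the function name `silhouette_scorer`, the synthetic `make_blobs` data, the fallback value of -1.0, and `cv=3` are all choices made here for illustration. The grid starts at 2 components because the silhouette is undefined for a single cluster.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import GridSearchCV


def silhouette_scorer(estimator, X, y=None):
    """Score a fitted clusterer by the silhouette of its predicted labels."""
    labels = estimator.predict(X)
    n_labels = len(np.unique(labels))
    # silhouette_score requires 2 <= n_labels <= n_samples - 1; outside that
    # range return the worst value (an assumption of this sketch).
    if n_labels < 2 or n_labels > len(X) - 1:
        return -1.0
    return silhouette_score(X, labels)


# Synthetic stand-in for the question's two-column data set.
data, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=0)

parameters = {
    "n_components": list(range(2, 6)),  # one cluster has no silhouette
    "covariance_type": ["full", "tied", "diag", "spherical"],
}

clusterer = GridSearchCV(
    GaussianMixture(random_state=0),
    parameters,
    scoring=silhouette_scorer,  # plain callable, no make_scorer
    cv=3,
)
clusterer.fit(data)  # no y: the scorer receives y=None
print(clusterer.best_params_)
```

Note that GridSearchCV scores each candidate on the held-out fold, so the silhouette here is computed on labels predicted for data the model was not fitted on.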