SVC not finishing on MNIST
I'm using the following code to load MNIST and run an SVM:
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.svm import SVC

# Download MNIST: 70,000 28x28 images flattened to 784 features
mnist = fetch_openml('mnist_784', version=1)
X, y = mnist['data'], mnist['target']
y = y.astype(np.uint8)

# Standard MNIST split: first 60,000 for training, last 10,000 for testing
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

svm_clf = SVC()
svm_clf.fit(X_train, y_train)
I left it running last night. Three hours later it still hadn't finished.
I get this FutureWarning:
FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
but I can't imagine that leaving gamma unset would affect it this much.
I'm running Python 3.6.7 in Jupyter 5.7.8.
There really is no workaround here: training time grows with the number of training vectors.
Reference: https://scikit-learn.org/stable/modules/svm.html#complexity
For the record, SVMs are very good at solving these problems (see here: https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html), but they slow down considerably when the dataset is large.
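The linked example uses scikit-learn's small built-in digits dataset rather than full MNIST. A minimal sketch along those lines (the dataset choice, split, and explicit gamma='scale' are my additions, not from the post) shows SVC training in well under a second at this scale:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# ~1,800 8x8 digit images -- three orders of magnitude fewer
# pixels-times-samples than MNIST's 60,000 x 784
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42)

# Setting gamma explicitly also silences the FutureWarning from the post
clf = SVC(gamma='scale')
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

At MNIST scale the same fit call is what runs for hours, since the kernel SVM's fit time grows at least quadratically in the number of samples.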
Edit 1: the sklearn website says this:
The implementation is based on libsvm. The fit time scales at least
quadratically with the number of samples and may be impractical beyond
tens of thousands of samples. For large datasets consider using
sklearn.linear_model.LinearSVC or sklearn.linear_model.SGDClassifier
instead, possibly after a sklearn.kernel_approximation.Nystroem
transformer.
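A hedged sketch of the linear-SVM route those docs suggest, again using the small digits dataset as a stand-in for MNIST (note that LinearSVC is imported from sklearn.svm in current releases; the sklearn.linear_model path in the quote is from older docs):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# A linear SVM trained with liblinear scales roughly linearly in the
# number of samples, so it stays practical even at MNIST size.
# Scaling the features first also addresses the FutureWarning's point
# about unscaled inputs.
clf = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

To keep a nonlinear decision boundary without the quadratic cost, the same docs suggest putting a kernel_approximation.Nystroem transformer in front of the linear model instead of the StandardScaler-only pipeline above.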