How to do PCA and SVM for classification in Python
I am working on a classification task. My data is a list with two elements:
Data=[list1,list2]
list1 has size 1000*784: 1000 images, each reshaped from 28*28 to a vector of 784 values.
list2 has size 1000*1 and holds the label of each image.
I applied PCA with the following code:
from matplotlib.mlab import PCA
results = PCA(Data[0])
The output is:
Out[40]: <matplotlib.mlab.PCA instance at 0x7f301d58c638>
Now I want to use SVM as the classifier.
I need to add the labels, so I built new data for the SVM like this:
newData=[results,Data[1]]
I don't know how to use SVM from here.
I think what you are looking for is http://scikit-learn.org/. It's a Python library where you'll find PCA, SVM and other cool algorithms for machine learning. It has a good tutorial, but I recommend you follow this guy's tutorial: http://www.astroml.org/sklearn_tutorial/general_concepts.html . For your particular question, the SVM page of scikit-learn should suffice: http://scikit-learn.org/stable/modules/svm.html
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
# sklearn.cross_validation was removed; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
Data=[list1,list2]
X = Data[0]
y = np.ravel(Data[1])  # flatten the 1000*1 label column to shape (1000,)
# hold out 40% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)
pca = PCA(n_components=2)# adjust yourself
pca.fit(X_train)  # fit PCA on the training split only
X_t_train = pca.transform(X_train)
X_t_test = pca.transform(X_test)
clf = SVC()
clf.fit(X_t_train, y_train)
print('score', clf.score(X_t_test, y_test))
print('pred label', clf.predict(X_t_test))
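The "adjust yourself" comment on n_components leaves open how to pick the value. One common approach, sketched below, is to look at the cumulative explained variance. This is only an illustration under some assumptions: the digits dataset bundled with scikit-learn stands in for the 1000*784 image data, and the 95% threshold is an arbitrary choice, not something from the question.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Assumption: load_digits (8*8 images flattened to 64 features) is only a
# stand-in for the 1000*784 image data described in the question.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Fit PCA with all components on the training split and accumulate the
# fraction of variance explained by the first k components.
pca_full = PCA().fit(X_train)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches 95%
# (the 95% threshold is an arbitrary illustrative choice).
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)
print("components needed for 95% variance:", n_components_95)

# Passing a float in (0, 1) asks PCA to pick the component count itself;
# this requires svd_solver="full".
pca_95 = PCA(n_components=0.95, svd_solver="full").fit(X_train)
print("components chosen by PCA:", pca_95.n_components_)
```

The resulting count can then be passed as n_components in the code above. Explained variance is only a heuristic; the value that gives the best classification score may differ.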
Here is the same PCA + SVC code, tested on another dataset (iris).
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.svm import SVC
# sklearn.cross_validation was removed; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
iris = datasets.load_iris()
X = iris.data
y = iris.target
# hold out 40% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)
pca = PCA(n_components=2)# adjust yourself
pca.fit(X_train)  # fit PCA on the training split only
X_t_train = pca.transform(X_train)
X_t_test = pca.transform(X_test)
clf = SVC()
clf.fit(X_t_train, y_train)
print('score', clf.score(X_t_test, y_test))
print('pred label', clf.predict(X_t_test))
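The fit/transform/fit steps above can also be chained into a single estimator, which makes cross-validation easier because PCA is refit on each training fold. The sketch below is one possible way to do that, not part of the original answer: the StandardScaler step and the 5-fold setting are my own assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Assumption: feature scaling is added here because both PCA and SVC are
# sensitive to feature scale; the original answer does not scale.
model = make_pipeline(StandardScaler(), PCA(n_components=2), SVC())

# Each fold refits the scaler and PCA on its own training part, so no
# information from the held-out part leaks into the projection.
scores = cross_val_score(model, X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

After `model.fit(X_train, y_train)`, calling `model.predict(X_test)` applies scaling, projection and classification in one step.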
Based on these references: