Why do predictions and scores return different results in classification using scikit-learn?
I wrote a very simple multiclass classifier based on the iris dataset. Here is the code:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import classification_report
# Load the data
iris = load_iris()
X = iris.data
y = iris.target
# Use label_binarize to get a multi-label-like setting
Y = label_binarize(y, classes=[0, 1, 2])
n_classes = Y.shape[1]
# Add noisy features
random_state = np.random.RandomState(0)
n_samples, n_features = X.shape
X = np.concatenate([X, random_state.randn(n_samples, 200 * n_features)], axis=1)
# Split into training and test
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.5, random_state=0
)
# Create classifier
classifier = OneVsRestClassifier(
    make_pipeline(StandardScaler(), LinearSVC(random_state=random_state))
)
# Train the model
classifier.fit(X_train, y_train)
My goal is to predict the values of the test set in two ways:
- using the classifier.predict() function, and defining y_pred;
- using classifier.decision_function() to get the scores, then picking the highest one for each instance, and defining y_pred_.
This is how I did it:
# Get the scores for the test set
y_score = classifier.decision_function(X_test)
# Make predictions
y_pred = classifier.predict(X_test)
y_pred_ = label_binarize(np.argmax(y_score, axis=1), classes=[0, 1, 2])
However, when I compute the classification reports, I get slightly different results, whereas I would expect them to be identical, since the predictions are based on the scores obtained from the decision function, as shown in the documentation (line 789). Here are both reports:
print(classification_report(y_test, y_pred))
print(classification_report(y_test, y_pred_))
              precision    recall  f1-score   support

           0       0.54      0.62      0.58        21
           1       0.44      0.40      0.42        30
           2       0.36      0.50      0.42        24

   micro avg       0.44      0.49      0.47        75
   macro avg       0.45      0.51      0.47        75
weighted avg       0.45      0.49      0.46        75
 samples avg       0.39      0.49      0.42        75

              precision    recall  f1-score   support

           0       0.42      0.38      0.40        21
           1       0.52      0.47      0.49        30
           2       0.38      0.46      0.42        24

   micro avg       0.44      0.44      0.44        75
   macro avg       0.44      0.44      0.44        75
weighted avg       0.45      0.44      0.44        75
 samples avg       0.44      0.44      0.44        75
What am I doing wrong? Could you suggest a smart, elegant solution so that both reports are identical?
OneVsRestClassifier assumes that you expect a multi-label result, i.e. a single input can have more than one positive label. The result therefore differs from applying argmax to the output of decision_function.
Try
print(y_pred[0])
print(y_pred_[0])
Output:
[0 1 1]
[0 0 1]
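To make the disagreement concrete, here is a minimal sketch (assuming the fitted classifier and X_test from the question): predict() thresholds each per-class score at 0 independently, so a row can end up with zero, one, or several positive labels, while argmax always keeps exactly one.
# Sketch: derive both behaviours from the same score matrix
scores = classifier.decision_function(X_test)  # shape (n_samples, n_classes)
multilabel = (scores > 0).astype(int)          # any number of positives per row
single = np.zeros_like(multilabel)
single[np.arange(len(scores)), scores.argmax(axis=1)] = 1  # exactly one positive per row
print(multilabel[0], single[0])  # e.g. [0 1 1] [0 0 1]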
For multi-label classification, you should use

y_pred_ = np.where(classifier.decision_function(X_test) > 0, 1, 0)

to replicate the output of the predict() method, since in this case the classes are not mutually exclusive, i.e. a given sample can belong to multiple classes.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import classification_report
# Load the data
iris = load_iris()
X = iris.data
y = label_binarize(iris.target, classes=[0, 1, 2])
# Split the data into training and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)
# Create classifier
classifier = OneVsRestClassifier(
    make_pipeline(StandardScaler(), LinearSVC(random_state=0))
)
# Train the model
classifier.fit(X_train, y_train)
# Make predictions
y_pred = classifier.predict(X_test)
y_pred_ = np.where(classifier.decision_function(X_test) > 0, 1, 0)
print(classification_report(y_test, y_pred))
#               precision    recall  f1-score   support
#
#            0       1.00      1.00      1.00        21
#            1       0.58      0.37      0.45        30
#            2       0.95      0.83      0.89        24
#
#    micro avg       0.85      0.69      0.76        75
#    macro avg       0.84      0.73      0.78        75
# weighted avg       0.82      0.69      0.74        75
#  samples avg       0.66      0.69      0.67        75
print(classification_report(y_test, y_pred_))
#               precision    recall  f1-score   support
#
#            0       1.00      1.00      1.00        21
#            1       0.58      0.37      0.45        30
#            2       0.95      0.83      0.89        24
#
#    micro avg       0.85      0.69      0.76        75
#    macro avg       0.84      0.73      0.78        75
# weighted avg       0.82      0.69      0.74        75
#  samples avg       0.66      0.69      0.67        75
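As a quick sanity check (a small sketch using the variables defined just above), the two label matrices can be compared element-wise:
# The thresholded scores should now match predict() exactly
print(np.array_equal(y_pred, y_pred_))  # expected: True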
For multi-class classification, you should instead use

y_pred_ = np.argmax(classifier.decision_function(X_test), axis=1)

in your code, since in this case the classes are mutually exclusive, i.e. each sample is assigned to exactly one class.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import classification_report
# Load the data
iris = load_iris()
X = iris.data
y = iris.target
# Split into training and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)
# Create classifier
classifier = OneVsRestClassifier(
    make_pipeline(StandardScaler(), LinearSVC(random_state=0))
)
# Train the model
classifier.fit(X_train, y_train)
# Make predictions
y_pred = classifier.predict(X_test)
y_pred_ = np.argmax(classifier.decision_function(X_test), axis=1)
print(classification_report(y_test, y_pred))
#               precision    recall  f1-score   support
#
#            0       1.00      1.00      1.00        21
#            1       0.85      0.73      0.79        30
#            2       0.71      0.83      0.77        24
#
#     accuracy                           0.84        75
#    macro avg       0.85      0.86      0.85        75
# weighted avg       0.85      0.84      0.84        75
print(classification_report(y_test, y_pred_))
#               precision    recall  f1-score   support
#
#            0       1.00      1.00      1.00        21
#            1       0.85      0.73      0.79        30
#            2       0.71      0.83      0.77        24
#
#     accuracy                           0.84        75
#    macro avg       0.85      0.86      0.85        75
# weighted avg       0.85      0.84      0.84        75
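And a corresponding check for the multi-class case (again a sketch using the variables from the block above) confirms that taking the argmax over the decision scores reproduces predict():
# argmax over the decision scores should reproduce predict() here
print((y_pred == y_pred_).all())  # expected: True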