Inconsistent number of samples error in SVM accuracy calculation

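I am trying to compute the accuracy score of an SVM that uses a Laplacian kernel (as a precomputed kernel). However, when I compute the accuracy score, I get the error shown below.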

My code:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.metrics.pairwise import laplacian_kernel

#Load the iris data
iris_data = load_iris()

#Split the data and target
X = iris_data.data
y = iris_data.target

#Convert X and y to a numpy array
X = np.array(X)
y = np.array(y)

#Perform train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42, shuffle=True)

#Using Laplacian kernel - https://scikit-learn.org/stable/modules/metrics.html#laplacian-kernel
K = np.array(laplacian_kernel(X_train, gamma=.5))
svm = SVC(kernel='precomputed').fit(K, np.ravel(y_train))
pred_y = svm.predict(K)

#Print accuracy score - here is where the error is happening.
print(accuracy_score(y_test, pred_y))

When I run this code, I get the following error:

Traceback (most recent call last):
  File "/Users/user/Desktop/Research/Src/Laplace.py", line 36, in <module>
    print(accuracy_score(y_test, pred_y))
  File "/Users/user/miniforge3/envs/user_venv/lib/python3.8/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/Users/user/miniforge3/envs/user/lib/python3.8/site-packages/sklearn/metrics/_classification.py", line 202, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/Users/user/miniforge3/envs/user/lib/python3.8/site-packages/sklearn/metrics/_classification.py", line 83, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "/Users/user/miniforge3/envs/user/lib/python3.8/site-packages/sklearn/utils/validation.py", line 262, in check_consistent_length
    raise ValueError("Found input variables with inconsistent numbers of"
ValueError: Found input variables with inconsistent numbers of samples: [45, 105]

So how can I fix this error?

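You computed pred_y from the training kernel K, so it contains 105 predictions (one per training sample), while y_test contains 45 elements. accuracy_score cannot compare arrays of different lengths, hence the [45, 105] in the error message.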
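The mismatch can be seen by printing the shapes involved. This is only a minimal sketch that reuses the split from the question; the names K_train and K_test are illustrative and do not appear in the original code. With kernel='precomputed', the matrix passed to predict is expected to have one row per sample to predict and one column per training sample.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics.pairwise import laplacian_kernel

# Same data and split as in the question
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, shuffle=True
)

# Kernel between training samples: one row and one column per training sample
K_train = laplacian_kernel(X_train, gamma=.5)

# Kernel between test samples (rows) and training samples (columns)
K_test = laplacian_kernel(X_test, X_train, gamma=.5)

print(K_train.shape)  # rows = training samples
print(K_test.shape)   # rows = test samples, columns = training samples
print(y_test.shape)   # number of labels that predictions must match
```

Predicting on K_train yields one label per training sample, which is why the result cannot be compared with y_test.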

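You need to add one step: compute the kernel between the test samples and the training samples, and predict on that matrix: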

#user3046211's code

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.metrics.pairwise import laplacian_kernel

#Load the iris data
iris_data = load_iris()

#Split the data and target
X = iris_data.data
y = iris_data.target

#Convert X and y to a numpy array
X = np.array(X)
y = np.array(y)

#Perform train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42, shuffle=True)

#Using Laplacian kernel - https://scikit-learn.org/stable/modules/metrics.html#laplacian-kernel
K = np.array(laplacian_kernel(X_train, gamma=.5))
svm = SVC(kernel='precomputed').fit(K, np.ravel(y_train))
pred_y = svm.predict(K)

#Training accuracy - pred_y has 105 entries, so compare it with y_train (comparing with y_test raises the ValueError)
print(accuracy_score(y_train, pred_y))

# NEW CODE STARTS HERE
K_test = np.array(laplacian_kernel(X=X_test,Y=X_train, gamma=.5))
pred_y_test = svm.predict(K_test)
print(accuracy_score(y_test, pred_y_test))
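As an alternative that is not part of the original answer, SVC also accepts a callable as its kernel argument, in which case scikit-learn builds the train and test kernel matrices itself and the shape bookkeeping goes away. The sketch below assumes the same split and gamma as above; the name lap_kernel is hypothetical.

```python
from functools import partial

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.metrics.pairwise import laplacian_kernel
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same data and split as in the question
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, shuffle=True
)

# Hypothetical helper: Laplacian kernel with gamma fixed to the value used above
lap_kernel = partial(laplacian_kernel, gamma=.5)

# Passing a callable lets SVC compute the kernel from the raw feature matrices
svm = SVC(kernel=lap_kernel).fit(X_train, y_train)

# predict now takes the raw test features, not a precomputed matrix
pred_y_test = svm.predict(X_test)
test_accuracy = accuracy_score(y_test, pred_y_test)
print(test_accuracy)
```

The precomputed route is still useful when the kernel matrix is expensive to build and is reused across several models.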