scikit-learn train_test_split: ValueError: Found input variables with inconsistent numbers of samples: [4999, 5000]

Here is my code:

print(len(image_dataset.data))
print(len(phylum_target))
X_train, X_test, y_train, y_test = train_test_split(image_dataset.data, phylum_target, test_size=0.2, random_state=109)

Here is the output and the error:

5000
5000
Traceback (most recent call last):
  File "Image_SVM_run_only.py", line 298, in <module>
    X_train_temp, X_test_temp, y_train_temp, y_test_temp = train_test_split(image_dataset.data, phylum_target, test_size=0.2,random_state=109)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/model_selection/_split.py", line 2127, in train_test_split
    arrays = indexable(*arrays)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py", line 293, in indexable
    check_consistent_length(*result)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py", line 257, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [4999, 5000]

I get this error even though the data and the target both have length 5000. Please help me T.T

This is the minimal reproducible example I could piece together from your information, and it works fine:

import numpy as np
from sklearn.model_selection import train_test_split

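# Dummy features and targets with the same number of samples (5000)
X = np.zeros((5000, 49152))
y = np.zeros((5000, 1))
# Same split parameters as in the question
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=109)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# prints: (4000, 49152) (1000, 49152) (4000, 1) (1000, 1)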
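
Since the dummy arrays split without error, the mismatch most likely comes from the real objects. Below is a small diagnostic sketch in plain Python. It assumes (my understanding, not something confirmed in the question) that scikit-learn's length check uses shape[0] when an object has a shape attribute and falls back to len() otherwise, so the two can disagree. The names count_samples, report and MismatchedContainer are hypothetical helpers made up for illustration.

```python
def count_samples(obj):
    """Assumed rule: prefer shape[0] when present, otherwise use len()."""
    shape = getattr(obj, "shape", None)
    if shape is not None and len(shape) > 0 and isinstance(shape[0], int):
        return shape[0]
    return len(obj)


def report(**named_inputs):
    """Return {name: (type name, len(), sample count)} for each input."""
    return {
        name: (type(obj).__name__, len(obj), count_samples(obj))
        for name, obj in named_inputs.items()
    }


class MismatchedContainer:
    """Hypothetical object whose len() and shape[0] disagree."""

    shape = (4999, 3)

    def __len__(self):
        return 5000


if __name__ == "__main__":
    data = MismatchedContainer()  # stand-in for image_dataset.data
    target = [0] * 5000           # stand-in for phylum_target
    for name, info in report(data=data, target=target).items():
        print(name, info)
```

In the real script, calling report(data=image_dataset.data, target=phylum_target) immediately before line 298 would show the type and both counts for each input. Note also that the failing line in the traceback assigns to X_train_temp and friends, while the snippet in the question assigns to X_train, so the two print statements may not be sitting right before the call that actually fails.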