Problem when splitting data: KeyError: "None of [Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], dtype='int64')] are in the [columns]"

Question

I am trying to perform a train-test split on some data, wine.data, but I run into a problem when initializing X and y:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

from sklearn.model_selection import cross_val_score

wine =  pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data")

print(wine.shape)
wine.head()
X = wine[np.arange(1,14)]
y = wine[0]

The rest of the code below this section does not run, because I get this error message:

KeyError: "None of [Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], dtype='int64')] are in the [columns]"

I have tried to fix this by changing the range of values for X and by changing the np.arange call, but neither helped.

Any help or suggestions would be appreciated, thanks!

Answer 1

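You forgot to pass header=None to pd.read_csv. The CSV you are downloading has no header row, so if you do not specify header=None, the first row of data is used as the header. The column labels then become strings taken from that row rather than the integers 0 to 13, which is why the integer labels 1 to 13 are not found in the columns.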

Try:

wine =  pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
    header=None
)
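
The difference can be checked offline with a minimal sketch. The three-row string `raw` below is a hypothetical stand-in for wine.data (14 comma-separated values per line, no header row), and the variable names `with_default` and `without_header` are made up for illustration:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for wine.data: 14 values per line, no header row.
raw = (
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,0.28,2.29,5.64,1.04,3.92,1065\n"
    "1,13.2,1.78,2.14,11.2,100,2.65,2.76,0.26,1.28,4.38,1.05,3.4,1050\n"
    "2,12.37,0.94,1.36,10.6,88,1.98,0.57,0.28,0.42,1.95,1.05,1.82,520\n"
)

# Default behaviour: the first data row is consumed as string column labels.
with_default = pd.read_csv(io.StringIO(raw))
print(with_default.shape)  # (2, 14): one row was used as the header

# header=None: all rows are data and the columns are labelled 0..13.
without_header = pd.read_csv(io.StringIO(raw), header=None)
print(without_header.shape)  # (3, 14)

# Label-based selection with integers now works as in the question.
X = without_header[np.arange(1, 14)]
y = without_header[0]
print(X.shape, len(y))  # (3, 13) 3
```

With the default header handling, `with_default[np.arange(1, 14)]` raises the same kind of KeyError as in the question, because none of the integer labels exist among the string column names.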

Answer 2

Are you trying to select columns by position? If so, try:

X = wine.iloc[:,np.arange(1,14)]
y = wine.iloc[:, 0]
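
One caveat with positional selection alone: it avoids the KeyError, but if the file is still read with the default header handling, the first data row remains consumed as column labels. The sketch below illustrates this on a hypothetical three-row stand-in for wine.data (the string `raw` is made up for illustration, not the real file):

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for wine.data: 14 values per line, no header row.
raw = (
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,0.28,2.29,5.64,1.04,3.92,1065\n"
    "1,13.2,1.78,2.14,11.2,100,2.65,2.76,0.26,1.28,4.38,1.05,3.4,1050\n"
    "2,12.37,0.94,1.36,10.6,88,1.98,0.57,0.28,0.42,1.95,1.05,1.82,520\n"
)

# Read with the default header handling, as in the question.
wine = pd.read_csv(io.StringIO(raw))

# Positional selection ignores the column labels, so no KeyError is raised.
X = wine.iloc[:, np.arange(1, 14)]  # same columns as wine.iloc[:, 1:14]
y = wine.iloc[:, 0]

print(X.shape)     # (2, 13): only two of the three rows remain as data
print(y.tolist())  # [1, 2]: the first row's class label became a column name
```

Combining header=None with .iloc keeps every row as data while still selecting by position.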

python

numpy

pandas

train-test-split