tf.data.experimental.make_csv_dataset: ValueError: `label_name` provided must be one of the columns

I am trying to build a dataset to use with Keras for the Titanic example on Kaggle. This is what I have done so far:

import pandas as pd
import tensorflow as tf

train_data = pd.read_csv("/kaggle/input/titanic/train.csv")

all_columns = ['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'] # all the column names present in the csv

feature_columns = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'] # columns that I want to use as features for the training part


train_data = tf.data.experimental.make_csv_dataset(
    "/kaggle/input/titanic/train.csv",
    batch_size=12,
    column_names=all_columns,
    select_columns=feature_columns,
    label_name='Survived', # name of the 'label' column
    na_value="?",
    num_epochs=1,
    ignore_errors=False)

But when I run it, I get this error:

495   if label_name is not None and label_name not in column_names:
496     raise ValueError("`label_name` provided must be one of the columns.")
497 
498   def filename_to_dataset(filename):

ValueError: `label_name` provided must be one of the columns.

However, as you can see, the label_name value is 'Survived', and it is present in all_columns (and therefore in column_names).

Any ideas?

Best,

Aymeric

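label_name must be included in select_columns. When select_columns is given, make_csv_dataset narrows column_names down to the selected columns before it runs the label_name check shown in your traceback, so a label column that was not selected is reported as missing even though it exists in the file.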

Try:

train_data = tf.data.experimental.make_csv_dataset(
    "/kaggle/input/titanic/train.csv",
    batch_size=12,
    column_names=all_columns,
    select_columns=feature_columns + ['Survived'],
    label_name='Survived', # name of the 'label' column
    na_value="?",
    num_epochs=1,
    ignore_errors=False)
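
To see why the original call fails, here is a minimal pure-Python sketch of the order of operations described above. The helper resolve_columns is hypothetical and is not part of TensorFlow; it only imitates the "filter by select_columns, then check label_name" sequence, assuming that is how the library behaves, using the column lists from the question.

```python
def resolve_columns(column_names, select_columns=None, label_name=None):
    """Hypothetical stand-in for the column validation, not TensorFlow code."""
    if select_columns is not None:
        # Keep only the selected columns, preserving their order in the file.
        selected = set(select_columns)
        column_names = [name for name in column_names if name in selected]
    # The label is checked against the already-filtered list.
    if label_name is not None and label_name not in column_names:
        raise ValueError("`label_name` provided must be one of the columns.")
    return column_names


all_columns = ['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age',
               'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked']
feature_columns = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket',
                   'Fare', 'Cabin', 'Embarked']

# Original call: 'Survived' is filtered out before the check, so it fails.
try:
    resolve_columns(all_columns, feature_columns, 'Survived')
    failed = False
except ValueError as err:
    failed = True
    print("label not selected:", err)

# Fixed call: the label is part of the selection, so the check passes.
kept = resolve_columns(all_columns, feature_columns + ['Survived'], 'Survived')
print("label selected:", kept)
```

With the label added to select_columns, make_csv_dataset should then split it off from the features, so each element of the dataset is a (features, label) pair.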