What do KeyErrors mean and how can I resolve them?
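I'm fairly new to Python and coding, and I'm working on a predictive model for a Kaggle prediction competition. I'm trying to write code that drops a variable I don't think is important for predicting survival in the sinking of the Titanic (the Kaggle competition prompt). FYI, 'Cabin' is a defined term, since it is a variable and part of the given information.
My code is: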
import re
deck = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "U": 8}
data = [train_df, test_df]
for dataset in data:
    dataset['Cabin'] = dataset['Cabin'].fillna("U0")
    dataset['Deck'] = dataset['Cabin'].map(lambda x: re.compile("([a-zA-Z]+)").search(x).group())
    dataset['Deck'] = dataset['Deck'].map(deck)
    dataset['Deck'] = dataset['Deck'].fillna(0)
    dataset['Deck'] = dataset['Deck'].astype(int)
train_df = train_df.drop(['Cabin'], axis=1)
test_df = test_df.drop(['Cabin'], axis=1)
The error I receive is:
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2894 try:
-> 2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Cabin'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-52-b7c547f14770> in <module>
4
5 for dataset in data:
----> 6 dataset['Cabin'] = dataset['Cabin'].fillna("U0")
7 dataset['Deck'] = dataset['Cabin'].map(lambda x: re.compile("([a-zAZ]+)").search(x).group())
8 dataset['Deck'] = dataset['Deck'].map(deck)
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2900 if self.columns.nlevels > 1:
2901 return self._getitem_multilevel(key)
-> 2902 indexer = self.columns.get_loc(key)
2903 if is_integer(indexer):
2904 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
-> 2897 raise KeyError(key) from err
2898
2899 if tolerance is not None:
KeyError: 'Cabin'
I'm not entirely sure what the error means or how to fix it, so I'd greatly appreciate it if anyone could help!
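Most of the time, a Python KeyError is raised because a key could not be found in a dictionary or a dictionary subclass. A pandas DataFrame behaves the same way for column lookups: dataset['Cabin'] raises KeyError: 'Cabin' when no column has that name.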
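To make this concrete, here is a minimal sketch showing that a plain dict and a DataFrame both raise KeyError for a missing key or column. The data and the lookup_error helper are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Hypothetical data, for illustration only.
deck = {"A": 1, "B": 2}
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [30, 25]})


def lookup_error(container, key):
    """Return the KeyError raised by container[key], or None if the lookup succeeds."""
    try:
        container[key]
    except KeyError as err:
        return err
    return None


dict_error = lookup_error(deck, "Z")      # missing dictionary key
column_error = lookup_error(df, "Cabin")  # missing DataFrame column
no_error = lookup_error(df, "Age")        # existing column, so no exception

print(repr(dict_error), repr(column_error), no_error)
```

The value shown after "KeyError:" in the traceback is the key that could not be found, which is why the message in the question ends with 'Cabin'.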
Check whether the train_df and test_df DataFrames both have a column named 'Cabin'.
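One way to run that check is to test for membership in df.columns before indexing. The sketch below uses small made-up frames as stand-ins for the real Kaggle data, and has_column is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical stand-ins for train_df and test_df; the real frames come from the Kaggle CSV files.
train_df = pd.DataFrame({"PassengerId": [1, 2], "Cabin": ["C85", None]})
test_df = pd.DataFrame({"PassengerId": [3, 4]})  # no 'Cabin' column here


def has_column(df, name):
    """Return True if the DataFrame has a column with exactly this name."""
    return name in df.columns


for label, df in [("train_df", train_df), ("test_df", test_df)]:
    # Printing the column list also exposes typos, different casing, or stray spaces.
    print(label, list(df.columns), has_column(df, "Cabin"))
```

Column names must match exactly, so 'cabin' or ' Cabin' would not count as 'Cabin'.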
Here is an example:
import re
import pandas as pd

# Load the data fresh so that the 'Cabin' column is present.
test_df = pd.read_csv("test.csv")
train_df = pd.read_csv("train.csv")

deck = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "U": 8}
data = [train_df, test_df]
for dataset in data:
    # Missing cabins become "U0", so their deck letter is "U".
    dataset['Cabin'] = dataset['Cabin'].fillna("U0")
    # Extract the leading run of letters from the cabin string.
    dataset['Deck'] = dataset['Cabin'].map(
        lambda x: re.compile("([a-zA-Z]+)").search(x).group())
    # Map the letter to a number; letters not in deck become 0.
    dataset['Deck'] = dataset['Deck'].map(deck)
    dataset['Deck'] = dataset['Deck'].fillna(0)
    dataset['Deck'] = dataset['Deck'].astype(int)

# 'Cabin' is no longer needed once 'Deck' has been derived.
train_df = train_df.drop(['Cabin'], axis=1)
test_df = test_df.drop(['Cabin'], axis=1)
print(train_df, test_df)
The training/test files were downloaded from here.
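A possible cause, and this is an assumption rather than something the traceback proves, is that the cell was run a second time in the notebook: the first run drops 'Cabin', so the next run fails on dataset['Cabin']. A sketch of a version that is safe to re-run, using a hypothetical add_deck helper and made-up rows:

```python
import re

import pandas as pd

deck = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "U": 8}
DECK_PATTERN = re.compile(r"([a-zA-Z]+)")  # compiled once instead of once per row


def add_deck(df):
    """Hypothetical helper: derive 'Deck' from 'Cabin', then drop 'Cabin'.

    Returns the frame unchanged when 'Cabin' is absent, so a second call
    does nothing instead of raising KeyError.
    """
    if "Cabin" not in df.columns:
        return df
    out = df.copy()
    cabin = out["Cabin"].fillna("U0")
    letters = cabin.map(lambda x: DECK_PATTERN.search(x).group())
    # Letters that are not in the deck mapping become 0.
    out["Deck"] = letters.map(deck).fillna(0).astype(int)
    return out.drop(columns=["Cabin"])


# Made-up rows standing in for the Kaggle training data.
train_df = pd.DataFrame({"PassengerId": [1, 2, 3], "Cabin": ["C85", None, "T"]})
train_df = add_deck(train_df)
train_df = add_deck(train_df)  # second call leaves the frame as it is
print(train_df)
```

Returning a new frame rather than modifying the input keeps the original data available if the cell has to be re-run from an earlier step.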