Problems while trying to read a csv with pandas?
I have a csv file that looks like this:
Id, text, label
10101, string, label
I want to load it into a pandas DataFrame, so I did this:
df = pd.read_csv('/path/.csv')
X, y = df['text'], df['label']
I got this traceback:
Traceback (most recent call last):
File "/Users/user/test.py", line 27, in <module>
X, y, = df['text'], df['label']
File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 1780, in __getitem__
return self._getitem_column(key)
File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 1787, in _getitem_column
return self._get_item_cache(key)
File "/usr/local/lib/python2.7/site-packages/pandas/core/generic.py", line 1058, in _get_item_cache
values = self._data.get(item)
File "/usr/local/lib/python2.7/site-packages/pandas/core/internals.py", line 2889, in get
loc = self.items.get_loc(item)
File "/usr/local/lib/python2.7/site-packages/pandas/core/index.py", line 1400, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3807)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3687)
File "pandas/hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12310)
File "pandas/hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12261)
KeyError: 'text'
Can anyone help me understand what is happening and how to read this file correctly with pandas? Thanks in advance.
The header in the CSV file is:
Id, text, label
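Note that the header names of the second and third columns have a leading space, so the actual column names are ' text' and ' label', which is why df['text'] raises a KeyError. You can access those columns by including the space: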
x, y = df[' text'], df[' label']
Or specify the skipinitialspace parameter:
df = pd.read_csv('/path/x.csv', skipinitialspace=True)
x, y = df['text'], df['label']
The latter also strips the initial spaces from the column data.
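To see the difference without touching a real file, here is a minimal sketch that parses the same two lines from an in-memory string. The csv_text variable and the use of io.StringIO are assumptions made for illustration; they are not part of the original question.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file from the question.
csv_text = "Id, text, label\n10101, string, label\n"

# Default parsing keeps the space that follows each comma,
# so the column names (and the string values) start with a space.
df_raw = pd.read_csv(io.StringIO(csv_text))
raw_columns = list(df_raw.columns)      # ['Id', ' text', ' label']

# skipinitialspace=True skips spaces right after the delimiter,
# in the header row and in the data rows alike.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
clean_columns = list(df.columns)        # ['Id', 'text', 'label']

X, y = df['text'], df['label']          # no KeyError this time
```

Printing df.columns right after reading a file is a quick way to spot this kind of stray whitespace in header names.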