Select Columns in pandas DF
Below is my data. I am trying to access a column. It worked fine until yesterday, but now I'm not sure whether I'm doing something wrong:
DISTRICT;CPE;EQUIPMENT,NR_EQUIPM
0 47;CASTELO BRANCO;17520091VM;101
1 48;CASTELO BRANCO;17520103VV;160
2 49;CASTELO BRANCO;17520103VV;160
When I try this, it gives me an error:
df = pd.read_csv(archiv, sep=",")
df['EQUIPMENT']
Error:
KeyError: 'EQUIPMENT'
I also tried this, but it doesn't work either:
df.EQUIPMENT
Error:
AttributeError: 'DataFrame' object has no attribute 'EQUIPMENT'
By the way, I'm using:
Python 2.7.12 |Anaconda 4.1.1 (32-bit)| (default, Jun 29 2016, 11:42:13) [MSC v.1500 32 bit (Intel)]
Any ideas?
You need to change sep to ; because the separator in the csv has changed:
df = pd.read_csv(archiv, sep=";")
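A quick way to see what the parser actually produced is to print the column names. This is only a minimal sketch: `data` is the sample from the question pasted in as a string, and the variable names are my own, not something from the original code.

```python
import io
import pandas as pd

# Sample from the question: the header has "," before NR_EQUIPM,
# while every other separator in the file is ";".
data = u"""DISTRICT;CPE;EQUIPMENT,NR_EQUIPM
47;CASTELO BRANCO;17520091VM;101
48;CASTELO BRANCO;17520103VV;160
49;CASTELO BRANCO;17520103VV;160"""

# sep="," splits only on the single comma in the header line
cols_comma = list(pd.read_csv(io.StringIO(data), sep=",").columns)

# sep=";" splits on semicolons, so "EQUIPMENT,NR_EQUIPM" stays glued together
cols_semicolon = list(pd.read_csv(io.StringIO(data), sep=";").columns)

print(cols_comma)
print(cols_semicolon)
```

In neither case is there a column named exactly EQUIPMENT, which is why both df['EQUIPMENT'] and df.EQUIPMENT fail.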
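If you check the last separator in the header row, it is a comma, so you can split on both separators with the regex [;,]. In that case you need to add the parameter engine='python' to avoid this warning: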
ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
Sample:
import pandas as pd
import io
temp=u"""DISTRICT;CPE;EQUIPMENT,NR_EQUIPM
47;CASTELO BRANCO;17520091VM;101
48;CASTELO BRANCO;17520103VV;160
49;CASTELO BRANCO;17520103VV;160"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep="[;,]", engine='python')
print(df)
   DISTRICT             CPE   EQUIPMENT  NR_EQUIPM
0        47  CASTELO BRANCO  17520091VM        101
1        48  CASTELO BRANCO  17520103VV        160
2        49  CASTELO BRANCO  17520103VV        160
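If only the header line is inconsistent and the data rows always use ;, another option is to skip the header and supply the column names yourself, which avoids the regex separator altogether. This is a sketch under that assumption: the `names` list is typed by hand to mirror the header in the question, and `data` is again just the sample as a string.

```python
import io
import pandas as pd

data = u"""DISTRICT;CPE;EQUIPMENT,NR_EQUIPM
47;CASTELO BRANCO;17520091VM;101
48;CASTELO BRANCO;17520103VV;160
49;CASTELO BRANCO;17520103VV;160"""

# Hand-written column names (assumption: they match the intended header)
names = ["DISTRICT", "CPE", "EQUIPMENT", "NR_EQUIPM"]

# skiprows=1 drops the malformed header line; header=None plus names=
# tells pandas to use our names instead of reading them from the file.
df = pd.read_csv(io.StringIO(data), sep=";", skiprows=1, header=None, names=names)

# Both access styles from the question now find the column
print(df["EQUIPMENT"])
print(df.EQUIPMENT)
```

The trade-off is that the names are hardcoded, so this only makes sense if the file layout is stable. If the header might change, the sep="[;,]" version above is the safer choice.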