Converting a column of object values to float or int: ValueError: invalid literal for int() with base 10: '1,026,765'
dfproduction = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Oil%20Spill%20Data%20-%20Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)/production%20data/Crude%20Oil%20Production%20and%20Well%20Counts%20(since%201935).csv', encoding='utf-8')
I want to convert this data to a numeric type such as float or int (specifically the column 'Crude Oil Production (Mbbl)').
The current dtype is object:
print(dfproduction.dtypes)
MasterYear                              int64
Crude Oil Production (Mbbl)             object
Daily Avg. Production (Mbbl/day)        object
Number of Producing Wells               object
Percent Change in Production            object
Avg. Per Well Production (bbl/day)      float64
Crude Oil Reserves as of Jan. 1 (Mbbl)  object
info                                    object
dtype: object
However, every attempt to do so raises some form of error.
dfproduction['Crude Oil Production (Mbbl)'].astype('int')
ValueError: invalid literal for int() with base 10: '1,026,765'
dfproduction['Crude Oil Production (Mbbl)'].astype('float')
ValueError: could not convert string to float: '375,617'
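Update:
The problem was the commas in the numbers. I removed the commas and re-uploaded the data, and only now am I getting the following error: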
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 83: invalid start byte
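Byte 0xa0 is a non-breaking space in Latin-1 and Windows-1252, and it can never be the first byte of a UTF-8 character, so this error suggests the re-uploaded file was saved in one of those encodings rather than UTF-8. A minimal sketch of the failure, using a made-up byte string (the sample bytes and variable names are assumptions, not the actual file contents):

```python
# Hypothetical sample: a header cell containing a Latin-1 non-breaking space (0xa0).
raw = b"Crude Oil Production\xa0(Mbbl)"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # In UTF-8, 0xa0 is only valid as a continuation byte, never as a start byte.
    utf8_ok = False

# Latin-1 maps every byte to a code point, so this decode always succeeds.
text = raw.decode("latin-1")

# Swap the non-breaking space for an ordinary space.
cleaned = text.replace("\xa0", " ")
print(utf8_ok, cleaned)
```

If that is the cause here, passing encoding='latin-1' (or 'cp1252') to pd.read_csv instead of 'utf-8' should let the file load. Any non-breaking spaces may then still need to be stripped from the headers or values.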
Use str.replace() to remove the commas, then convert:
dfproduction['Crude Oil Production (Mbbl)'] = dfproduction['Crude Oil Production (Mbbl)'].str.replace(',', '', regex=False).astype('int')
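Two variations on the same idea may be worth trying: letting pd.read_csv strip the thousands separator at load time via its thousands parameter, or converting with pd.to_numeric so that stray non-numeric cells become NaN instead of raising. The sketch below uses a small in-memory CSV as a stand-in for the real file. The 'Year' column and its values are invented for illustration; only the two production figures come from the error messages above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV; numbers are quoted because they contain commas.
csv_text = 'Year,Crude Oil Production (Mbbl)\n1935,"375,617"\n1972,"1,026,765"\n'
col = "Crude Oil Production (Mbbl)"

# Option 1: have read_csv drop the thousands separator while parsing.
df_parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")

# Option 2: clean an already-loaded object column, then convert it.
df_raw = pd.read_csv(io.StringIO(csv_text))
without_commas = df_raw[col].str.replace(",", "", regex=False)
# errors="coerce" turns any value that is still non-numeric into NaN instead of raising.
df_raw[col] = pd.to_numeric(without_commas, errors="coerce")

print(df_parsed.dtypes)
print(df_raw[col].tolist())
```

Option 1 is convenient when every numeric column uses the same separator. Option 2 is easier to apply to one column at a time, and any NaN it produces points to cells that need a closer look.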