Remove comma from objects in a pandas dataframe column

Question

I have imported a csv file using pandas.

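My dataframe has several columns titled "Farm", "Total Apples" and "Good Apples".

The numeric data imported for "Total Apples" and "Good Apples" contains commas as thousands separators, e.g. 1,200. I want to remove the commas so that the data looks like 1200.

The dtype of the "Total Apples" and "Good Apples" columns is shown as object.

I tried using df.str.replace and df.strip, but with no success.

I also tried converting the columns from object to string and from object to integer, but that did not work either.

Any help would be greatly appreciated.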

****EDIT****

Excerpt of the data from the csv file imported with pd.read_csv:

Farm_Name   Total Apples    Good Apples
EM  18,327  14,176
EE  18,785  14,146
IW  635 486
L   33,929  24,586
NE  12,497  9,609
NW  30,756  23,765
SC  8,515   6,438
SE  22,896  17,914
SW  11,972  9,114
WM  27,251  20,931
Y   21,495  16,662

Answer 1

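Try this:

import locale
# '' selects the user's default locale, which must use ',' as the thousands separator
locale.setlocale(locale.LC_NUMERIC, '')
df = df[['Farm_Name']].join(df[['Total Apples', 'Good Apples']].applymap(locale.atof))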
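
Because locale.atof depends on the machine's locale settings, a locale-independent variant may be easier to reproduce. The following is only a sketch, assuming the frame is already loaded and the two columns hold strings such as "18,327"; the small sample frame is built by hand from a few rows of the question's excerpt. Note that .str is a Series accessor, so it is applied column by column rather than on the whole DataFrame.

```python
import pandas as pd

# Hand-built sample mimicking the question's data (numbers stored as strings)
df = pd.DataFrame({
    "Farm_Name": ["EM", "IW", "SC"],
    "Total Apples": ["18,327", "635", "8,515"],
    "Good Apples": ["14,176", "486", "6,438"],
})

# .str exists on a Series (one column), not on the DataFrame itself
for col in ["Total Apples", "Good Apples"]:
    # Drop the literal commas, then convert the cleaned strings to integers
    df[col] = df[col].str.replace(",", "", regex=False).astype(int)

print(df.dtypes)
```

If some cells could be empty or non-numeric, pd.to_numeric(..., errors="coerce") is a possible substitute for astype(int).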

Answer 2

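I think you can add the parameter thousands to read_csv; the values in the Total Apples and Good Apples columns are then converted to integers:

Your separator may be different, so don't forget to change it. If the separator is whitespace, change it to sep='\s+'.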

import pandas as pd
import io

temp=u"""Farm_Name;Total Apples;Good Apples
EM;18,327;14,176
EE;18,785;14,146
IW;635;486
L;33,929;24,586
NE;12,497;9,609
NW;30,756;23,765
SC;8,515;6,438
SE;22,896;17,914
SW;11,972;9,114
WM;27,251;20,931
Y;21,495;16,662"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", thousands=',')
print(df)
   Farm_Name  Total Apples  Good Apples
0         EM         18327        14176
1         EE         18785        14146
2         IW           635          486
3          L         33929        24586
4         NE         12497         9609
5         NW         30756        23765
6         SC          8515         6438
7         SE         22896        17914
8         SW         11972         9114
9         WM         27251        20931
10         Y         21495        16662

print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 3 columns):
Farm_Name       11 non-null object
Total Apples    11 non-null int64
Good Apples     11 non-null int64
dtypes: int64(2), object(1)
memory usage: 336.0+ bytes
None
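
The excerpt in the question looks as if it could be tab-delimited, although that is only a guess. Under that assumption, the same thousands argument can be sketched with sep="\t"; the raw string below is a made-up stand-in for the real file, using a few rows from the question.

```python
import io
import pandas as pd

# Tab-delimited stand-in for the file; the real delimiter is an assumption
raw = (
    "Farm_Name\tTotal Apples\tGood Apples\n"
    "EM\t18,327\t14,176\n"
    "IW\t635\t486\n"
    "SC\t8,515\t6,438\n"
)

# thousands=',' tells the parser to treat commas inside numbers as separators
df_tab = pd.read_csv(io.StringIO(raw), sep="\t", thousands=",")
print(df_tab.dtypes)
```

A tab separator keeps column names that contain spaces, such as "Total Apples", in one piece, which a whitespace pattern like '\s+' would not.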
