How to downcast numeric columns in Pandas?
How can I reduce a DataFrame's memory footprint by finding the optimal (smallest) dtypes for its numeric columns? For example:
   A        B    C         D
0  1  1000000  1.1  1.111111
1  2 -1000000  2.1  2.111111
>>> df.dtypes
A      int64
B      int64
C    float64
D    float64
dtype: object
Expected result:
>>> df.dtypes
A       int8
B      int32
C    float32
D    float32
dtype: object
As @anurag mentioned (thank you), you can use the downcast parameter of to_numeric, selecting the integer and float columns with DataFrame.select_dtypes. This works from pandas 0.19+:
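import pandas as pd

# Select float and integer columns separately
fcols = df.select_dtypes('float').columns
icols = df.select_dtypes('integer').columns
# Downcast each group to the smallest dtype that holds its values
df[fcols] = df[fcols].apply(pd.to_numeric, downcast='float')
df[icols] = df[icols].apply(pd.to_numeric, downcast='integer')
print(df.dtypes)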
A       int8
B      int32
C    float32
D    float32
dtype: object
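
To check that the downcast actually saves memory, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and wraps the same to_numeric calls in a helper. The name downcast_numeric is hypothetical (it is not part of pandas), and the helper loops over columns one at a time rather than using apply, which is only a stylistic variation on the answer above.

```python
import pandas as pd

# Sample frame mirroring the one in the question.
df = pd.DataFrame({
    "A": [1, 2],
    "B": [1000000, -1000000],
    "C": [1.1, 2.1],
    "D": [1.111111, 2.111111],
})


def downcast_numeric(frame):
    """Hypothetical helper: return a copy with numeric columns downcast."""
    out = frame.copy()
    # Float columns: to_numeric goes no lower than float32.
    for col in out.select_dtypes("float").columns:
        out[col] = pd.to_numeric(out[col], downcast="float")
    # Integer columns: smallest signed integer dtype that fits the values.
    for col in out.select_dtypes("integer").columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    return out


# Compare total memory (in bytes) before and after downcasting.
before = df.memory_usage(deep=True).sum()
small = downcast_numeric(df)
after = small.memory_usage(deep=True).sum()

print(small.dtypes)
print(f"bytes before: {before}, after: {after}")
```

Keep in mind that float32 carries only about 7 significant decimal digits, so it is worth confirming that the reduced precision is acceptable for your data before downcasting float columns.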