Pandas `pivot_table` working with `decimal.Decimal` type

Question

My DataFrame looks like this:

date        id      value           type
2021-01-02  123123   0.3           apple
2021-01-02  123123  2.05           banana
2021-01-02  456456  2.01819        apple
2021-01-02  456456  606800000      banana
2021-01-02  567567  2.2            apple
2021-01-02  891891  2475368        banana
........

The values in the value column are of type decimal.Decimal.

My expected result looks like this:

date        id       apple         banana
2021-01-02  123123   0.3           2.05
2021-01-02  456456   2.01819       606800000
2021-01-02  567567   2.2           NaN
2021-01-02  891891   NaN           2475368

I tried using pandas.pivot_table:

pivot_df = pd.pivot_table(df,
                          values='value',
                          index=['date', 'id'],
                          columns='type').reset_index().rename_axis(None, axis=1)

This gave me the following result (only the first two columns):

date        id
2021-01-02  123123 
2021-01-02  456456  
2021-01-02  567567  
2021-01-02  891891  
...

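Does anyone know what is going on here? Why do I get only two columns? Thanks.

Update: I see comments and answers saying the two-column DataFrame cannot be reproduced, which is strange. Is it because I am using an older version of pandas? I still get only two columns... I am using Python 3.8 + pandas==1.3.0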

Here is my result:

I managed to get the expected result by using pandas 1.3.3.
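For readers who cannot upgrade, one possible workaround (not part of the original post) is to cast the Decimal column to float before pivoting, so that the default `mean` aggregation of `pivot_table` sees an ordinary numeric column. The sketch below rebuilds the sample rows from the question; it assumes float precision is acceptable for the pivoted values, which may not hold if exact decimal arithmetic matters.

```python
from decimal import Decimal

import pandas as pd

# Sample rows copied from the question. Because `value` holds Decimal
# objects, pandas stores the column with dtype `object`.
df = pd.DataFrame({
    'date': ['2021-01-02'] * 6,
    'id': [123123, 123123, 456456, 456456, 567567, 891891],
    'value': [Decimal('0.3'), Decimal('2.05'), Decimal('2.01819'),
              Decimal('606800000'), Decimal('2.2'), Decimal('2475368')],
    'type': ['apple', 'banana', 'apple', 'banana', 'apple', 'banana'],
})

# Assumption: losing Decimal precision is acceptable here.
df_float = df.assign(value=df['value'].astype(float))

pivot_df = (
    pd.pivot_table(df_float,
                   values='value',
                   index=['date', 'id'],
                   columns='type')
    .reset_index()
    .rename_axis(None, axis=1)
)
print(pivot_df)
```

If the Decimal objects must be preserved, `DataFrame.pivot` (which does no aggregation) may be worth trying instead, provided each (date, id, type) combination occurs only once.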

Answer 1

Your code works for me; I cannot reproduce your problem.

My setup:

import pandas as pd
from pandas import Timestamp
from decimal import Decimal


data = {
    'date': [Timestamp('2021-01-02 00:00:00'),
             Timestamp('2021-01-02 00:00:00'),
             Timestamp('2021-01-02 00:00:00'),
             Timestamp('2021-01-02 00:00:00'),
             Timestamp('2021-01-02 00:00:00'),
             Timestamp('2021-01-02 00:00:00')],
    'id': [123123, 123123, 456456, 456456, 567567, 891891],
    'value': [Decimal('0.299999999999999988897769753748434595763683319091796875'),
              Decimal('2.04999999999999982236431605997495353221893310546875'),
              Decimal('2.018190000000000150492951433989219367504119873046875'),
              Decimal('606800000'),
              Decimal('2.20000000000000017763568394002504646778106689453125'),
              Decimal('2475368')],
    'type': ['apple', 'apple', 'apple', 'banana', 'apple', 'banana'],
}

df = pd.DataFrame(data)

Pivot:

pivot_df = pd.pivot_table(df,
                          values='value',
                          index=['date', 'id'],
                          columns='type').reset_index().rename_axis(None, axis=1)

Output:

>>> pivot_df
        date      id    apple       banana
0 2021-01-02  123123  1.17500          NaN
1 2021-01-02  456456  2.01819  606800000.0
2 2021-01-02  567567  2.20000          NaN
3 2021-01-02  891891      NaN    2475368.0
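Note that the 1.17500 shown for id 123123 is not a value from the input. In this setup both rows for that id are typed 'apple', and `pivot_table` aggregates duplicates with `aggfunc='mean'` by default, so 0.3 and 2.05 are averaged. A minimal sketch of that behaviour, using plain floats and a cut-down copy of the data (the variable names are illustrative only):

```python
import pandas as pd

# Two 'apple' rows share the same (date, id), as in the setup above.
df = pd.DataFrame({
    'date': ['2021-01-02'] * 3,
    'id': [123123, 123123, 456456],
    'value': [0.3, 2.05, 2.01819],
    'type': ['apple', 'apple', 'apple'],
})

# Default aggfunc is 'mean': duplicates are averaged (about 1.175 here).
mean_pivot = pd.pivot_table(df, values='value',
                            index=['date', 'id'], columns='type')

# aggfunc='first' keeps the first value seen for each cell instead.
first_pivot = pd.pivot_table(df, values='value',
                             index=['date', 'id'], columns='type',
                             aggfunc='first')

print(mean_pivot)
print(first_pivot)
```

Which aggregation is appropriate depends on whether duplicate (date, id, type) rows are expected in the real data.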

Answer 2

I also get 4 columns. The code I ran:

import pandas as pd
import sys
print(pd.__version__)
print(sys.version)
df = pd.read_csv('data.csv')
pivot_df = pd.pivot_table(df,
                          values='value',
                          index=['date', 'id'],
                          columns='type').reset_index().rename_axis(None, axis=1)
print(pivot_df.to_string())

Output:

1.0.3
3.7.7 (default, Apr 15 2020, 05:09:04) [MSC v.1916 64 bit (AMD64)]
         date      id    apple        banana
0  2021-01-02  123123  0.30000  2.050000e+00
1  2021-01-02  456456  2.01819  6.068000e+08
2  2021-01-02  567567  2.20000           NaN
3  2021-01-02  891891      NaN  2.475368e+06
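One caveat about this test: `pd.read_csv` parses the value column as float64 by default, so it does not exercise the `decimal.Decimal` case from the question. A sketch of how the column could be loaded as Decimal instead, using the `converters` parameter of `read_csv`; the in-memory CSV text is a hypothetical stand-in for data.csv, whose real contents are not shown in the thread:

```python
import io
from decimal import Decimal

import pandas as pd

# Hypothetical stand-in for data.csv, mirroring two rows from the question.
csv_text = """date,id,value,type
2021-01-02,123123,0.3,apple
2021-01-02,123123,2.05,banana
"""

# Default parsing: `value` becomes float64.
float_df = pd.read_csv(io.StringIO(csv_text))

# With a converter, each raw string is passed to Decimal, so the column
# holds Decimal objects and has dtype `object`.
decimal_df = pd.read_csv(io.StringIO(csv_text),
                         converters={'value': Decimal})

print(float_df['value'].dtype)
print(decimal_df['value'].dtype)
print(type(decimal_df['value'].iloc[0]))
```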


Tags: python, types, pivot-table, decimal, pandas