Is there a way to remove autotruncation from a pandas DataFrame?

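I am trying to create a multi-indexed DataFrame that contains every possible index combination, including combinations that currently have no values. I want those missing values to be set to 0. To do this, I used the following: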

index_levels = ['Channel', 'Duration', 'Designation', 'Manufacturing Class']

grouped_df = df.groupby(by=index_levels)[['Total Purchases', 'Sales', 'Cost']].agg('sum')

# Rebuild the index as the product of the levels found in the grouped result
grouped_df = grouped_df.reindex(pd.MultiIndex.from_product(grouped_df.index.levels), fill_value=0)

Expected result:

 ___________________________________________________________________________________________ 
|Chan. | Duration   | Designation|    Manufact. |Total Purchases|  Sales      |   Cost      |
|______|____________|____________|______________|_______________|_____________|_____________|
|      | Month      | Special    |    Brand     |     0         |    0.00     |   0.00      |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |     0         |    0.00     |   0.00      |
|Retail|            |____________|______________|_______________|_____________|_____________|
|      |            |Not Special |    Brand     |     756       | 15654.07    |   9498.23   |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |     7896      |  98745.23   |    78953.56 |
|      |____________|____________|______________|_______________|_____________|_____________|
|      | Season     | Special    |    Brand     |     0         |  0.00       |    0.00     |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |     0         |  0.00       |    0.00     |
|      |            |____________|______________|_______________|_____________|_____________|
|      |            |Not Special |    Brand     |     0         |  0.00       |    0.00     |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |     0         |  0.00       |    0.00     |
|______|____________|____________|______________|_______________|_____________|_____________|

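I get this result as long as every label of each index level appears at least once in the data. However, if a label has no rows at all (for example, no 'Special' rows), I get the following result instead: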

___________________________________________________________________________________________ 
|Chan. | Duration   | Designation|    Manufact. |Total Purchases|  Sales      |   Cost      |
|______|____________|____________|______________|_______________|_____________|_____________|
|      | Month      | Not Special|    Brand     |     756       |  15654.07   |   9498.23   |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |    7896       | 98745.23    |   78953.56  |
|Retail|____________|____________|______________|_______________|_____________|_____________|
|      | Season     |Not Special |    Brand     |       0       |    0.00     |     0.00    |
|      |            |            |______________|_______________|_____________|_____________|
|      |            |            |    Generic   |       0       |    0.00     |     0.00    |
|______|____________|____________|______________|_______________|_____________|_____________|

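For some reason, the index keeps getting automatically truncated. How can I fix the index so that the desired result is always produced and I can reliably use these index entries in calculations, even when they hold no values?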

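What you can do is build the desired fixed index up front. For example, start from a dictionary whose keys are the column labels used as group index and whose values are all the possible labels for that column.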

index_levels = {
    'Channel': ['Retail'],
    'Duration': ['Month', 'Season'],
    'Designation': ['Special', 'Not Special'],
    'Manufacturing Class': ['Brand', 'Generic']
}

# Every combination of the listed labels, whether or not it occurs in the data
fixed_index = pd.MultiIndex.from_product(list(index_levels.values()), names=list(index_levels.keys()))
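As a quick sanity check, the sketch below rebuilds the same index and inspects it. The level values are the ones shown in the question's table, which is an assumption about the real data; the point is only that `from_product` yields every combination, independent of what the DataFrame happens to contain.

```python
import pandas as pd

# Level values taken from the question's table (assumed); adjust to the real data.
index_levels = {
    'Channel': ['Retail'],
    'Duration': ['Month', 'Season'],
    'Designation': ['Special', 'Not Special'],
    'Manufacturing Class': ['Brand', 'Generic'],
}

fixed_index = pd.MultiIndex.from_product(
    list(index_levels.values()), names=list(index_levels.keys())
)

# One entry per combination: 1 * 2 * 2 * 2 = 8
print(len(fixed_index))

# A combination is part of the index even if no row in the data has it
print(('Retail', 'Season', 'Special', 'Brand') in fixed_index)
```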

Then you can do:

# groupby expects a list of column labels, so convert the dict keys to a list
grouped_df = df.groupby(by=list(index_levels))[['Total Purchases', 'Sales', 'Cost']].agg('sum')

grouped_df = grouped_df.reindex(fixed_index, fill_value=0)
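Here is a minimal end-to-end sketch of the approach. The rows in `df` are invented for illustration and only loosely follow the question's table: there are no 'Special' and no 'Season' rows, yet the reindexed result should still hold all eight combinations, with zeros wherever no data was found.

```python
import pandas as pd

# Toy data, invented for illustration: no 'Special' and no 'Season' rows.
df = pd.DataFrame({
    'Channel': ['Retail', 'Retail', 'Retail'],
    'Duration': ['Month', 'Month', 'Month'],
    'Designation': ['Not Special', 'Not Special', 'Not Special'],
    'Manufacturing Class': ['Brand', 'Generic', 'Generic'],
    'Total Purchases': [756, 7000, 896],
    'Sales': [15654.07, 90000.00, 8745.23],
    'Cost': [9498.23, 70000.00, 8953.56],
})

# All labels that may ever occur, listed by hand (assumed from the question).
index_levels = {
    'Channel': ['Retail'],
    'Duration': ['Month', 'Season'],
    'Designation': ['Special', 'Not Special'],
    'Manufacturing Class': ['Brand', 'Generic'],
}
keys = list(index_levels)
fixed_index = pd.MultiIndex.from_product(list(index_levels.values()), names=keys)

# The groupby result only contains the combinations observed in df
grouped_df = df.groupby(by=keys)[['Total Purchases', 'Sales', 'Cost']].agg('sum')
observed_rows = len(grouped_df)

# Reindexing against the fixed index adds the unobserved combinations as zeros
grouped_df = grouped_df.reindex(fixed_index, fill_value=0)

print(observed_rows, len(grouped_df))
print(grouped_df)
```

Because the index no longer depends on which labels happen to be present, later calculations can look up any combination without first checking whether it exists.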