How to create a new column based on values inside a MultiIndex in pandas

Question

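I have a MultiIndex DataFrame with Customer_Key and month as the index levels, followed by some fitted values.

The problem is that I want to create a new column from the values in the month index level. I have 45 distinct months in total, and I need to assign each month a number from 1 to 45 in chronological order. Is there a way to do this?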

                         fitted_values  period
Customer_Key month
12870        2018-01-01      -3.073268       1
             2018-02-01      -3.002010       2
             2018-03-01      -2.888226       3
             2018-05-01      -2.857996       5
2858439      2018-03-01      -2.857996       3
             2021-09-01      -2.857996      45
.
.
.

Answer 1

Try this:

df['new_column'] = df.groupby(level=0).cumcount()
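
To see what this line produces, here is a minimal sketch that rebuilds the sample rows from the question. The construction of `df` is an assumption for illustration; only the `groupby(level=0).cumcount()` call comes from the answer.

```python
import pandas as pd

# Sample rows copied from the question; building the frame this way is an assumption.
rows = [
    (12870, "2018-01-01", -3.073268),
    (12870, "2018-02-01", -3.002010),
    (12870, "2018-03-01", -2.888226),
    (12870, "2018-05-01", -2.857996),
    (2858439, "2018-03-01", -2.857996),
    (2858439, "2021-09-01", -2.857996),
]
df = pd.DataFrame(rows, columns=["Customer_Key", "month", "fitted_values"])
df["month"] = pd.to_datetime(df["month"])
df = df.set_index(["Customer_Key", "month"])

# Number the rows within each Customer_Key group, starting at 0.
df["new_column"] = df.groupby(level=0).cumcount()
print(df["new_column"].tolist())  # [0, 1, 2, 3, 0, 1]
```

Note that `cumcount` numbers rows per customer starting at 0, so it ignores the calendar: the 2018-05-01 row gets 3 rather than 5, and the second customer restarts at 0.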

Answer 2

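It looks like you want the number of months elapsed since the earliest date. You can convert the index level to monthly periods and subtract the minimum from it: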

month = df.index.get_level_values(1).to_period('M').astype(int)
df['period'] = month - month.min() + 1

df
                         fitted_values  period
Customer_Key month                            
12870        2018-01-01      -3.073268       1
             2018-02-01      -3.002010       2
             2018-03-01      -2.888226       3
             2018-05-01      -2.857996       5
2858439      2018-03-01      -2.857996       3
             2021-09-01      -2.857996      45

This assumes that your month index level has a datetime dtype.
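
For a self-contained version of the same idea, the sketch below rebuilds the sample data and computes the month number as `year * 12 + month` instead of casting periods to integers. That arithmetic is a swapped-in variant, not the code from the answer, and the way `df` is built (including the `pd.to_datetime` conversion for a month level stored as strings) is an assumption.

```python
import pandas as pd

# Sample rows copied from the question; the months start out as plain strings.
rows = [
    (12870, "2018-01-01", -3.073268),
    (12870, "2018-02-01", -3.002010),
    (12870, "2018-03-01", -2.888226),
    (12870, "2018-05-01", -2.857996),
    (2858439, "2018-03-01", -2.857996),
    (2858439, "2021-09-01", -2.857996),
]
df = pd.DataFrame(rows, columns=["Customer_Key", "month", "fitted_values"])
df["month"] = pd.to_datetime(df["month"])  # make sure the level is datetime
df = df.set_index(["Customer_Key", "month"])

# Turn each date into a running month count, then shift so the earliest month is 1.
month = df.index.get_level_values("month")
month_number = (month.year * 12 + month.month).to_numpy()
df["period"] = month_number - month_number.min() + 1
print(df["period"].tolist())  # [1, 2, 3, 5, 3, 45]
```

Because the number is based on the calendar, a month missing from the data (2018-04 here) leaves a gap in `period`, which matches the output shown in the question.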


python

multi-index

pandas