pandas: within the same table (same DataFrame), how to group different rows under a new name and sum their values
The DataFrame below is the output of the following code. I want to group its rows further.
import pandas as pd
# read_excel has no sep argument (that belongs to read_csv), so it is dropped here
train=pd.read_excel("monthly_report.xlsx", sheet_name="xy12")
train['Date/Time Opened']=train['Date/Time Opened'].dt.month_name()
train=train.groupby(['col1', 'Date/Time Opened'])['Date/Time Opened'].count()
col1  Date/Time Opened  number
abc   April             40
      August            30
      December          25
      February          30
      January           45
xyz   April             1
      August            1
      November          3
      October           2
      September         3
pqr   March             2
      May               4
      November          5
      October           2
Now I want the output above in the format shown below. After that, I want to build a chart based on it.
abcxyz(new name)  April      41
                  August     31
                  December   25
                  February   30
                  January    45
                  September  3
                  November   3
                  October    2
pqr(new name)     March      2
                  May        4
                  November   5
                  October    2
Can someone tell me how to combine rows into a new row with a different name, with the values of the combined rows summed?
You can use Series.mask with Series.isin to assign the same category to both values:
train['col1'] = train['col1'].mask(train['col1'].isin(['abc','xyz']), 'abcxyz')
Or use Series.replace with a dictionary:
train['col1'] = train['col1'].replace({'abc':'abcxyz','xyz':'abcxyz'})
...and then use your solution:
train['Date/Time Opened']=train['Date/Time Opened'].dt.month_name()
train=train.groupby(['col1', 'Date/Time Opened'])['Date/Time Opened'].count()
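To try the whole flow without the Excel file, here is a minimal sketch on made-up data. The rows and dates are hypothetical stand-ins for monthly_report.xlsx, and the `counts` variable and the `number` series name are only there for illustration. It relabels abc and xyz as abcxyz with Series.replace first, then applies the same month_name and groupby steps as above.

```python
import pandas as pd

# Hypothetical toy data standing in for monthly_report.xlsx (not the real file)
train = pd.DataFrame({
    "col1": ["abc", "abc", "xyz", "xyz", "pqr", "pqr", "pqr"],
    "Date/Time Opened": pd.to_datetime([
        "2020-04-01", "2020-04-15", "2020-04-20", "2020-11-03",
        "2020-03-05", "2020-05-09", "2020-05-21",
    ]),
})

# Relabel both categories BEFORE grouping so their rows fall into one group
train["col1"] = train["col1"].replace({"abc": "abcxyz", "xyz": "abcxyz"})

# Same steps as in the question: month names, then a count per (col1, month)
train["Date/Time Opened"] = train["Date/Time Opened"].dt.month_name()
counts = (
    train.groupby(["col1", "Date/Time Opened"])["Date/Time Opened"]
    .count()
    .rename("number")  # illustrative name for the count column
)
print(counts)
```

Because the relabeling happens on the raw rows, the counts for abc and xyz in the same month are added together automatically by the groupby; no separate summing step is needed.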