如何计算基于连续天数和其他两列的值的总和
How to calculate the summation for values based on consecutive days and two other columns
如何只对连续几天的相同名称和相同供应商进行汇总?
例如,对于 A 和 Supplier Wal,我需要对 2021-05-31 和 2021-06-01 进行求和,然后再对 2021-06-08 和 2021-06-09 进行求和。我需要添加一个新列进行求和。请看下面的例子:
这是 Pandas DataFrame 代码 table:
df = pd.DataFrame({'Name': ['A', 'A', 'A','A','B','B','C','C','C','C','C','C','C','C','C'],
'Supplier': ['Wal', 'Wal', 'Wal', 'Wal', 'Co', 'Co', 'Mc', 'Mc', 'St', 'St', 'St', 'St', 'St', 'To', 'To'],
'Date': ['2021-05-31', '2021-06-01', '2021-06-08', '2021-06-09', '2021-05-17', '2021-05-18'
, '2021-04-07', '2021-04-08', '2021-05-11', '2021-05-12', '2021-05-13', '2021-05-18'
, '2021-05-19', '2021-03-30', '2021-03-31'],
'Amount': [27, 400, 410, 250, 100, 50, 22, 78, 60, 180, 100, 240, 140, 30, 110],
'Summation': [427,427,660,660,150,150,100,100,340,340,340,380,380,140,140 ]})
像这样?
import pandas as pd
df = pd.DataFrame({'Name': ['A', 'A', 'A','A','B','B','C','C','C','C','C','C','C','C','C'],
'Supplier': ['Wal', 'Wal', 'Wal', 'Wal', 'Co', 'Co', 'Mc', 'Mc', 'St', 'St', 'St', 'St', 'St', 'To', 'To'],
'Date': ['2021-05-31', '2021-06-01', '2021-06-08', '2021-06-09', '2021-05-17', '2021-05-18'
, '2021-04-07', '2021-04-08', '2021-05-11', '2021-05-12', '2021-05-13', '2021-05-18'
, '2021-05-19', '2021-03-30', '2021-03-31'],
'Amount': [27, 400, 410, 250, 100, 50, 22, 78, 60, 180, 100, 240, 140, 30, 110]})
df['Date'] = pd.to_datetime(df['Date'])
filt = df.loc[((df['Date'] - df['Date'].shift(-1)).abs() == pd.Timedelta('1d')) | (df['Date'].diff() == pd.Timedelta('1d'))]
breaks = filt['Date'].diff() != pd.Timedelta('1d')
df['Summation'] = df.groupby(['Name','Supplier',breaks.cumsum()])['Amount'].transform('sum')
print(df)
输出:
Name Supplier Date Amount Summation
0 A Wal 2021-05-31 27 427
1 A Wal 2021-06-01 400 427
2 A Wal 2021-06-08 410 660
3 A Wal 2021-06-09 250 660
4 B Co 2021-05-17 100 150
5 B Co 2021-05-18 50 150
6 C Mc 2021-04-07 22 100
7 C Mc 2021-04-08 78 100
8 C St 2021-05-11 60 340
9 C St 2021-05-12 180 340
10 C St 2021-05-13 100 340
11 C St 2021-05-18 240 380
12 C St 2021-05-19 140 380
13 C To 2021-03-30 30 140
14 C To 2021-03-31 110 140
如何只对连续几天的相同名称和相同供应商进行汇总? 例如,对于 A 和 Supplier Wal,我需要对 2021-05-31 和 2021-06-01 进行求和,然后再对 2021-06-08 和 2021-06-09 进行求和。我需要添加一个新列进行求和。请看下面的例子:
这是 Pandas DataFrame 代码 table:
df = pd.DataFrame({'Name': ['A', 'A', 'A','A','B','B','C','C','C','C','C','C','C','C','C'],
'Supplier': ['Wal', 'Wal', 'Wal', 'Wal', 'Co', 'Co', 'Mc', 'Mc', 'St', 'St', 'St', 'St', 'St', 'To', 'To'],
'Date': ['2021-05-31', '2021-06-01', '2021-06-08', '2021-06-09', '2021-05-17', '2021-05-18'
, '2021-04-07', '2021-04-08', '2021-05-11', '2021-05-12', '2021-05-13', '2021-05-18'
, '2021-05-19', '2021-03-30', '2021-03-31'],
'Amount': [27, 400, 410, 250, 100, 50, 22, 78, 60, 180, 100, 240, 140, 30, 110],
'Summation': [427,427,660,660,150,150,100,100,340,340,340,380,380,140,140 ]})
像这样?
import pandas as pd
df = pd.DataFrame({'Name': ['A', 'A', 'A','A','B','B','C','C','C','C','C','C','C','C','C'],
'Supplier': ['Wal', 'Wal', 'Wal', 'Wal', 'Co', 'Co', 'Mc', 'Mc', 'St', 'St', 'St', 'St', 'St', 'To', 'To'],
'Date': ['2021-05-31', '2021-06-01', '2021-06-08', '2021-06-09', '2021-05-17', '2021-05-18'
, '2021-04-07', '2021-04-08', '2021-05-11', '2021-05-12', '2021-05-13', '2021-05-18'
, '2021-05-19', '2021-03-30', '2021-03-31'],
'Amount': [27, 400, 410, 250, 100, 50, 22, 78, 60, 180, 100, 240, 140, 30, 110]})
df['Date'] = pd.to_datetime(df['Date'])
filt = df.loc[((df['Date'] - df['Date'].shift(-1)).abs() == pd.Timedelta('1d')) | (df['Date'].diff() == pd.Timedelta('1d'))]
breaks = filt['Date'].diff() != pd.Timedelta('1d')
df['Summation'] = df.groupby(['Name','Supplier',breaks.cumsum()])['Amount'].transform('sum')
print(df)
输出:
Name Supplier Date Amount Summation
0 A Wal 2021-05-31 27 427
1 A Wal 2021-06-01 400 427
2 A Wal 2021-06-08 410 660
3 A Wal 2021-06-09 250 660
4 B Co 2021-05-17 100 150
5 B Co 2021-05-18 50 150
6 C Mc 2021-04-07 22 100
7 C Mc 2021-04-08 78 100
8 C St 2021-05-11 60 340
9 C St 2021-05-12 180 340
10 C St 2021-05-13 100 340
11 C St 2021-05-18 240 380
12 C St 2021-05-19 140 380
13 C To 2021-03-30 30 140
14 C To 2021-03-31 110 140