How to calculate ordinal number on columns?
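I have a dataset with the columns user_id and type:

user_id | type | ordinal_number |
---|---|---|
1 | request | 1 |
1 | request | 1 |
1 | request | 1 |
1 | request | 1 |
1 | payment | 1 |
2 | request | 1 |
2 | request | 1 |
2 | payment | 1 |
2 | request | 2 |
2 | payment | 2 |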
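I want to fill in the values of the ordinal_number column in this table.
If type == payment, assign an ordinal number to that row and fill the same number into all preceding rows (type == request) of the same user_id.
Some users may have only requests, and a user may also have several payments in a row.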
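To make the question reproducible, the sample rows above can be built as a DataFrame. This is only a sketch of the input: the ordinal_number values are the desired result, so they are kept in a separate list (`expected`, a name introduced here for illustration) rather than stored in `df`.

```python
import pandas as pd

# Rows copied from the question's table (user_id and type only).
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "type": ["request", "request", "request", "request", "payment",
             "request", "request", "payment", "request", "payment"],
})

# Desired ordinal_number values from the question, kept apart from the input.
expected = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]

# Show input and desired output side by side.
print(df.assign(ordinal_number=expected))
```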
IIUC, you want to set a counter on the "payment" rows and backfill it per group:

    m = df['type'].eq('payment')
    df['ordinal_number'] = (m.cumsum().where(m)
                            .groupby(df['user_id'])
                            .bfill().astype(int)
                           )
Or, using the value of "user_id" as the starting value:

    df['ordinal_number'] = (df['user_id'].where(df['type'].eq('payment'))
                            .groupby(df['user_id'])
                            .bfill().astype(int)
                           )

Output:
       user_id     type  ordinal_number
    0        1  request               1
    1        1  request               1
    2        1  request               1
    3        1  request               1
    4        1  payment               1
    5        2  request               2
    6        2  request               2
    7        2  payment               2
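Note that `m.cumsum()` counts payments across the whole frame, so on the question's ten-row table user 2 would receive 2 and 3 rather than 1 and 2, and `astype(int)` would raise for a user whose trailing requests are never followed by a payment. A possible per-user variant is sketched below. It assumes the counter should restart for each user, and it adds a hypothetical user 3 with a single request to illustrate the "only requests" case; neither assumption comes from the answer itself.

```python
import pandas as pd

df = pd.DataFrame({
    # user 3 is a hypothetical extra row: a user with requests only
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3],
    "type": ["request", "request", "request", "request", "payment",
             "request", "request", "payment", "request", "payment",
             "request"],
})

m = df["type"].eq("payment")

# Count payments within each user instead of across the whole frame.
counter = m.astype(int).groupby(df["user_id"]).cumsum()

df["ordinal_number"] = (
    counter.where(m)             # keep the counter only on payment rows
    .groupby(df["user_id"])
    .bfill()                     # copy it back onto the preceding requests
    .astype("Int64")             # nullable integer: unmatched requests stay <NA>
)

print(df)
```

The nullable `Int64` dtype is used so that requests with no later payment stay missing instead of forcing a fill value.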
You can identify the "payment" rows, group by "user_id", and within each group reverse the Series, take its cumsum, and then reverse it back.
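    def assign_num(x):
        # x is the boolean "is payment" Series for one user
        s = x[::-1].cumsum()
        # must subtract from max value to get an ascending Series
        return s.iat[-1] + 1 - s[::-1]

    # group_keys=False keeps the original row index so the result aligns with df
    df['ordinal_number'] = df['type'].eq('payment').groupby(df['user_id'], group_keys=False).apply(assign_num)

Output: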
       user_id     type  ordinal_number
    0        1  request               1
    1        1  request               1
    2        1  request               1
    3        1  request               1
    4        1  payment               1
    5        2  request               1
    6        2  request               1
    7        2  payment               1
    8        2  request               2
    9        2  payment               2
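The same numbering can also be expressed without `apply`: a row's ordinal number is one plus the number of payments that come before it for the same user. The following is a minimal sketch of that alternative, not the answer's own code. It assumes the question's sample data, and `payments_so_far` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "type": ["request", "request", "request", "request", "payment",
             "request", "request", "payment", "request", "payment"],
})

# 1 on payment rows, 0 on request rows
m = df["type"].eq("payment").astype(int)

# Running count of payments per user, including the current row
payments_so_far = m.groupby(df["user_id"]).cumsum()

# Subtract the current row's own payment so only earlier payments count,
# then add 1 so numbering starts at 1.
df["ordinal_number"] = payments_so_far - m + 1

print(df)
```

As with the reverse-cumsum version, requests that follow a user's last payment receive the next number in sequence.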