How to create a column that flags weight loss of 10% or more from the previous measurement (1 - weight loss; 0 - no weight loss), grouped by id?

I have a DataFrame df:

import pandas as pd
df = pd.DataFrame({"CLIENT_ID": [8222, 8222, 8222, 8222, 8300, 8300, 8300, 8300, 8300],
                   "ENCOUNTER_DATE": ['2020-01-01', '2020-03-02', '2020-04-18', '2020-07-31', '2017-06-10', '2017-09-11', '2018-02-01', '2018-04-01', '2018-05-31'],
                   "WEIGHT_KG": [56, 58, 50, 54, 71, 72, 74, 75, 65]})

The DataFrame is sorted by CLIENT_ID and ENCOUNTER_DATE:

CLIENT_ID ENCOUNTER_DATE WEIGHT_KG
8222 2020-01-01 56
8222 2020-03-02 58
8222 2020-04-18 50
8222 2020-07-31 54
8300 2017-06-10 71
8300 2017-09-11 72
8300 2018-02-01 74
8300 2018-04-01 75
8300 2018-05-31 65

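I want to create a WEIGHT_LOSS flag column that is 1 if the current WEIGHT_KG is at least 10% lower than the previous measurement for the same CLIENT_ID, and 0 otherwise, resulting in the table below: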

CLIENT_ID ENCOUNTER_DATE WEIGHT_KG WEIGHT_LOSS
8222 2020-01-01 56 0
8222 2020-03-02 58 0
8222 2020-04-18 50 1
8222 2020-07-31 54 0
8300 2017-06-10 71 0
8300 2017-09-11 72 0
8300 2018-02-01 74 0
8300 2018-04-01 75 0
8300 2018-05-31 65 1

This can probably be answered easily with df.assign, np.where, or a list comprehension.

You can group by client and use pct_change on the "WEIGHT_KG" column:

df['WEIGHT_LOSS'] = (df.groupby('CLIENT_ID')
                       ['WEIGHT_KG']
                       .pct_change() # calculate percent change
                       .lt(-0.1)     # loss if lower than -0.1 (-10%)
                       .astype(int)  # convert True/False to 1/0
                     )

Output:

   CLIENT_ID ENCOUNTER_DATE  WEIGHT_KG  WEIGHT_LOSS
0       8222     2020-01-01         56            0
1       8222     2020-03-02         58            0
2       8222     2020-04-18         50            1
3       8222     2020-07-31         54            0
4       8300     2017-06-10         71            0
5       8300     2017-09-11         72            0
6       8300     2018-02-01         74            0
7       8300     2018-04-01         75            0
8       8300     2018-05-31         65            1
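
Since the question mentions df.assign and np.where, here is a minimal sketch of the same idea written with those two, assuming the same column names as above. The names prev_weight and rel_change are only illustrative. It computes the previous weight per client with a grouped shift, so the comparison is spelled out explicitly, and it sorts first in case the rows are not already in order (the ISO-formatted date strings used here sort chronologically as text; other formats would need pd.to_datetime first).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "CLIENT_ID": [8222, 8222, 8222, 8222, 8300, 8300, 8300, 8300, 8300],
    "ENCOUNTER_DATE": ["2020-01-01", "2020-03-02", "2020-04-18", "2020-07-31",
                       "2017-06-10", "2017-09-11", "2018-02-01", "2018-04-01",
                       "2018-05-31"],
    "WEIGHT_KG": [56, 58, 50, 54, 71, 72, 74, 75, 65],
})

# Make sure "previous row" means "previous encounter of the same client".
df = df.sort_values(["CLIENT_ID", "ENCOUNTER_DATE"]).reset_index(drop=True)

# Previous weight within each client; NaN on each client's first row.
prev_weight = df.groupby("CLIENT_ID")["WEIGHT_KG"].shift()

# Relative change versus the previous measurement, e.g. 50 / 58 - 1 is about -0.138.
rel_change = df["WEIGHT_KG"] / prev_weight - 1

# Comparisons with NaN are False, so first rows get 0 without special handling.
df = df.assign(WEIGHT_LOSS=np.where(rel_change < -0.1, 1, 0))

print(df)
```

This should give the same flags as the pct_change version. The comparison is strict (<), like .lt(-0.1) above; a drop of exactly 10% can land on either side of the threshold because of floating-point rounding, so treat that boundary with care if it matters for your data.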