Get the last condition value for each pandas column value

Question

I have a DataFrame like this:

date_from	date_to	item_id	VALUE_NEW	VALUE_OLD	cost_var
1/1/1900 00:00:00	11/3/2022 15:31:18	452953	5366,46	4024,71	33.34%
11/3/2022 15:31:18	1/1/2200 00:00:00	452953	9122,57	5366,46	69.99%
1/1/1900 00:00:00	11/3/2022 15:31:18	452954	5366,46	4024,71	33.34%
11/3/2022 15:31:18	1/1/2200 00:00:00	452954	9122,57	5366,46	69.99%
1/1/1900 00:00:00	21/7/2021 16:30:46	452961	6170,98	4024,71	53.33%
21/7/2021 16:30:46	11/3/2022 15:31:09	452961	5312	6170,98	13.92%
11/3/2022 15:31:09	1/1/2200 00:00:00	452961	9122,57	5312	71.74%
1/1/1900 00:00:00	13/10/2021 14:39:55	801286	4052,1	1332,8	204.03%
13/10/2021 14:39:55	13/10/2021 14:43:09	801286	4,4732	4052,1	99.89%
13/10/2021 14:43:09	3/2/2022 17:16:23	801286	4473,2	4,4732	99900.00%
3/2/2022 17:16:23	1/1/2200 00:00:00	801286	4946,8	4473,2	10.59%

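I need to check each item_id and get the last row where cost_var > 60%. If that row is the item's last row, that is fine, but if it is followed by a row with cost_var < 60%, I have to discard it. The output should look like this: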

date_from	date_to	item_id	VALUE_NEW	VALUE_OLD	cost_var
11/3/2022 15:31:18	1/1/2200 00:00:00	452953	9122,57	5366,46	69.99%
11/3/2022 15:31:18	1/1/2200 00:00:00	452954	9122,57	5366,46	69.99%
11/3/2022 15:31:09	1/1/2200 00:00:00	452961	9122,57	5312	71.74%

Item 801286 returns nothing, because its last row above 60% (99900.00%) is followed by a row with cost_var < 60% (10.59%). Is there a way to do this? I couldn't find a solution.

Answer 1

Try this:

import pandas as pd

# Read the table from the clipboard (copy the data above first)
df = pd.read_clipboard()
# Take the last row of each item_id, then keep only those where cost_var > 60%
df.groupby(df.item_id, as_index=False).last().query("cost_var.str.rstrip('%').astype('float')>60", engine='python')
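
This works because the requirement reduces to one check: an item qualifies only if its final row is above 60%. For anyone who wants to run it without the clipboard, here is a minimal sketch that rebuilds the sample data inline. It assumes the rows are already in chronological order within each item_id, as in the question. The helper column `cost_pct` is my own name, not part of the original data. It also uses `tail(1)` instead of `last()`, since `last()` takes the last non-null value per column, which can differ from the actual last row if the data contains NaN.

```python
import pandas as pd

columns = ["date_from", "date_to", "item_id", "VALUE_NEW", "VALUE_OLD", "cost_var"]
rows = [
    ("1/1/1900 00:00:00", "11/3/2022 15:31:18", 452953, "5366,46", "4024,71", "33.34%"),
    ("11/3/2022 15:31:18", "1/1/2200 00:00:00", 452953, "9122,57", "5366,46", "69.99%"),
    ("1/1/1900 00:00:00", "11/3/2022 15:31:18", 452954, "5366,46", "4024,71", "33.34%"),
    ("11/3/2022 15:31:18", "1/1/2200 00:00:00", 452954, "9122,57", "5366,46", "69.99%"),
    ("1/1/1900 00:00:00", "21/7/2021 16:30:46", 452961, "6170,98", "4024,71", "53.33%"),
    ("21/7/2021 16:30:46", "11/3/2022 15:31:09", 452961, "5312", "6170,98", "13.92%"),
    ("11/3/2022 15:31:09", "1/1/2200 00:00:00", 452961, "9122,57", "5312", "71.74%"),
    ("1/1/1900 00:00:00", "13/10/2021 14:39:55", 801286, "4052,1", "1332,8", "204.03%"),
    ("13/10/2021 14:39:55", "13/10/2021 14:43:09", 801286, "4,4732", "4052,1", "99.89%"),
    ("13/10/2021 14:43:09", "3/2/2022 17:16:23", 801286, "4473,2", "4,4732", "99900.00%"),
    ("3/2/2022 17:16:23", "1/1/2200 00:00:00", 801286, "4946,8", "4473,2", "10.59%"),
]
df = pd.DataFrame(rows, columns=columns)

# Turn "69.99%" into 69.99 so the threshold comparison is numeric
# (cost_pct is a helper column introduced for this sketch).
df["cost_pct"] = df["cost_var"].str.rstrip("%").astype(float)

# Keep the actual last row of each item_id, in the order the rows appear.
last_rows = df.groupby("item_id").tail(1)

# Keep only items whose final row is above 60%, and drop the helper column.
result = last_rows.loc[last_rows["cost_pct"] > 60, columns]
print(result)
```

On the sample data this keeps items 452953, 452954 and 452961 and drops 801286, whose final row is 10.59%. If your rows are not already ordered, you would need to parse date_from (day-first format) and sort by it before grouping.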

