Iterate over rows in pandas df
I have a df like the one shown below:
CX CY CS
97539 0.39896 0.7787 0
97540 0.39896 0.7787 0
97541 0.39896 0.7787 0
97542 0.39896 0.7787 0
97543 0.39896 0.7787 0
97544 0.39896 0.7787 0
97545 0.39896 0.7787 0
97546 0.39896 0.7787 0
97547 0.39896 0.7787 0
97548 0.39896 0.7787 0
97549 0.39896 0.7787 0
97550 0.39896 0.7787 0
97551 0.39896 0.7787 0
97552 0.39896 0.7787 0
97553 0.39896 0.7787 0
97554 0.39896 0.7787 0
97555 0.39896 0.7787 0
97556 0.39896 0.7787 0
97557 0.39896 0.7787 0
97558 0.39896 0.7787 0
97559 0.39896 0.7787 0
97560 0.39896 0.7787 0
97561 0.39896 0.7787 1
97562 0.39896 0.7787 0
97563 0.39896 0.7787 0
97564 0.39896 0.7787 0
97565 0.39896 0.7787 0
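I want to keep only the part of the df up to the row where the value in the 'CS' column becomes 1, and drop the remaining rows. So I would like something like this: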
CX CY CS
97539 0.39896 0.7787 0
97540 0.39896 0.7787 0
97541 0.39896 0.7787 0
97542 0.39896 0.7787 0
97543 0.39896 0.7787 0
97544 0.39896 0.7787 0
97545 0.39896 0.7787 0
97546 0.39896 0.7787 0
97547 0.39896 0.7787 0
97548 0.39896 0.7787 0
97549 0.39896 0.7787 0
97550 0.39896 0.7787 0
97551 0.39896 0.7787 0
97552 0.39896 0.7787 0
97553 0.39896 0.7787 0
97554 0.39896 0.7787 0
97555 0.39896 0.7787 0
97556 0.39896 0.7787 0
97557 0.39896 0.7787 0
97558 0.39896 0.7787 0
97559 0.39896 0.7787 0
97560 0.39896 0.7787 0
97561 0.39896 0.7787 1
Is there a way to do this? Note that the 1 can be in any row, so I can't just use .iloc(). Ideally, I would like to avoid iterrows().
If there is always at least one 1, you can compare the values with Series.eq, get the index of the first 1 with Series.idxmax, and finally filter with DataFrame.loc:
df1 = df.loc[: df['CS'].eq(1).idxmax()]
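To make the steps concrete, here is a minimal sketch on a small stand-in frame. The six rows and the position of the 1 are made up for illustration; only the column names and index style follow the question.

```python
import pandas as pd

# Small stand-in for the question's frame (row count and the position
# of the 1 are assumptions made for this example).
df = pd.DataFrame(
    {"CX": [0.39896] * 6, "CY": [0.7787] * 6, "CS": [0, 0, 0, 1, 0, 0]},
    index=range(97539, 97545),
)

# eq(1) gives a boolean Series; idxmax returns the index label of the
# first True, i.e. the first row where CS is 1.
first_one = df["CS"].eq(1).idxmax()

# .loc slices by label and includes the end label, so the row holding
# the 1 is kept and everything after it is dropped.
df1 = df.loc[:first_one]
print(df1)
```

Because the slice is label-based, this relies on the index labels being unique, as they are in the question's data.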
If there may be no 1 at all, the following solution still works and returns an empty DataFrame in that case:
m = df['CS'].eq(1)
df1 = df.loc[: m.idxmax()] if m.any() else pd.DataFrame()
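The guard matters because idxmax on an all-False mask returns the first index label, so without it the slice would silently keep just the first row. The sketch below wraps the same logic in a hypothetical helper, keep_until_first_one (the name is not from the answer). As a variation on the answer, it returns df.iloc[:0] rather than pd.DataFrame(), on the assumption that an empty result with the original columns is more convenient downstream.

```python
import pandas as pd


def keep_until_first_one(df, col="CS"):
    """Hypothetical helper: rows up to and including the first 1 in `col`."""
    m = df[col].eq(1)
    if not m.any():
        # No 1 anywhere: return an empty frame that keeps the original
        # columns (a variation on the answer's pd.DataFrame()).
        return df.iloc[:0]
    return df.loc[: m.idxmax()]


df_with_one = pd.DataFrame({"CX": [0.1, 0.2, 0.3], "CS": [0, 1, 0]})
df_without_one = pd.DataFrame({"CX": [0.1, 0.2], "CS": [0, 0]})

print(keep_until_first_one(df_with_one))     # first two rows
print(keep_until_first_one(df_without_one))  # empty, columns preserved
```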
Alternatively, use the Series.cummax trick with boolean indexing; the order only needs to be reversed twice:
df1 = df[df['CS'].iloc[::-1].eq(1).cummax().iloc[::-1]]
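One thing to be aware of: the reversed cummax keeps every row up to the last 1, which matches the idxmax solutions only when the column contains a single 1, as in the question's sample. The sketch below uses made-up data with two 1s to show the difference, along with a forward variant that stops at the first 1. That variant is an assumption of mine, not part of the answer.

```python
import pandas as pd

# Made-up column with two 1s to expose the difference.
df = pd.DataFrame({"CS": [0, 1, 0, 1, 0]})

# Reversed cummax (as in the answer): True for every row at or before
# the LAST 1.
through_last = df[df["CS"].iloc[::-1].eq(1).cummax().iloc[::-1]]

# Forward variant: cummax marks rows at or after the first 1; shifting
# by one row and negating keeps rows up to and including the FIRST 1.
seen_before = df["CS"].eq(1).cummax().shift(fill_value=False)
through_first = df[~seen_before]

print(through_last.index.tolist())
print(through_first.index.tolist())
```

If the column has no 1 at all, the reversed version returns an empty frame, while the forward variant keeps every row, so pick whichever behavior suits the data.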