nth occurrence of a value

  user_login  login_type login_time
0          a           0   14:00:00
1          b           0   08:20:03
2          c           1   09:10:03
3          b           1   10:49:03
4          a           1   11:19:03
5          a           1   12:29:03
6          c           0   13:39:03
7          c           1   14:49:03

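I have this df1. I want to find the 2nd occurrence of each user_login and, if the corresponding value in login_type is 1, put its login_time into a new column. The final result should look like this: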

user_login  login_type  login_time  2nd_login_time
a               0        14:00:00   No 2nd login_time
b               0        08:20:03   No 2nd login_time
c               1        09:10:03   No 2nd login_time
b               1        10:49:03   10:49:03
a               1        11:19:03   11:19:03
a               1        12:29:03   No 2nd login_time
c               0        13:39:03   No 2nd login_time
c               1        14:49:03   No 2nd login_time

Any ideas on how to do this in pandas?

Use cumcount to get the position of each row within its user_login group, and combine that with the login_type condition. Then set the new values with loc:

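# True where the row is the 2nd occurrence of its user_login and login_type is 1
m = (df.groupby('user_login').cumcount() == 1) & (df['login_type'] == 1)

# Copy login_time into the new column for matching rows; the rest stay NaN
df.loc[m, 'new'] = df['login_time']
print (df)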
  user_login  login_type login_time       new
0          a           0   14:00:00       NaN
1          b           0   08:20:03       NaN
2          c           1   09:10:03       NaN
3          b           1   10:49:03  10:49:03
4          a           1   11:19:03  11:19:03
5          a           1   12:29:03       NaN
6          c           0   13:39:03       NaN
7          c           1   14:49:03       NaN
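
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the question's sample data. Storing login_time as plain strings is an assumption, since the question does not show the column's dtype.

```python
import pandas as pd

# Sample data copied from the question; login_time kept as strings (assumption).
df = pd.DataFrame({
    "user_login": list("abcbaacc"),
    "login_type": [0, 0, 1, 1, 1, 1, 0, 1],
    "login_time": ["14:00:00", "08:20:03", "09:10:03", "10:49:03",
                   "11:19:03", "12:29:03", "13:39:03", "14:49:03"],
})

# cumcount() numbers the rows of each user_login group starting at 0,
# so == 1 selects the 2nd occurrence of each user.
is_second = df.groupby("user_login").cumcount() == 1

# Keep only those 2nd occurrences whose login_type is 1.
m = is_second & (df["login_type"] == 1)

# Rows outside the mask are left as NaN in the new column.
df.loc[m, "new"] = df["login_time"]
print(df)
```

Note that cumcount follows row order, not chronological order. If "2nd occurrence" should mean the 2nd login by time, sorting by login_time first would probably be needed.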

If you want to set both values (the login time or a fallback label):

import numpy as np
# login_time where the mask is True, the fallback label everywhere else
df['new'] = np.where(m, df['login_time'], 'No 2nd login_time')
print (df)
  user_login  login_type login_time                new
0          a           0   14:00:00  No 2nd login_time
1          b           0   08:20:03  No 2nd login_time
2          c           1   09:10:03  No 2nd login_time
3          b           1   10:49:03           10:49:03
4          a           1   11:19:03           11:19:03
5          a           1   12:29:03  No 2nd login_time
6          c           0   13:39:03  No 2nd login_time
7          c           1   14:49:03  No 2nd login_time
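
As a possible alternative that avoids the NumPy import, Series.where should give the same column. This is not part of the original answer, only a sketch under the same assumptions (sample data from the question, login_time stored as strings).

```python
import pandas as pd

# Sample data copied from the question; login_time kept as strings (assumption).
df = pd.DataFrame({
    "user_login": list("abcbaacc"),
    "login_type": [0, 0, 1, 1, 1, 1, 0, 1],
    "login_time": ["14:00:00", "08:20:03", "09:10:03", "10:49:03",
                   "11:19:03", "12:29:03", "13:39:03", "14:49:03"],
})

# 2nd occurrence per user_login, restricted to login_type == 1.
m = (df.groupby("user_login").cumcount() == 1) & (df["login_type"] == 1)

# Series.where keeps the original value where the mask is True and
# substitutes the fallback label where it is False.
df["new"] = df["login_time"].where(m, "No 2nd login_time")
print(df)
```

Unlike np.where, which returns a bare array, Series.where returns a Series that keeps the original index.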

Details:

print (df.groupby('user_login').cumcount())
0    0
1    0
2    0
3    1
4    1
5    2
6    1
7    2
dtype: int64

print (m)
0    False
1    False
2    False
3     True
4     True
5    False
6    False
7    False
dtype: bool
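
Since the title asks about the nth occurrence, the same idea can be generalized by comparing cumcount with n - 1. The helper name nth_occurrence_mask below is hypothetical (it is not a pandas function), and the sample data is again rebuilt from the question with login_time as strings.

```python
import pandas as pd


def nth_occurrence_mask(frame, key, n):
    """Return a boolean Series that is True on the n-th (1-based) row of each `key` group.

    Hypothetical helper for illustration; rows are counted in their current order.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    # cumcount() is 0-based, so the n-th occurrence has count n - 1.
    return frame.groupby(key).cumcount() == n - 1


# Sample data copied from the question; login_time kept as strings (assumption).
df = pd.DataFrame({
    "user_login": list("abcbaacc"),
    "login_type": [0, 0, 1, 1, 1, 1, 0, 1],
    "login_time": ["14:00:00", "08:20:03", "09:10:03", "10:49:03",
                   "11:19:03", "12:29:03", "13:39:03", "14:49:03"],
})

# 3rd occurrence of each user; users with fewer than 3 rows never match.
third = nth_occurrence_mask(df, "user_login", 3)
print(df.loc[third, ["user_login", "login_time"]])
```

With n=2 and the extra login_type == 1 condition, this reproduces the mask m shown above.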