Pandas select rows where relative column values exist in DataFrame
Suppose you have a DataFrame like this:
>>> import random
>>> import pandas as pd
>>> df = pd.DataFrame({
...     'epoch_minute': [i for i in reversed(range(25090627, 25635267))],
...     'count': [random.randint(11, 35) for _ in range(25090627, 25635267)]})
>>> df.head()
   epoch_minute  count
0      25635266     12
1      25635265     20
2      25635264     33
3      25635263     11
4      25635262     35
and some relative epoch-minute deltas, like these:
day = 1440
week = 10080
month = 302400
How can I accomplish the equivalent of this code block:
for i, r in df.iterrows():
    if r['epoch_minute'] - day in df['epoch_minute'].values and \
       r['epoch_minute'] - week in df['epoch_minute'].values and \
       r['epoch_minute'] - month in df['epoch_minute'].values:
        # do stuff
        pass
using syntax like this:
valid_rows = df.loc[(df['epoch_minute'] == df['epoch_minute'] - day) &
                    (df['epoch_minute'] == df['epoch_minute'] - week) &
                    (df['epoch_minute'] == df['epoch_minute'] - month)]
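I understand why the loc selection above doesn't work; I'm just asking whether there is a more elegant way to select the valid rows without iterating over the rows of the DataFrame.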
Add parentheses and & for bitwise AND, and use isin to check membership:
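valid_rows = df[(df['epoch_minute'].isin(df['epoch_minute'] - day)) &
                (df['epoch_minute'].isin(df['epoch_minute'] - week)) &
                (df['epoch_minute'].isin(df['epoch_minute'] - month))]
Note that this first form tests whether each value appears among the shifted values, i.e. whether epoch_minute + day (and so on) exists. To match the loop in the question, which tests whether epoch_minute - day exists, subtract first and then check membership:
valid_rows = df[((df['epoch_minute'] - day).isin(df['epoch_minute'])) &
                ((df['epoch_minute'] - week).isin(df['epoch_minute'])) &
                ((df['epoch_minute'] - month).isin(df['epoch_minute']))]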
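To see that the subtract-then-isin form agrees with the loop from the question, here is a minimal sketch on a tiny frame. The data and the offsets `short`, `medium` and `long_` are made up for illustration (they stand in for day, week and month), and the gaps in `epoch_minute` are deliberate so that only some rows qualify.

```python
import pandas as pd

# Small hypothetical frame with gaps, so that not every row qualifies.
df = pd.DataFrame({
    'epoch_minute': [100, 99, 98, 95, 94, 90],
    'count': [12, 20, 33, 11, 35, 17],
})

# Hypothetical small offsets standing in for day / week / month.
short, medium, long_ = 1, 4, 5

minutes = df['epoch_minute']

# Vectorized: subtract the offset first, then test membership.
mask = (
    (minutes - short).isin(minutes)
    & (minutes - medium).isin(minutes)
    & (minutes - long_).isin(minutes)
)
valid_rows = df[mask]

# Reference result from the row-by-row loop in the question.
existing = set(minutes)
loop_index = [
    i for i, r in df.iterrows()
    if r['epoch_minute'] - short in existing
    and r['epoch_minute'] - medium in existing
    and r['epoch_minute'] - long_ in existing
]

print(valid_rows)
print(loop_index)
```

Only the row with `epoch_minute == 99` survives, because 98, 95 and 94 are all present in the column. On the full frame from the question the same mask applies unchanged with `day`, `week` and `month`.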