How to subset rows of a dataframe that have values lower than a negative number in one column?
I have a dataframe with 51077 rows × 4 columns.
I need to subset the dataframe to the rows with values > 0.3 and < -0.3 in the third column.
I used the following:
df_filtered = df[np.logical_and(df["third column"] > 0.3, df["third column"] < -0.3)]
But the result only shows the column names.
I also tried:
df_filtered = df.query("third column < -0.3 & third column > 0.3")
The result is still the same.
How can I solve this?
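You were close. No single value can be both greater than 0.3 and less than -0.3, so combining the two conditions with AND always gives an empty result. Combine them with OR (|) instead: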
df_filtered = df.loc[(df['third column'] > 0.3) | (df['third column'] < -0.3)]
or
df_filtered = df[(df['third column'] > 0.3) | (df['third column'] < -0.3)]
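To make the difference between AND and OR concrete, here is a minimal sketch on a small made-up dataframe (the five values are hypothetical, only the column name comes from the question). It also shows two variants that are not part of the original answer and should give the same rows: taking the absolute value, which works because the two bounds are symmetric, and query() with the column name wrapped in backticks, which recent pandas versions accept for names containing a space.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name mirrors the question.
df = pd.DataFrame({"third column": [-0.19, -0.35, 0.06, 0.32, 0.26]})
col = df["third column"]

# AND: no value is both > 0.3 and < -0.3, so every row is dropped
# and only the column names remain.
empty = df[np.logical_and(col > 0.3, col < -0.3)]

# OR: keeps the rows that lie outside the interval [-0.3, 0.3].
df_filtered = df[(col > 0.3) | (col < -0.3)]

# Symmetric bounds, so the absolute value selects the same rows.
df_abs = df[col.abs() > 0.3]

# query(): backticks quote a column name that contains a space.
df_query = df.query("`third column` < -0.3 or `third column` > 0.3")

print(len(empty))
print(df_filtered)
```

Each comparison needs its own parentheses when you use | or &, because those operators bind more tightly than > and < in Python.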
You can also use between and invert the result:
df_filtered = df[~df['third_column'].between(-0.3, 0.3)]
Example:
>>> df
   third_column
0     -0.190030
1     -0.205187
2     -0.066776
3     -0.264480
4      0.064962
5      0.024708
6     -0.354629  # Want to keep
7     -0.180228
8      0.261640
9      0.315986  # Want to keep
>>> df[~df['third_column'].between(-0.3, 0.3)]
   third_column
6     -0.354629
9      0.315986
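For anyone who wants to reproduce this, the sketch below rebuilds the example dataframe from the values shown above and applies the inverted between mask. The extra NaN row and the notna() step are my own additions, not part of the answer. They illustrate one caveat: between returns False for missing values, so inverting it keeps NaN rows, whereas the > / < comparisons drop them.

```python
import numpy as np
import pandas as pd

# Values copied from the example above.
df = pd.DataFrame({"third_column": [
    -0.190030, -0.205187, -0.066776, -0.264480, 0.064962,
    0.024708, -0.354629, -0.180228, 0.261640, 0.315986,
]})

# between() includes both endpoints by default, so its inverse
# matches (x < -0.3) | (x > 0.3).
df_filtered = df[~df["third_column"].between(-0.3, 0.3)]

# Assumption for illustration: add one row with a missing value.
df_nan = pd.concat(
    [df, pd.DataFrame({"third_column": [np.nan]})], ignore_index=True
)
col = df_nan["third_column"]

# NaN is not "between" anything, so the inverted mask keeps it.
with_nan = df_nan[~col.between(-0.3, 0.3)]

# Requiring a non-missing value drops the NaN row again.
without_nan = df_nan[~col.between(-0.3, 0.3) & col.notna()]

print(df_filtered)
```

If the column has no missing values, the between form and the OR form select exactly the same rows.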