Python DataFrame: How to check specific columns for elements
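I want to check whether all the elements in certain columns are the number 0.
I have a data set that I read with df = pd.read_table('ad-data').
From this I got a DataFrame with the following elements: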
[0] [1] [2] [3] [4] [5] [6] [7] ....1559
[1] 3 2 3 0 0 0 0
[2] 2 3 2 0 0 0 0
[3] 3 2 2 0 0 0 0
[4] 6 7 3 0 0 0 0
[5] 3 2 1 0 0 0 0
...
3220
I want to check whether columns 4 through 1559 of the data set contain only 0 or also other values.
df['Check']=df.loc[:,4:].sum(axis=1)
You can test every element for equality with 0 and then apply all across each row:
df['all_zeros'] = (df.iloc[:, 4:1560] == 0).all(axis=1)
A small example to demonstrate it (based on columns 1 to 3 here):
import numpy as np
import pandas as pd

N = 5
df = pd.DataFrame(np.random.binomial(1, 0.4, size=(N, N)))
df['all_zeros'] = (df.iloc[:, 1:4] == 0).all(axis=1)
df
Output (the data is random, so yours will differ):
   0  1  2  3  4  all_zeros
0  0  1  1  0  0      False
1  0  0  1  1  1      False
2  0  1  1  0  0      False
3  0  0  0  0  0       True
4  1  0  0  0  0       True
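The question also asks whether the whole block of columns contains only zeros, not just each row. A minimal sketch of that, assuming a small made-up frame in place of the real data (the names block and is_zero are illustrative only): the same element-wise comparison can be reduced per row, per column, or to a single flag.

```python
import pandas as pd

# Hypothetical 4x6 frame standing in for the real data set.
df = pd.DataFrame(
    [[3, 2, 3, 0, 0, 0],
     [2, 3, 2, 0, 0, 0],
     [3, 2, 2, 0, 5, 0],
     [6, 7, 3, 0, 0, 0]]
)

block = df.iloc[:, 3:]      # columns to inspect, selected by position
is_zero = block == 0        # element-wise boolean frame

rows_all_zero = is_zero.all(axis=1)           # one flag per row
cols_all_zero = is_zero.all(axis=0)           # one flag per column
everything_zero = bool(is_zero.all().all())   # single flag for the whole block

print(everything_zero)
```

Note that iloc slices by position while loc slices by label, so the start index has to match how the columns of the real file are actually numbered.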
Update: filtering to the rows that contain non-zero values:
df_filtered = df[~df['all_zeros']]
df_filtered
Output:
   0  1  2  3  4  all_zeros
0  0  1  1  0  0      False
1  0  0  1  1  1      False
2  0  1  1  0  0      False
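Update 2: showing only the non-zero values:
# df_filtered is the filtered frame from the previous step, df[~df['all_zeros']]
pd.melt(
    df_filtered.iloc[:, 1:4].reset_index(),
    id_vars='index', var_name='column'
).query('value != 0').sort_values('index')
Output:
   index column  value
0      0      1      1
3      0      2      1
4      1      2      1
7      1      3      1
2      2      1      1
5      2      2      1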
Here is a way to check whether all of the values are zero. It is simple
and does not need the more advanced functions used in the answer above,
only basics such as filtering, if statements, loops, and variable assignment.
The first part checks whether one column contains only zeros; the
second checks whether all of the columns contain only zeros and
prints a statement with the answer.
How to check whether one column contains only zeros:
First, build a Series:
has_zero = df[4] == 0
# has_zero is a boolean Series with one entry per row:
# True where column 4 holds a 0, False otherwise
Next:
rows_which_have_zero = df[has_zero]
# keeps only the rows where column 4 is zero, as a DataFrame
Next:
total_number_rows = len(df)  # 3220 for the data set in the question
if len(rows_which_have_zero) == total_number_rows:
    print("contains only zeros")
else:
    print("contains other numbers than zero")
This simply checks whether the number of rows in rows_which_have_zero equals the number of rows in the column.
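The three steps above can be folded into one small helper. This is only a sketch under assumptions: the tiny frame is made up, and column_is_all_zero is a hypothetical name, not something pandas provides. It relies on Series.all, which returns True only when every comparison result is True.

```python
import pandas as pd

# Hypothetical stand-in for the real data; labels 4 and 5 mimic the question.
df = pd.DataFrame({4: [0, 0, 0], 5: [0, 1, 0]})


def column_is_all_zero(frame, column):
    """Return True when every value in the given column equals 0."""
    return bool((frame[column] == 0).all())


for column in df.columns:
    if column_is_all_zero(df, column):
        print(column, "contains only zeros")
    else:
        print(column, "contains other numbers than zero")
```

This avoids building the filtered DataFrame and comparing lengths, while giving the same answer for each column.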
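How to check whether all of the columns contain only zeros:
It wraps the check above in a while loop over the columns.
first_column = 4      # the first 3 columns do not matter, so start at column 4
last_column = 1559
total_number_rows = len(df)
no_of_columns_with_only_zero = 0
value_1 = first_column
while value_1 <= last_column:
    has_zero = df[value_1] == 0
    rows_which_have_zero = df[has_zero]
    if len(rows_which_have_zero) == total_number_rows:
        no_of_columns_with_only_zero += 1
    value_1 += 1
Then check whether every column that was inspected contains only zeros:
no_of_columns_checked = last_column - first_column + 1
if no_of_columns_with_only_zero == no_of_columns_checked:
    print("there are only zero values")
else:
    print("there are numbers which are not zero")
The above checks whether no_of_columns_with_only_zero equals the number of columns inspected (columns 4 to 1559; the first 3 columns are skipped because they do not need to be checked).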
Update:
# if the column labels are strings instead of ints, index with str(value_1), e.g. df[str(value_1)]
# keep value_1 itself as an int so that it can still be incremented with += 1
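If the labels are strings, converting a counter back and forth can be avoided by looping over the labels themselves. The sketch below assumes a made-up frame with string labels; columns_to_check and non_zero_columns are illustrative names. It skips the first 3 columns by position and records which of the remaining columns hold a non-zero value.

```python
import pandas as pd

# Hypothetical frame with string column labels; the first 3 columns are ignored.
df = pd.DataFrame(
    {"a": [3, 2], "b": [2, 3], "c": [3, 2], "d": [0, 0], "e": [0, 7], "f": [0, 0]}
)

columns_to_check = df.columns[3:]   # skip the first 3 columns by position
non_zero_columns = []               # labels of columns that hold a non-zero value

for label in columns_to_check:
    if not (df[label] == 0).all():
        non_zero_columns.append(label)

if non_zero_columns:
    print("there are numbers which are not zero in:", non_zero_columns)
else:
    print("there are only zero values")
```

Because the loop uses whatever labels df.columns contains, it works the same way for int and str labels.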