Figuring out if all data in a pandas DataFrame row is the same except for a particular column

Question

With help from SO, I was able to get this working:

# Input string changes but the one I need is always passed in
input_string = "random string"

df.loc[input_string, df.drop(input_string).eq(111).all()] = 111

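The code above essentially takes a column and checks whether every cell in that column equals 111, except for the cell in the row specified by input_string. If so, it sets that cell to 111 too.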
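To make that behaviour concrete, here is a minimal sketch on a made-up frame. The row labels, column names, and values are assumptions for illustration and do not come from my real data:

```python
import pandas as pd

# Hypothetical data: index labels and column names are invented for illustration.
df = pd.DataFrame(
    {"x": [111, 111, 5], "y": [111, 111, 111], "z": [111, 7, 9]},
    index=["first", "second", "random string"],
)
input_string = "random string"

# Drop the row named by input_string, then test each column:
# True where every remaining cell in that column equals 111.
cols_all_111 = df.drop(input_string).eq(111).all()

# In the input_string row, write 111 only into the columns that passed.
df.loc[input_string, cols_all_111] = 111
```

Here columns x and y pass the check, so their cells in the "random string" row become 111, while z is left alone because its "second" row holds 7.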

How do I do this for a row?

Answer 1

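I think you have your logic backwards. What you are currently doing is actually dropping a row and checking whether the other rows meet your condition. See below: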

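import pandas as pd

# Column case: the input is a column. For each row, check that all
# other columns equal 111, then set the input column to 111 in those rows.
df = pd.DataFrame({'a': [111] * 10, 'b': [111] * 10})
df_col = df.copy()
input_col = 'a'
df_col[input_col] = 222  # distort the input; the next step should undo this
df_col.loc[df_col.drop(input_col, axis=1).eq(111).all(axis=1), input_col] = 111

# Row case: the input is a row. For each column, check that all
# other rows equal 111, then set the input row to 111 in those columns.
df_row = df.copy()
input_row = 2
df_row.iloc[input_row] = 222  # distort the input; the next step should undo this
df_row.loc[input_row, df_row.drop(input_row).eq(111).all()] = 111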
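Because the frame above is all 111, every row and column passes the check. The following sketch uses mixed values to show the per-row check being selective. The helper name `fill_if_rest_equal` and the data are hypothetical and only for illustration:

```python
import pandas as pd


def fill_if_rest_equal(frame, col, value):
    """Return a copy of `frame` where `col` is set to `value` in every row
    whose other columns all equal `value`. Hypothetical helper."""
    out = frame.copy()
    # Boolean Series indexed by row: True where all columns except `col` match.
    rest_matches = out.drop(columns=col).eq(value).all(axis=1)
    out.loc[rest_matches, col] = value
    return out


# Hypothetical data: only the first row has 111 in every column other than 'a'.
df = pd.DataFrame({"a": [222, 222, 222], "b": [111, 111, 5], "c": [111, 3, 111]})
result = fill_if_rest_equal(df, "a", 111)
```

Only the first row of `result` gets 111 in column a. The other two rows keep 222 because one of their remaining columns differs, and the original `df` is not modified since the helper works on a copy.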

Tags: python, dataframe, pandas, python-3.7, pandas-loc