How can I change NaN or string values to the mean of the column they belong to?

Question

df = pd.read_csv(self.table_name)

for j in df.values:
    for k in j[0:-1]:
        try:
            k = float(k)
        except ValueError:
            df.replace(to_replace=k,value=np.nan,inplace=True)

df.replace(to_replace=np.nan, value=df.mean(), inplace=True)



# df.fillna(df.mean(), inplace=True)
df.to_csv(self.table_name, index=False)
print(df)

If the data contains string values, it may not be usable for training. To prevent this, I wrote a function that is triggered by a button. On the first run, however, the string values are only deleted, and I get the result I want only on the second run. I built the button and its handler with PyQt5: when the user clicks the button, this function is called. I could not work out where the problem is. Can anyone help?

Answer 1

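You can apply a custom function to all columns except the first, selected with DataFrame.iloc. The function converts the values to numeric with to_numeric and errors='coerce', which produces missing values for anything that cannot be parsed, and then replaces those missing values with the column mean using Series.fillna: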

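def f(x):
    # Convert to numbers; values that cannot be parsed become NaN
    s = pd.to_numeric(x, errors='coerce')
    # Fill the missing values with the mean of the column
    return s.fillna(s.mean())

# Apply to every column except the first
df.iloc[:, 1:] = df.iloc[:, 1:].apply(f)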
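A likely explanation for needing two runs with the code in the question: after the strings are replaced with NaN, the affected columns still have object dtype, so df.mean() does not produce a mean for them and the NaN values stay. Only after the CSV is saved and read again are those columns parsed as floats, which is why the second run works. The sketch below runs the to_numeric approach end to end in a single pass. The sample data, the column names (id, a, b) and the function name to_numeric_fill_mean are made up for illustration and stand in for the real CSV file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the CSV file from the question.
csv_text = """id,a,b
x1,1.0,10
x2,oops,20
x3,3.0,
x4,5.0,bad
"""

df = pd.read_csv(io.StringIO(csv_text))


def to_numeric_fill_mean(col):
    # Unparseable strings become NaN, then every NaN gets the column mean.
    s = pd.to_numeric(col, errors="coerce")
    return s.fillna(s.mean())


cleaned = df.copy()
value_cols = cleaned.columns[1:]  # every column except the first
# Assigning by column label replaces the columns, so they end up as float64.
cleaned[value_cols] = cleaned[value_cols].apply(to_numeric_fill_mean)

print(cleaned)
print(cleaned.dtypes)
```

The sketch assigns by column label rather than through iloc. Depending on the pandas version, an iloc assignment may write the new values into the existing object columns and leave their dtype unchanged, so it is worth checking df.dtypes after the conversion.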

How can I change NaN or string values to the average of the column to which they belong?

python

numpy

python-3.x

pandas

pyqt5