pandas: fillna with a namedtuple or any other class instance

Is there a way to fill NaN values in a pandas DataFrame with a namedtuple?

I get this TypeError:

from collections import namedtuple
import pandas as pd
import numpy as np

df = pd.DataFrame([0, 0, 0, 0, np.nan, 0, 0, 0])

nametup = namedtuple('mynp', ['arg1', 'arg2'])
q = nametup(None, None)
df.fillna(q)

Traceback (most recent call last):
  File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-25-363ec560dd77>", line 9, in <module>
    df.fillna(q)
  File "C:\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2762, in fillna
    downcast=downcast, **kwargs)
  File "C:\Anaconda2\lib\site-packages\pandas\core\generic.py", line 3101, in fillna
    'you passed a "{0}"'.format(type(value).__name__))
TypeError: "value" parameter must be a scalar or dict, but you passed a "mynp"

I also tried this:

df.replace(np.nan, q)
Traceback (most recent call last):
  File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-31-6f8a86f11bbb>", line 1, in <module>
    df.replace(np.nan, q)
  File "C:\Anaconda2\lib\site-packages\pandas\core\generic.py", line 3440, in replace
    raise TypeError(msg)  # pragma: no cover
TypeError: Invalid "to_replace" type: 'float'

Is there any workaround? Thanks!

Not easily. You need to build a Series that holds the object in every row and then use it to replace the NaN values:

nametup = namedtuple('mynp', ['arg1', 'arg2'])
q = nametup(None, None)

s = pd.Series([q]*len(df.index))
print (s)
0    (None, None)
1    (None, None)
2    (None, None)
3    (None, None)
4    (None, None)
5    (None, None)
6    (None, None)
7    (None, None)
dtype: object
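
A likely reason a Series is needed at all: a namedtuple instance is a tuple, and pandas classifies tuples as list-like rather than scalar, which matches the "must be a scalar or dict" message in the question. A small check, as a sketch (MyNP is only an illustrative name for the class from the question):

```python
from collections import namedtuple

import pandas as pd

# Same namedtuple as in the question; MyNP is an illustrative variable name.
MyNP = namedtuple("mynp", ["arg1", "arg2"])
q = MyNP(None, None)

# A namedtuple instance is a subclass of tuple ...
is_tuple = isinstance(q, tuple)
# ... so pandas does not consider it a scalar ...
is_scalar = pd.api.types.is_scalar(q)
# ... and treats it as a list-like container instead.
is_list_like = pd.api.types.is_list_like(q)

print(is_tuple, is_scalar, is_list_like)
```

Wrapping the object in a Series sidesteps this, because each tuple then sits in its own cell of an object-dtype column.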

Solution with mask:

df[0] = df[0].mask(df[0].isnull(), s)
print (df)
              0
0             0
1             0
2             0
3             0
4  (None, None)
5             0
6             0
7             0
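
One caveat: mask aligns the replacement Series on index labels, and s above only lines up because both df and s use the default integer index. A minimal sketch for a frame with a different index, assuming a hypothetical column named "val" and passing index=df.index explicitly:

```python
from collections import namedtuple

import numpy as np
import pandas as pd

MyNP = namedtuple("mynp", ["arg1", "arg2"])  # illustrative name
q = MyNP(None, None)

# Hypothetical frame with string labels instead of the default 0..n-1 index.
df = pd.DataFrame({"val": [0.0, np.nan, 0.0]}, index=["a", "b", "c"])

# Build the filler on the same index so that mask() can align it row by row.
filler = pd.Series([q] * len(df), index=df.index)

# Cast to object first so the column can hold both numbers and tuples.
col = df["val"].astype(object)
filled = col.mask(col.isna(), filler)

print(filled)
```

Without index=df.index the filler would carry labels 0, 1, 2, none of which match "a", "b", "c", so the NaN would not be replaced by q.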

Another solution with combine_first or fillna, using the Series s:

df[0] = df[0].combine_first(s)
#similar solution
#df[0] = df[0].fillna(s)
print (df)
              0
0             0
1             0
2             0
3             0
4  (None, None)
5             0
6             0
7             0
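
If this comes up often, the idea can be wrapped in a small helper. The sketch below is one possible variant, not a pandas API: fill_missing_with is a hypothetical name, and it rebuilds the column with a plain Python loop instead of mask or combine_first, so no index alignment is involved.

```python
from collections import namedtuple

import numpy as np
import pandas as pd


def fill_missing_with(series, obj):
    """Return an object-dtype copy of `series` with missing entries set to `obj`.

    Hypothetical helper for illustration; every filled cell refers to the
    same `obj`, which is harmless for immutable values such as namedtuples.
    """
    missing = series.isna()
    values = [obj if is_na else value for value, is_na in zip(series, missing)]
    return pd.Series(values, index=series.index, dtype=object, name=series.name)


MyNP = namedtuple("mynp", ["arg1", "arg2"])  # illustrative name
q = MyNP(None, None)

original = pd.Series([0.0, np.nan, 0.0], name="val")
result = fill_missing_with(original, q)

print(result)
```

The helper leaves the input Series untouched and returns a new one, so the result can be assigned back with df[0] = fill_missing_with(df[0], q).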