pandas dataframe conditional population of a new column

Question

I am working on populating a column (Trend) in a pandas DataFrame. Below is my source DataFrame. For now I have set the column to 0.

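The logic I want to use to populate the Trend column is as follows:

If df['Close'] > df.shift(1)['Down'] then 1
If df['Close'] < df.shift(1)['Up'] then -1
If neither of the above conditions is met, then df.shift(1)['Trend']. If that value is NaN, set it to 1.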

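The above in plain text:

If the current Close is greater than the previous row's value of the Down column, then 1
If the current Close is less than the previous row's value of the Up column, then -1
If neither of those conditions is met, use the previous row's value of the Trend column, as long as it is not NaN. If it is NaN, set it to 1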
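The three rules above can be sketched as an explicit row-by-row loop, which is handy as a reference when checking a vectorized solution. This is only a minimal sketch: the helper name trend_by_loop and the small sample frame are made up for illustration, and it assumes that "previous Trend" means the value already computed for the row above (not the placeholder 0).

```python
import numpy as np
import pandas as pd


def trend_by_loop(df: pd.DataFrame) -> pd.Series:
    """Apply the three rules row by row (hypothetical helper, for illustration)."""
    close = df["Close"].to_numpy()
    prev_up = df["Up"].shift(1).to_numpy()
    prev_down = df["Down"].shift(1).to_numpy()

    trend = np.empty(len(df))
    previous = np.nan  # there is no Trend value before the first row
    for i in range(len(df)):
        if close[i] > prev_down[i]:    # any comparison with NaN is False
            trend[i] = 1
        elif close[i] < prev_up[i]:
            trend[i] = -1
        else:
            # Fall back to the previous Trend, or 1 if it is NaN.
            trend[i] = 1 if np.isnan(previous) else previous
        previous = trend[i]
    return pd.Series(trend, index=df.index, name="Trend")


# Made-up sample data, not the data from the question.
sample = pd.DataFrame({
    "Close": [3.10, 3.30, 3.25, 3.00, 3.10, 3.25],
    "Up":    [np.nan, 3.05, 3.05, 3.05, 3.05, 3.05],
    "Down":  [np.nan, 3.20, 3.20, 3.20, 3.20, 3.20],
})
result = trend_by_loop(sample)
print(result.tolist())
```

The loop is slow on large frames, so it is meant as a specification to compare against rather than something to run in production.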

Update

Data as text

   Close        Up      Down  Trend
   3.138       NaN       NaN      0
   3.141       NaN       NaN      0
   3.141       NaN       NaN      0
   3.130       NaN       NaN      0
   3.110       NaN       NaN      0
   3.130  3.026432  3.214568      0
   3.142  3.044721  3.214568      0
   3.140  3.047010  3.214568      0
   3.146  3.059807  3.214568      0
   3.153  3.064479  3.214568      0
   3.173  3.080040  3.214568      0
   3.145  3.080040  3.214568      0
   3.132  3.080040  3.214568      0
   3.131  3.080040  3.209850      0
   3.141  3.080040  3.209850      0
   3.098  3.080040  3.205953      0
   3.070  3.080040  3.195226      0
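For anyone who wants to reproduce this, the text above can be loaded into a DataFrame roughly as follows. This is a sketch that assumes the whitespace-separated layout shown; the variable names raw and df are arbitrary.

```python
import io

import pandas as pd

raw = """\
Close        Up      Down  Trend
3.138       NaN       NaN      0
3.141       NaN       NaN      0
3.141       NaN       NaN      0
3.130       NaN       NaN      0
3.110       NaN       NaN      0
3.130  3.026432  3.214568      0
3.142  3.044721  3.214568      0
3.140  3.047010  3.214568      0
3.146  3.059807  3.214568      0
3.153  3.064479  3.214568      0
3.173  3.080040  3.214568      0
3.145  3.080040  3.214568      0
3.132  3.080040  3.214568      0
3.131  3.080040  3.209850      0
3.141  3.080040  3.209850      0
3.098  3.080040  3.205953      0
3.070  3.080040  3.195226      0
"""

# Columns are separated by runs of whitespace; "NaN" is parsed as missing.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df.dtypes)
```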

Expected output

Answer 1

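We can use numpy.select to pick values depending on which condition is satisfied. The existing placeholder 0 values in "Trend" are first replaced with NaN, and the result of numpy.select is then passed to fillna to fill those missing "Trend" values (this way any existing non-zero "Trend" values are not lost). Since the remaining NaN trend values must be filled with the previous "Trend" value, we use ffill, and finally fill any NaN values still left with 1.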

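import numpy as np
import pandas as pd

# 1 / -1 where a condition holds against the previous row, NaN otherwise
signal = pd.Series(np.select([df['Close'] > df['Down'].shift(),
                              df['Close'] < df['Up'].shift()],
                             [1, -1], np.nan), index=df.index)

# Treat the placeholder 0 as missing, fill from the signal, carry the last
# value forward, and default any leading NaN to 1
df['Trend'] = df['Trend'].replace(0, np.nan).fillna(signal).ffill().fillna(1)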

Output:

    Close        Up      Down  Trend
0   3.138       NaN       NaN    1.0
1   3.141       NaN       NaN    1.0
2   3.141       NaN       NaN    1.0
3   3.130       NaN       NaN    1.0
4   3.110       NaN       NaN    1.0
5   3.130  3.026432  3.214568    1.0
6   3.142  3.044721  3.214568    1.0
7   3.140  3.047010  3.214568    1.0
8   3.146  3.059807  3.214568    1.0
9   3.153  3.064479  3.214568    1.0
10  3.173  3.080040  3.214568    1.0
11  3.145  3.080040  3.214568    1.0
12  3.132  3.080040  3.214568    1.0
13  3.131  3.080040  3.209850    1.0
14  3.141  3.080040  3.209850    1.0
15  3.098  3.080040  3.205953    1.0
16  3.070  3.080040  3.195226   -1.0
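To see why only the last row ends up as -1, it can help to look at the raw numpy.select result before any filling. The sketch below uses rows 12 to 16 of the question's data; the names tail, signal and trend are just for illustration.

```python
import numpy as np
import pandas as pd

# Rows 12-16 of the question's data (row 12 only supplies the shifted values).
tail = pd.DataFrame({
    "Close": [3.132, 3.131, 3.141, 3.098, 3.070],
    "Up":    [3.080040] * 5,
    "Down":  [3.214568, 3.209850, 3.209850, 3.205953, 3.195226],
}, index=[12, 13, 14, 15, 16])

# Raw signal: 1, -1, or NaN when neither condition holds.
signal = pd.Series(
    np.select(
        [tail["Close"] > tail["Down"].shift(), tail["Close"] < tail["Up"].shift()],
        [1, -1],
        np.nan,
    ),
    index=tail.index,
)

# Carry the last known signal forward; default leading gaps to 1.
trend = signal.ffill().fillna(1)
print(pd.DataFrame({"signal": signal, "trend": trend}))
```

Only row 16 has a Close (3.070) below the previous Up (3.080040), so it is the only row where the raw signal is not NaN; every earlier row falls through to the default of 1.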
