Create new columns where row values are NaN

Question

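I have a column of data that contains rows with NaN values (see below). I want to split the column wherever the value is NaN and start a new column at the point where values resume after the NaN. For example, I want row 7 and the rows that follow it to form a new column, and the same again after each later NaN in the column. I have tried this, but my attempt lumped the data together.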

    Col1
0   Start
1   65
2   oft
3   23:59:02
4   12-Feb-99
5   NaN
6   NaN
7   17
8   Sparkle
9   10

I have already split the rows into groups using the code below.

df['groups'] = df.Col1.isnull().cumsum()

    Col1       groups
0   Start      0
1   65         0
2   oft        0
3   23:59:02   0
4   12-Feb-99  0
5   NaN        1
6   NaN        2
7   17         2
8   Sparkle    2
9   10         2
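
For reference, the grouping step above can be reproduced with a minimal sketch. The sample frame is rebuilt by hand from the listing in the question, with every value stored as a string (an assumption, since the real dtypes are not shown). The column names `Col1` and `groups` follow the tables above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; values are assumed to be strings.
df = pd.DataFrame(
    {
        "Col1": [
            "Start", "65", "oft", "23:59:02", "12-Feb-99",
            np.nan, np.nan, "17", "Sparkle", "10",
        ]
    }
)

# isna() marks the NaN rows; cumsum() then increases the counter at each NaN,
# so every row after a NaN shares that NaN's group number.
df["groups"] = df["Col1"].isna().cumsum()

print(df)
```

Note that each NaN row belongs to the group it opens, so the separators still have to be dropped before the groups are laid out side by side.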

I now want to stack the data into separate columns based on the group number.

Col1              Col2    Col3   ...   ColN
0   Start         NaN     NaN           ...
1   65                    17            ....
2   oft                   Sparkle       ....
3   23:59:02              10            ...
4   12-Feb-99

Answer 1

I suggest slicing the pandas DataFrame manually rather than using numpy slicing.

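import numpy as np
import pandas as pd

# Get index of Null values ("col" is the source column; assumes the default 0..n-1 index)
index = df.index[df.col.isna()].to_list()

# Each segment starts right after a NaN and ends right before the next one
starting_index = [0] + [i + 1 for i in index]
ending_index = [i - 1 for i in index] + [len(df) - 1]

n = 0

for i, j in zip(starting_index, ending_index):
    if i <= j:  # skip empty segments, e.g. between two consecutive NaNs
        n += 1
        # Create the column as object dtype so strings can be assigned without a dtype error
        df[f"col{n}"] = pd.Series(np.nan, index=df.index, dtype=object)
        # Copy the segment to the top of the new column; the remaining rows stay NaN
        df.loc[: j - i, f"col{n}"] = df.loc[i:j, "col"].values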
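
As an alternative sketch (not the method used in the answer above), the group numbers from the question can drive a pivot directly. The helper name `split_on_nan` and the output names `Col1`, `Col2`, ... are hypothetical choices for illustration. Each run of values is assumed to start at row 0 of its new column, as in the loop above.

```python
import numpy as np
import pandas as pd


def split_on_nan(s: pd.Series) -> pd.DataFrame:
    """Split s at its NaN rows and lay out each run of values as its own column."""
    keep = s.notna()
    group = s.isna().cumsum()[keep]          # group number of each non-NaN row
    values = s[keep]                         # drop the NaN separators
    pos = values.groupby(group).cumcount()   # row position inside each run
    wide = pd.DataFrame({"value": values, "group": group, "pos": pos}).pivot(
        index="pos", columns="group", values="value"
    )
    # Rename the group numbers (which may have gaps) to consecutive column names.
    wide.columns = [f"Col{k}" for k in range(1, wide.shape[1] + 1)]
    wide.index.name = None
    return wide


# Sample data copied from the question; values are assumed to be strings.
df = pd.DataFrame(
    {
        "Col1": [
            "Start", "65", "oft", "23:59:02", "12-Feb-99",
            np.nan, np.nan, "17", "Sparkle", "10",
        ]
    }
)

wide = split_on_nan(df["Col1"])
print(wide)
```

Runs shorter than the longest one are padded with NaN at the bottom, and groups that contain only a NaN separator produce no column.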

python

pandas