Python: Fill a specific column with the same value in a DataFrame and remove the useless rows

Suppose I have this DataFrame:

import pandas as pd

data3 = ['ID','ID','','','','','']
data4 = [12,34,465,678,896,'','']
data5 = [8798,67,2313,'','','','']
data6 = [56,67,'','','','','']

df2 = pd.DataFrame(list(zip(data3,data4,data5,data6)),columns = ['Name','Data1','Data2','Data3'])
print(df2)

  Name Data1 Data2 Data3
0   ID    12  8798    56
1   ID    34    67    67
2        465  2313
3        678
4        896
5
6

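I want to fill the 'Name' column with the value it already contains, for every row that has data, and delete the useless rows that contain nothing. So I would like to get this result: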

  Name Data1 Data2 Data3
0   ID    12   8798   56
1   ID    34   67     67
2   ID    465  2313
3   ID    678
4   ID    896

Does anyone have an efficient way to do this?

Thanks

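Use DataFrame.replace if the blanks are empty strings rather than NaNs, then DataFrame.dropna to remove the rows that are entirely missing, and finally forward-fill the missing values in the Name column with ffill: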

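import numpy as np

# Turn empty strings into NaN so pandas treats them as missing
df2 = df2.replace('', np.nan)

# Drop rows in which every column is NaN
df2 = df2.dropna(how='all')
# Carry the last seen Name value down into the gaps
df2['Name'] = df2['Name'].ffill()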
print(df2)
  Name  Data1   Data2  Data3
0   ID   12.0  8798.0   56.0
1   ID   34.0    67.0   67.0
2   ID  465.0  2313.0    NaN
3   ID  678.0     NaN    NaN
4   ID  896.0     NaN    NaN
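
For reference, the three steps above can be put together as one self-contained script. This is only a minimal sketch: the helper name fill_name_and_drop_empty and its column parameter are made up for illustration and are not part of pandas. It uses assign instead of item assignment so the input frame is left untouched.

```python
import numpy as np
import pandas as pd


def fill_name_and_drop_empty(df, column="Name"):
    """Hypothetical helper (not a pandas function).

    Blank strings become NaN, rows that are entirely NaN are dropped,
    and `column` is forward-filled.
    """
    # Empty strings -> NaN, then drop rows where every column is missing
    out = df.replace("", np.nan).dropna(how="all")
    # Forward-fill the chosen column; assign returns a new DataFrame
    return out.assign(**{column: out[column].ffill()})


# Same data as in the question
df2 = pd.DataFrame({
    "Name":  ["ID", "ID", "", "", "", "", ""],
    "Data1": [12, 34, 465, 678, 896, "", ""],
    "Data2": [8798, 67, 2313, "", "", "", ""],
    "Data3": [56, 67, "", "", "", "", ""],
})

result = fill_name_and_drop_empty(df2)
print(result)
```

The order matters: if Name were forward-filled before dropna, the two empty rows at the bottom would receive 'ID' and would no longer count as entirely missing. Depending on the pandas version, the numeric columns may print as floats (as in the output above) or stay as object columns.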

You can use df.replace, then isna() with all() to drop the rows in which every value is NaN, and ffill() to fill the remaining NaNs in Name:

In [2731]: df2 = df2.replace('', np.nan)
In [2756]: df2 = df2[~df2.isna().all(axis=1)]
In [2733]: df2.Name = df2.Name.ffill()

In [2758]: df2
Out[2758]: 
  Name  Data1   Data2  Data3
0   ID   12.0  8798.0   56.0
1   ID   34.0    67.0   67.0
2   ID  465.0  2313.0    NaN
3   ID  678.0     NaN    NaN
4   ID  896.0     NaN    NaN
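
To make the boolean mask in this variant easier to follow, here is a small sketch that builds it in separate steps. The variable names cleaned, empty_rows and kept are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df2 = pd.DataFrame({
    "Name":  ["ID", "ID", "", "", "", "", ""],
    "Data1": [12, 34, 465, 678, 896, "", ""],
    "Data2": [8798, 67, 2313, "", "", "", ""],
    "Data3": [56, 67, "", "", "", "", ""],
})

# Empty strings -> NaN
cleaned = df2.replace("", np.nan)

# One boolean per row: True only where every column is NaN
empty_rows = cleaned.isna().all(axis=1)
print(empty_rows.tolist())
# [False, False, False, False, False, True, True]

# Keep the rows that are not entirely empty; copy before assigning a column
kept = cleaned[~empty_rows].copy()
kept["Name"] = kept["Name"].ffill()
print(kept)
```

Selecting with ~empty_rows keeps the same rows as dropna(how='all'), so the two answers should give the same result; the mask form is mainly useful when you also want to inspect or count the rows being removed.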