How to merge complex data frames in pandas/dataframe-js/etc

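I think what I want to do is fairly simple in pandas, but I just can't get it to work. I'd really like to do this in dataframe-js (or danfojs), but any help with pandas or dataframe-js would be appreciated.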

Essentially:

Example data frames:

    let data1 = [
      [['col A', 'uuid'], ['1238', '12']],
      [['col B', 'uuid'], ['42.4', '12']],
      [['col A', 'uuid'], ['1091', '48']],
      [['col B', 'uuid'], ['35.1', '48']],
      [['col B', 'uuid'], ['44.4', '77']],
    ]

Desired output (column order doesn't matter):

    [
      ['col A', 'uuid', 'col B'],
      ['1238', '12', '42.4'],
      ['1091', '48', '35.1'],
      [null, '77', '44.4'] // null, undefined, NaN... doesn't matter for the gaps
    ]

Please help :)

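    import numpy as np
    import pandas as pd

    # data1 is the question's list; dropna() guards against stack() versions that keep NaN
    df = (pd.DataFrame(map(lambda x: dict(zip(*x)), data1)).set_index('uuid').
          stack().dropna().unstack().reset_index())

    # header row plus data rows, reordered to match the desired output
    df2 = np.r_[df.columns.values[None,[1,0,2]],df.iloc[:,[1,0,2]].values].tolist()
    print(df2)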

    [['col A', 'uuid', 'col B'],
     ['1238', '12', '42.4'],
     ['1091', '48', '35.1'],
     [nan, '77', '44.4']]
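
For comparison, here is a self-contained sketch of the same idea that swaps the stack/unstack step for `groupby('uuid').first()`, which takes the first non-null value per column within each group. This is an alternative, not the answer's own method; the names `records`, `merged` and `table` are made up for illustration, and `data1` is simply the question's array rewritten as a Python list.

```python
import pandas as pd

# data1 from the question, written as a Python list
data1 = [
    [['col A', 'uuid'], ['1238', '12']],
    [['col B', 'uuid'], ['42.4', '12']],
    [['col A', 'uuid'], ['1091', '48']],
    [['col B', 'uuid'], ['35.1', '48']],
    [['col B', 'uuid'], ['44.4', '77']],
]

# One dict per fragment, e.g. {'col A': '1238', 'uuid': '12'}
records = [dict(zip(header, row)) for header, row in data1]

# first() keeps the first non-null value per column within each uuid group;
# sort=False keeps the uuids in the order they first appear
merged = pd.DataFrame(records).groupby('uuid', sort=False).first().reset_index()

# Back to the "table" format: header row followed by the data rows
table = [merged.columns.tolist()] + merged.values.tolist()
print(table)
```

The gap for uuid `77` comes out as a missing value (`None` or `NaN`, depending on the pandas version), which the question says is acceptable.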

OK, I combined @onyambu's answer with the merge function, so it now accepts data frames of different sizes.

    from functools import reduce

    import numpy as np
    import pandas as pd

    # create an initial empty df
    t = pd.DataFrame(columns=['uuid'])

    # reduce list of dataframes into one
    df = reduce(lambda x,y: x.merge(pd.DataFrame(y[1:], columns=y[0]), how='outer'), data1, t)

    # squash rows on `uuid` index with stack/unstack
    # (dropna() guards against pandas versions whose stack() keeps NaN)
    df = df.set_index('uuid').stack().dropna().unstack().reset_index()

    # output in original "table" format
    df2 = np.r_[df.columns.values[None],df.iloc[:].values].tolist()
    print(df2)
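
Since the question also asks about doing this outside pandas, the same merge-by-`uuid` logic can be sketched with plain dictionaries. This is not taken from either answer; it is a minimal illustration that assumes every fragment carries a `uuid` field, and the names `rows` and `columns` are made up here. The loop uses no library calls, so it should translate to JavaScript objects fairly directly.

```python
# data1 from the question, written as a Python list
data1 = [
    [['col A', 'uuid'], ['1238', '12']],
    [['col B', 'uuid'], ['42.4', '12']],
    [['col A', 'uuid'], ['1091', '48']],
    [['col B', 'uuid'], ['35.1', '48']],
    [['col B', 'uuid'], ['44.4', '77']],
]

rows = {}      # uuid -> merged record (dicts preserve insertion order)
columns = []   # column names in first-seen order

for header, values in data1:
    record = dict(zip(header, values))
    # remember each column name the first time it shows up
    for name in header:
        if name not in columns:
            columns.append(name)
    # fold this fragment into the record that shares its uuid
    rows.setdefault(record['uuid'], {}).update(record)

# header row, then one row per uuid; missing cells become None
table = [columns] + [[rec.get(c) for c in columns] for rec in rows.values()]
print(table)
# [['col A', 'uuid', 'col B'], ['1238', '12', '42.4'], ['1091', '48', '35.1'], [None, '77', '44.4']]
```

If two fragments supply the same column for the same `uuid`, the later one wins here; the sample data never does this, so that choice is arbitrary.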