Exploding rows with identical nested keys in pandas

Question

I have data in a column of a pandas DataFrame:

'Column name':    
    [{'Name': Dan Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Bob Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}], 
    
    [{'Name': Shelly Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Sam Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}], 
    
    {'Name': Jane Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6},
     
    [{'Name': Chris Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Darryl Smith, 'Attribute1': 4, 'Attribute2': 15, 'Attribute3': 6}],

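Each company is delimited by [], except when a company has only 1 observation (for example the 3rd observation, Jane Smith, in this sample). My problem is parsing the nested keys when the nested keys are identical. My goal is to get the highest-valued attribute for each company.

I tried: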

 df = df.explode('Column Name')

However, this did nothing; the observations were the same as before. After some research, I tried the following:

from ast import literal_eval
df['Column name'] = df['Column name'].apply(literal_eval)
df = df.explode('Column Name')

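However, when I do this, I get a `KeyError: 0` in return. I found that this error occurs because of instances like row 3, where the company has only 1 observation. I can explode a small sample of my data, get the highest attribute, and proceed as planned. However, I have 1.62 million rows, so splitting the data into small batches is not practical.

Is there a way to pass over the `KeyError: 0` exception? Or is there a better way to get where I want to go? I am new to Python/pandas.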

Answer 1

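from ast import literal_eval

def tolist(x):
    # Wrap a lone dict in a list so that every row holds a list
    if isinstance(x, dict):
        return [x]
    else:
        return x

# Parse each string into Python objects, then normalize lone dicts to lists
df['Column name'] = df['Column name'].apply(literal_eval).apply(tolist)
# Each list element now becomes its own row
df = df.explode('Column name')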

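Explanation

To use explode, every row must be a sequence type (a list in this case). The first thing you need to do is clean up all the rows that hold a single element and convert each one into a one-element list: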

    [{'Name': Jane Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}],
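
As a rough end-to-end sketch of the idea above, the following builds a small hypothetical DataFrame that already contains Python objects (so the `literal_eval` step is skipped), wraps lone dicts, explodes, and then looks up the largest attribute per original row. It assumes that one original row corresponds to one company and that "highest-valued attribute" means the attribute with the largest value across that company's people; the names `people`, `per_company`, and `summary` are illustrative only and do not come from the question.

```python
import pandas as pd

# Hypothetical sample: a cell is either a list of dicts or a single bare dict.
df = pd.DataFrame({
    "Column name": [
        [{"Name": "Dan Smith", "Attribute1": 4, "Attribute2": 10, "Attribute3": 6},
         {"Name": "Bob Smith", "Attribute1": 4, "Attribute2": 10, "Attribute3": 6}],
        {"Name": "Jane Smith", "Attribute1": 4, "Attribute2": 10, "Attribute3": 6},
        [{"Name": "Chris Smith", "Attribute1": 4, "Attribute2": 10, "Attribute3": 6},
         {"Name": "Darryl Smith", "Attribute1": 4, "Attribute2": 15, "Attribute3": 6}],
    ]
})


def tolist(x):
    # Wrap a lone dict in a list; leave lists untouched.
    return [x] if isinstance(x, dict) else x


# One dict per row; the index still carries the original row label.
exploded = df["Column name"].apply(tolist).explode()

# Expand each dict into columns, keeping the original row label as the index.
people = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Assumption: original row == company. Take the per-company maximum of each
# attribute, then pick the attribute whose maximum is largest.
per_company = people.filter(like="Attribute").groupby(level=0).max()
summary = pd.DataFrame({
    "top_attribute": per_company.idxmax(axis=1),
    "top_value": per_company.max(axis=1),
})
print(summary)
```

Because the exploded index repeats the original row label, grouping on `level=0` brings each company's people back together without needing a separate company identifier column.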
