Highlighting values based on groupby in pandas

Question

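I have a multi-indexed dataframe in pandas in which I want to highlight, within each "Id1" sub-frame, the values that are above the mean of each "Count" column. My actual dataframe is much larger, but here is a simplified example: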

Desired output

By iterating over a groupby and applying a style function to each sub-frame separately, I can get almost what I want.

import pandas as pd


def highlight_max(x):
    return ['background-color: yellow' if v > (x.mean()) else '' for v in x]

iterables = [["Land", "Ocean"], ["Liquid", "Ice"]]

index = pd.MultiIndex.from_product(iterables, names=["Id1", "Id2"])

df = pd.DataFrame({'Count A': [12., 70., 30., 20.], 'Count B': [12., 70., 30., 20.]}, index=index)

for id, id_frame in df.groupby('Id1'):
    id_frame = id_frame.style.apply(highlight_max, axis=0)
    id_frame.to_excel(id+'.xlsx')

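The problem is that I want the highlighting applied to the whole dataframe rather than to separate pieces. With my current code, the dataframe is split up: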

Frame 1

Frame 2

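I considered concatenating the sub-frames back together, but they are Styler objects, and as far as I know that is not possible. Is there a better solution to this problem?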

Answer 1

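Instead of calling your function once per group, call it on the whole dataframe. Inside the function, use groupby(level=0).transform('mean') to get each group's mean aligned with the original rows, then compare with col > means: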

def s(col):
    means = col.groupby(level=0).transform('mean')
    return (col > means).map({
        True: 'background-color: yellow',
        False: '',
    })

style = df.style.apply(s)
style

Output:
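To check which cells end up highlighted without rendering the Styler, here is a minimal sketch that rebuilds the sample dataframe from the question and runs the same function s through DataFrame.apply. The names group_means and css are introduced only for this illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Land", "Ocean"], ["Liquid", "Ice"]], names=["Id1", "Id2"]
)
df = pd.DataFrame(
    {"Count A": [12., 70., 30., 20.], "Count B": [12., 70., 30., 20.]},
    index=index,
)


def s(col):
    # Mean of each Id1 group, broadcast back to the original row positions.
    means = col.groupby(level=0).transform("mean")
    return (col > means).map({True: "background-color: yellow", False: ""})


# Per-group means aligned with df (illustrative name, not from the answer).
group_means = df.groupby(level="Id1").transform("mean")

# DataFrame.apply calls s once per column, as Styler.apply does with its
# default axis=0, so css holds the strings the Styler would attach to each cell.
css = df.apply(s)

print(group_means)
print(css)
```

With the sample data, the group means are 41 for Land and 25 for Ocean, so only the (Land, Ice) and (Ocean, Liquid) cells get the yellow background. Because the result is a single Styler, something like df.style.apply(s).to_excel('highlighted.xlsx') should write the whole frame to one file; the file name here is hypothetical, and Excel export assumes an engine such as openpyxl is installed.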

python

multi-index

pandas

pandas-groupby

pandas-styles