Pandas MultiIndex with to_html

Question

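I am using the pandas to_html function to generate an email report. While looking for a solution I got stuck at one point (screenshots attached): I want to combine cell values in the DataFrame for certain columns, like Excel's Merge & Center. If you have run into this before, could you please help? I want to achieve this with pandas only, without external libraries.

Contents of the DataFrame (columns in order: Category, Company, Sales, 2019, 2020, LTM, Q1 20, Q1 21, YoY change (%)):

The output we need:

I have tried using a MultiIndex (code below), but it does not fully do what I need. Any help would be appreciated.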

email_report = (
    df_category.set_index(
        [
            "category",
            "company",
            "previous_year",
            "current_year",
            "ltm_share",
            "previous_value",
            "current_value",
            "change",
        ]
    )
    .to_html()
)

Answer 1

I believe pandas merges cells for a MultiIndex out of the box, but not for data cells. You might get away with CSS tricks.
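
To illustrate the out-of-the-box behaviour mentioned above, here is a minimal sketch. The frame and its labels are made up for illustration. DataFrame.to_html writes a rowspan attribute for a repeated outer index label, while a value repeated in an ordinary data column is still written once per row.

```python
import pandas as pd

# Hypothetical toy frame: the outer label 'a' covers three rows.
index = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("a", 3), ("b", 1)])
toy = pd.DataFrame({"value": [10, 10, 10, 20]}, index=index)

html = toy.to_html()

# The outer index label is merged through a rowspan attribute ...
print('rowspan="3"' in html)
# ... but the repeated data value 10 still gets one <td> per row.
print(html.count("<td>10</td>"))
```

This is why moving every column into the index, as in the question, only partly helps: merging happens per index level, not per data column.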

I'm using a DataFrame with a 2-level index, random values, and your column names plus a no merge column:

     previous_year  no merge  current_year  ltm_share  previous_value  current_value    change
a 1       0.538438  0.197967      0.158720   0.031351        0.180214       0.888741  0.132500
  2       0.966025  0.363504      0.071190   0.503113        0.132445       0.883562  0.461739
  3       0.226929  0.913076      0.570731   0.521068        0.776050       0.996729  0.040835
b 1       0.327364  0.274166      0.789224   0.030502        0.508330       0.091049  0.497796
  2       0.041149  0.403038      0.924517   0.271489        0.692771       0.003774  0.391067
c 1       0.260083  0.873030      0.658576   0.983804        0.736934       0.970065  0.162908
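
The values above are random, so the exact numbers cannot be reproduced. A frame of the same shape could be built along these lines. The seed and the use of NumPy's default_rng are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

columns = ["previous_year", "no merge", "current_year", "ltm_share",
           "previous_value", "current_value", "change"]

# Unnamed 2-level index with the same group sizes as the sample (a: 3, b: 2, c: 1)
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("c", 1)]
)

rng = np.random.default_rng(0)  # assumed seed, only for repeatability
df = pd.DataFrame(rng.random((len(index), len(columns))),
                  index=index, columns=columns)
print(df)
```

Because the index levels are unnamed, df.reset_index() names them level_0 and level_1.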

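These are the working ideas for the solution:

- Use display: grid on the table.
- Use display: contents for all intermediate tags (thead, tbody, tr). This should be well supported by now.
- Hide the repeated cells with display: none.
- Let the first cell of each group cover the hidden ones with grid-row: span N.

The rowspan attribute is no longer honored once the table is laid out as a grid, so the index levels need to be treated like any other column.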

Here is the code:

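import pandas as pd

# Assumption: merge_cols is not defined in the original snippet. It is taken here
# to be the data columns whose cells should be merged (all except 'no merge').
merge_cols = [c for c in df.columns if c != 'no merge']

# True for every row that repeats the level-0 label of an earlier row
hide_cells = df.index.get_level_values(0).duplicated(keep='first')
# Number of rows per level-0 label, i.e. how many rows a merged cell must span
multiples = df.index.get_level_values(0).value_counts()

# reset_index() turns the unnamed index levels into the columns 'level_0' and 'level_1'.
# hide(axis='index') needs pandas >= 1.4; older versions used hide_index().
style = df.reset_index().style.hide(axis='index').set_table_styles([
    {'selector': '', 'props': [
        ('display', 'grid'),
        ('grid-template-columns', f'repeat({len(df.columns) + df.index.nlevels}, auto)'),
    ]},
    {'selector': 'thead, tbody, tr', 'props': [
        ('display', 'contents'),
    ]},
])
# Hide the repeated cells in the outer index column and in the merged columns
style = style.set_properties(subset=pd.IndexSlice[hide_cells, ['level_0', *merge_cols]], display='none')

# Let the first cell of every group span as many grid rows as the group has
for height in multiples.unique():
    if height <= 1:
        continue
    mask = ~hide_cells & df.index.get_level_values(0).isin(multiples.index[multiples.eq(height)])
    style = style.set_properties(subset=pd.IndexSlice[mask, ['level_0', *merge_cols]], **{'grid-row': f'span {height}'})

with open('out.html', 'w') as f:
    # Styler.to_html() replaces Styler.render(), which was removed in pandas 2.0
    f.write(style.to_html())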

Result:

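Below is my initial answer. It relies on older CSS features, but it only works when no merged columns are adjacent to each other. It uses:

- a fixed row height for the table
- an enlarged first cell for each group of multi-index rows (position: absolute to avoid changing the table rows, line-height to vertically align the text)
- visibility: hidden to hide the cells below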

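import pandas as pd

# Assumption: merge_cols is not defined in the original snippet. It is taken here
# to be the data columns whose cells should be merged (all except 'no merge').
merge_cols = [c for c in df.columns if c != 'no merge']

# True for every row that repeats the level-0 label of an earlier row
hide_cells = df.index.get_level_values(0).duplicated(keep='first')
# Number of rows per level-0 label
multiples = df.index.get_level_values(0).value_counts()

# Fixed row height so the enlarged cells line up with whole rows
style = df.style.set_properties(height='1.5em')

style = style.set_properties(subset=pd.IndexSlice[hide_cells, merge_cols], visibility='hidden')
for height in multiples.unique():
    if height <= 1:
        continue
    mask = ~hide_cells & df.index.get_level_values(0).isin(multiples.index[multiples.eq(height)])
    style = style.set_properties(subset=pd.IndexSlice[mask, merge_cols], **{'position': 'absolute', 'line-height': f'{height * 1.5}em'})

with open('out.html', 'w') as f:
    # Styler.to_html() replaces Styler.render(), which was removed in pandas 2.0
    f.write(style.to_html())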

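This works vertically, but it messes up the horizontal alignment when several merged columns are next to each other:

I suppose the workaround would be to manually specify fixed positions for the columns.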
