How to concatenate two pandas stylers that have the same columns?

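I have two DataFrames with different styles but the same columns.

Here is a minimal example with less data and simpler styling (in my real code I have some complicated highlighting instead of highlight_max):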

import pandas as pd

data = pd.DataFrame({'maturity': ['2022-03-11', '2022-04-21', '2022-04-20', '2022-03-11', '2022-04-21'],
              'position': [-1500000, 2.3, -50, 10, -9],
              'price': [12, 51, 62, 10, 90000]})

data_dict = {'pos_data': data[data.position > 0], 'neg_data': data[data.position < 0]}
styled_df = {}
first = True
for name, df in data_dict.items():
    color = 'green' if 'pos' in name else 'red'
    header = pd.DataFrame({'position': [name]}, columns=df.columns)
    styled_df[name] = (
        pd.concat([header, df]).reset_index(drop=True)
                               .style.highlight_max(pd.IndexSlice[1:, 'position'], color=color, axis=0)
                                     .format(precision=2, na_rep='')
                                     .set_table_styles([{'selector': 'tbody td', 'props': [
                                         ('border-style', 'solid'), ('border-width', 'thin'), ('border-color', 'gray'), 
                                         ('border-collapse', 'collapse !important')]}], overwrite=False)
                                     .set_table_styles([{'selector': 'th', 'props': [
                                         ('border-style', 'solid'), ('border-width', 'thin'), 
                                         ('border-collapse', 'collapse !important')]}], overwrite=False)
                                     .set_table_attributes(
                                         'style="border-width: thin; border-collapse :collapse !important;'
                                         ' border-color:black; border-style: solid !important"')
                                     .hide_index()
    )
    if first:
        first = False
    else:
        styled_df[name] = styled_df[name].hide_columns()

email_body = f"<html><body> {''.join(s.to_html() for s in styled_df.values())} </body></html>"
# Saving `email_body` to a local file and opening it gives:

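Current output:

How can I make the two resulting tables share the same column widths (as if they were one concatenated DataFrame)?

Expected output:

Edit: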

What I mean by

some complicated highlighting instead of highlight_max

is, in my case, this:

.style.apply(highlight_range, columns=cols_red, low=5,
             high=10, color='red', color_light='light_red', axis=1)
      .apply(highlight_range, columns=cols_blue, low=0,
             high=5,color='blue', color_light='light_blue', axis=1))

instead of the .style.highlight_max(pd.IndexSlice[1:, 'position'], color=color, axis=0) above,

where highlight_range is:

def highlight_range(row, columns, low: int, high: int, color: str, color_light: str):
    is_between = pd.Series(data=False, index=row.index)
    is_between[columns] = row.loc[columns].between(low, high, inclusive='left') # <= row <
    if not is_between.any():
        return [''] * len(is_between)
    return [f'background-color: {color_light}' if e else f'background-color: {color}' for e in is_between]

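The simplest way to make the two DataFrames display as a single concatenated DataFrame is to actually concatenate them and then build one Styler. The solution below uses pandas 1.4.2 (Styler behavior can differ significantly between versions).

We can first compute the styles to apply to the individual cells: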

for name in data_dict:
    style_str = f'background-color: {"green" if name == "pos_data" else "red"}'
    # Manually compute styles
    data_dict[name]['style'] = np.where(
        data_dict[name]['position'] == data_dict[name]['position'].max(),
        style_str,
        ''
    )
    # Add Header Row
    data_dict[name] = pd.concat([
        pd.DataFrame({'position': [name]}, columns=data_dict[name].columns),
        data_dict[name]
    ])

Here we use np.where to determine whether each value matches the maximum by comparing the individual values to the column max. Wherever the condition is true we fill in style_str; everywhere else gets an empty string.

The resulting DataFrames look like this:
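
As a minimal sketch of that np.where step in isolation (the three values below are hypothetical toy data, not the frames from this answer):

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, used only to illustrate the comparison
position = pd.Series([2.3, 10.0, 4.0])

# Elementwise choice: the style string where the value equals the max,
# an empty string everywhere else
styles = np.where(position == position.max(), 'background-color: green', '')

print(styles.tolist())
# ['', 'background-color: green', '']
```

If several rows tie for the maximum, each of them gets the style string, since the comparison is done per element.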

# pos_data
     maturity  position price                    style
0         NaN  pos_data   NaN                      NaN
1  2022-04-21       2.3    51                         
3  2022-03-11      10.0    10  background-color: green

# neg_data
     maturity   position  price                  style
0         NaN   neg_data    NaN                    NaN
0  2022-03-11 -1500000.0     12                       
2  2022-04-20      -50.0     62                       
4  2022-04-21       -9.0  90000  background-color: red

Note that each style string lines up with the value in the same row.


Now we can concat the DataFrames:

df = pd.concat(data_dict, ignore_index=True)

     maturity   position  price                    style
0         NaN   pos_data    NaN                      NaN
1  2022-04-21        2.3     51                         
2  2022-03-11       10.0     10  background-color: green
3         NaN   neg_data    NaN                      NaN
4  2022-03-11 -1500000.0     12                         
5  2022-04-20      -50.0     62                         
6  2022-04-21       -9.0  90000    background-color: red
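
Passing a dict to pd.concat concatenates its values in insertion order. A small sketch of what ignore_index=True changes, using hypothetical toy frames named `a` and `b`:

```python
import pandas as pd

# Hypothetical toy frames with overlapping index labels
frames = {
    'a': pd.DataFrame({'x': [1, 2]}, index=[0, 3]),
    'b': pd.DataFrame({'x': [3]}, index=[0]),
}

# Without ignore_index, the dict keys become the outer level of a MultiIndex
with_keys = pd.concat(frames)

# With ignore_index=True, the result gets a fresh 0..n-1 index
flat = pd.concat(frames, ignore_index=True)

print(list(with_keys.index))  # [('a', 0), ('a', 3), ('b', 0)]
print(list(flat.index))       # [0, 1, 2]
```

A unique index matters here because Styler refuses to apply styles to a frame whose index or columns contain duplicates.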

Now it is easy to use df's style column to style the position column with apply:

df.style.apply(lambda _: df['style'], subset='position')


The overall styling looks like this:

styler = df.style
# Hide the "style" column
styler.hide(axis='columns', subset='style')
# styler.hide_columns(subset='style') pre pandas 1.4.0

# Hide the index
styler.hide(axis='index')
# styler.hide_index() pre pandas 1.4.0

# Apply the values from the style column as the styles
styler.apply(lambda _: df['style'], subset='position')

# Format decimal places and replace NaN
styler.format(precision=2, na_rep='')

# Table styles
styler.set_table_styles([
    {
        'selector': 'tbody td',
        'props': [
            ('border-style', 'solid'), ('border-width', 'thin'),
            ('border-color', 'gray'), ('border-collapse', 'collapse !important')
        ]
    },
    {
        'selector': 'th',
        'props': [
            ('border-style', 'solid'), ('border-width', 'thin'),
            ('border-collapse', 'collapse !important')
        ]
    }
])
# Table attributes
styler.set_table_attributes(
    'style="border-width: thin; border-collapse :collapse !important;'
    ' border-color:black; border-style: solid !important"'
)


As of pandas 1.3.0, we can use Styler.to_html to get the complete HTML from the Styler object:

email_body = styler.to_html(doctype_html=True)

This generates the following HTML/CSS:

<!DOCTYPE html>
<html>

<head>
  <meta charset="utf-8">
  <style type="text/css">
    #T_7da9e tbody td {
      border-style: solid;
      border-width: thin;
      border-color: gray;
      border-collapse: collapse !important;
    }
    
    #T_7da9e th {
      border-style: solid;
      border-width: thin;
      border-collapse: collapse !important;
    }
    
    #T_7da9e_row2_col1 {
      background-color: green;
    }
    
    #T_7da9e_row6_col1 {
      background-color: red;
    }
  </style>
</head>

<body>
  <table id="T_7da9e" style="border-width: thin; border-collapse :collapse !important; border-color:black; border-style: solid !important">
    <thead>
      <tr>
        <th id="T_7da9e_level0_col0" class="col_heading level0 col0">maturity</th>
        <th id="T_7da9e_level0_col1" class="col_heading level0 col1">position</th>
        <th id="T_7da9e_level0_col2" class="col_heading level0 col2">price</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td id="T_7da9e_row0_col0" class="data row0 col0"></td>
        <td id="T_7da9e_row0_col1" class="data row0 col1">pos_data</td>
        <td id="T_7da9e_row0_col2" class="data row0 col2"></td>
      </tr>
      <tr>
        <td id="T_7da9e_row1_col0" class="data row1 col0">2022-04-21</td>
        <td id="T_7da9e_row1_col1" class="data row1 col1">2.30</td>
        <td id="T_7da9e_row1_col2" class="data row1 col2">51</td>
      </tr>
      <tr>
        <td id="T_7da9e_row2_col0" class="data row2 col0">2022-03-11</td>
        <td id="T_7da9e_row2_col1" class="data row2 col1">10.00</td>
        <td id="T_7da9e_row2_col2" class="data row2 col2">10</td>
      </tr>
      <tr>
        <td id="T_7da9e_row3_col0" class="data row3 col0"></td>
        <td id="T_7da9e_row3_col1" class="data row3 col1">neg_data</td>
        <td id="T_7da9e_row3_col2" class="data row3 col2"></td>
      </tr>
      <tr>
        <td id="T_7da9e_row4_col0" class="data row4 col0">2022-03-11</td>
        <td id="T_7da9e_row4_col1" class="data row4 col1">-1500000.00</td>
        <td id="T_7da9e_row4_col2" class="data row4 col2">12</td>
      </tr>
      <tr>
        <td id="T_7da9e_row5_col0" class="data row5 col0">2022-04-20</td>
        <td id="T_7da9e_row5_col1" class="data row5 col1">-50.00</td>
        <td id="T_7da9e_row5_col2" class="data row5 col2">62</td>
      </tr>
      <tr>
        <td id="T_7da9e_row6_col0" class="data row6 col0">2022-04-21</td>
        <td id="T_7da9e_row6_col1" class="data row6 col1">-9.00</td>
        <td id="T_7da9e_row6_col2" class="data row6 col2">90000</td>
      </tr>
    </tbody>
  </table>
</body>

</html>


Setup and imports used:

import numpy as np
import pandas as pd

data = pd.DataFrame({
    'maturity': ['2022-03-11', '2022-04-21', '2022-04-20', '2022-03-11',
                 '2022-04-21'],
    'position': [-1500000, 2.3, -50, 10, -9],
    'price': [12, 51, 62, 10, 90000]
})

data_dict = {
    # Adding copy to prevent a later SettingWithCopyWarning
    'pos_data': data[data.position > 0].copy(),
    'neg_data': data[data.position < 0].copy()
}