How to highlight rows based on content in Excel Dataframe?

I have an Excel file containing data like that shown on the left, and I am trying to format it to get the table-formatted data shown on the right.

With my current code, I am able to format all rows that contain the headers (H1, H2, ...).

This is the content of file.xlsx:

This is my current code:

import pandas as pd
import numpy as np

from xlsxwriter.utility import xl_rowcol_to_cell

data = {'H1': {0: 'A', 1: '', 2: 'H1', 3: 'A', 4: '', 5: 'H1', 6: 'A', 7: 'A', 8: 'B', 9: 'B', 10: 'B', 11: '', 12: 'H1', 13: 'B', 14: 'B', 15: '', 16: 'H1', 17: 'C', 18: 'C', 19: 'C', 20: 'D', 21: 'D', 22: ''}, 'H2': {0: 'Rty', 1: '', 2: 'H2', 3: 'Rty', 4: '', 5: 'H2', 6: 'Rty', 7: 'Rty', 8: 'Rty', 9: 'Rty', 10: 'Rty', 11: '', 12: 'H2', 13: 'Rty', 14: 'Rty', 15: '', 16: 'H2', 17: 'Rty', 18: 'Rty', 19: 'Rty', 20: 'Rty', 21: 'Rty', 22: ''}, 'H3': {0: '1195', 1: '', 2: 'H3', 3: '1195', 4: '', 5: 'H3', 6: '1195', 7: '1195', 8: '1195', 9: '1195', 10: '1195', 11: '', 12: 'H3', 13: '1195', 14: '1195', 15: '', 16: 'H3', 17: '1195', 18: '1195', 19: '1195', 20: '1195', 21: '1195', 22: ''}, 'H4': {0: '9038', 1: 'H3=9038, 000', 2: 'H4', 3: '1355', 4: 'H3=1355, 363', 5: 'H4', 6: '2022', 7: '2022', 8: '2022', 9: '2022', 10: '2022', 11: 'H3=2022, 234', 12: 'H4', 13: '2564', 14: '2564', 15: 'H3=2564, 726', 16: 'H4', 17: '1501', 18: '1501', 19: '1501', 20: '1501', 21: '1501', 22: 'H3=1501, 143'}, 'H5': {0: '1537', 1: '', 2: 'H5', 3: '8', 4: '', 5: 'H5', 6: '59', 7: '78', 8: '76', 9: '6', 10: '31', 11: '', 12: 'H5', 13: '71', 14: '17', 15: '', 16: 'H5', 17: '72', 18: '89', 19: '47', 20: '32', 21: '233', 22: ''}}
df = pd.DataFrame.from_dict(data)


writer = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name='Output')

workbook = writer.book
worksheet = writer.sheets['Output']
number_rows = len(df.index)

format1 = workbook.add_format({'bg_color': 'black', 'font_color': 'yellow'})

for r in range(0,number_rows):
    if df.iat[r,0] == "H1":
        worksheet.set_row(r+1, None, format1) 

writer.close()  # writer.save() was deprecated and then removed in pandas 2.0

This is my current output:

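I am stuck on how to restrict the formatting to columns A through E, and on how to alternate the highlight color (green, yellow, green, yellow, ...) whenever the value in column A changes. That is, all consecutive rows where column A = "A" should be highlighted in green; when the value changes, the highlight should switch to yellow; when it changes again, back to green, and so on.

How can I do this? Thanks in advance.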

You can use a different Excel library, such as openpyxl.

You can set the format of each cell individually, for example:

from openpyxl import Workbook
from openpyxl.styles import Font, Color, colors, fills
from openpyxl.utils.dataframe import dataframe_to_rows
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)

a1 = ws['A1']
a1.font = Font(color="FF0000")
a1.fill = fills.PatternFill(patternType='solid', fgColor=Color(rgb='00FF00'))
wb.save("pandas_openpyxl.xlsx")

They have great documentation here: https://openpyxl.readthedocs.io/en/stable/pandas.html
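To tie the per-cell approach back to the question, here is a minimal sketch of alternating fills with openpyxl; treat it as an illustration rather than a drop-in solution. It assumes that rows whose first column equals "H1" are repeated header rows, that rows with an empty first column are summary rows that stay unformatted, and that the color flips only when the data value in the first column changes. The small DataFrame, the 8-digit ARGB color codes, and the output file name are stand-ins chosen for the example.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
from openpyxl.utils.dataframe import dataframe_to_rows

# Small stand-in for the question's DataFrame (illustrative values only).
df = pd.DataFrame({
    "H1": ["A", "", "H1", "A", "B", "B", "", "H1", "C"],
    "H2": ["Rty", "", "H2", "Rty", "Rty", "Rty", "", "H2", "Rty"],
    "H3": ["1195", "", "H3", "1195", "1195", "1195", "", "H3", "1195"],
})

# Classify rows: repeated headers, data rows, and (implicitly) blank rows.
is_header = df["H1"].eq("H1")
is_data = ~is_header & df["H1"].ne("")

# Number each run of equal values among the data rows: 1, 1, 2, 2, 3, ...
data_vals = df.loc[is_data, "H1"]
run_id = data_vals.ne(data_vals.shift()).cumsum()
band_of_row = ((run_id - 1) % 2).to_dict()  # {row label: 0 or 1}

wb = Workbook()
ws = wb.active
for row in dataframe_to_rows(df, index=False, header=True):
    ws.append(row)

header_fill = PatternFill(patternType="solid", fgColor="FF000000")  # black
header_font = Font(color="FFFFFF00")                                # yellow text
band_fills = [
    PatternFill(patternType="solid", fgColor="FF00FF00"),  # green
    PatternFill(patternType="solid", fgColor="FFFFFF00"),  # yellow
]

n_cols = len(df.columns)
# Assumes the default RangeIndex, so the label equals the row position.
for label in df.index:
    excel_row = label + 2  # +1 for the sheet header row, +1 for 1-based rows
    if is_header[label]:
        fill, font = header_fill, header_font
    elif is_data[label]:
        fill, font = band_fills[band_of_row[label]], None
    else:
        continue  # blank/summary rows are left unformatted (assumption)
    # Style only the DataFrame's own columns, not the whole row.
    for col in range(1, n_cols + 1):
        cell = ws.cell(row=excel_row, column=col)
        cell.fill = fill
        if font is not None:
            cell.font = font

wb.save("pandas_openpyxl_banded.xlsx")
```

Because the styles are applied cell by cell, the highlight stops at the last DataFrame column instead of running across the whole sheet row.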

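If you want to keep using the xlsxwriter Python module, you can use the write(...) method on the worksheet object to set a cell's contents and format in one step.

You will have to break down the to_excel() method and write each DataFrame value individually in a loop.

Example cell creation and formatting call: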

cell_format = workbook.add_format({'bold': True, 'italic': True})

# inside a loop iterating over your DataFrame
worksheet.write(row, column, value, cell_format)  # Cell is bold and italic.
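Putting that loop together, a hedged sketch of the whole flow with xlsxwriter could look like the following. The helper band_colors is a hypothetical name introduced here, not part of pandas or xlsxwriter. The sketch assumes rows whose first column equals "H1" are repeated headers, rows with an empty first column stay unformatted, and the color alternates only when the data value changes. It keeps to_excel() for the initial dump and then rewrites the cells that need a format, which relies on xlsxwriter's default (non constant_memory) mode, where a later write to the same cell replaces the earlier one.

```python
import pandas as pd


def band_colors(values, header="H1"):
    """Classify each row as 'header', 'green', 'yellow', or None (blank)."""
    palette = ("green", "yellow")
    kinds = []
    last_value = None
    run_index = -1
    for value in values:
        if value == header:
            kinds.append("header")
        elif value == "":
            kinds.append(None)
        else:
            # Start a new color band only when the data value changes.
            if value != last_value:
                run_index += 1
                last_value = value
            kinds.append(palette[run_index % 2])
    return kinds


# Small stand-in for the question's DataFrame (illustrative values only).
df = pd.DataFrame({
    "H1": ["A", "", "H1", "A", "B", "B", "", "H1", "C"],
    "H2": ["Rty", "", "H2", "Rty", "Rty", "Rty", "", "H2", "Rty"],
    "H3": ["1195", "", "H3", "1195", "1195", "1195", "", "H3", "1195"],
})

with pd.ExcelWriter("Output_banded.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(writer, index=False, sheet_name="Output")
    workbook = writer.book
    worksheet = writer.sheets["Output"]

    formats = {
        "header": workbook.add_format({"bg_color": "black", "font_color": "yellow"}),
        "green": workbook.add_format({"bg_color": "green"}),
        "yellow": workbook.add_format({"bg_color": "yellow"}),
    }

    for r, kind in enumerate(band_colors(df["H1"])):
        if kind is None:
            continue  # leave blank/summary rows as written by to_excel()
        # Rewrite only the DataFrame's columns so the format does not
        # spill past the last column the way set_row() does.
        for c in range(df.shape[1]):
            worksheet.write(r + 1, c, df.iat[r, c], formats[kind])
```

Writing per cell instead of calling set_row() is what keeps the highlight inside the data columns; with the five columns from the question that corresponds to A through E.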