Python - Function that adjusts columns and colors cells in multiple spreadsheets

Hello everyone, I hope you are all doing well.

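I am currently working on a project that handles a large amount of data. I am building many pandas DataFrames from that data and trying to compile them all into one Excel file, with each DataFrame on its own sheet. What I want is a function that automatically adds each sheet to the Excel file, widens the columns in each sheet, and colors the cells in each sheet accordingly.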

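For example...

Sheet14 looks something like the attached image...

Every sheet looks like this. The number of rows can differ, but the number of columns is always the same.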

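What I want to do is color the cells of Col1 by the length of their value: green for a length of 1, yellow for a length of 3, purple for a length of 5, and so on.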

How can I do this? I can do it easily for a single sheet, but automating it is tedious, and the multiple-sheet part is hard for me because I have never had to deal with it.

FYI, cycled_data_aggregate looks like [DataFrame, 'A', 'A'].

It is a list containing [DataFrame, str, str].

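Any help is greatly appreciated! I hope I explained this well enough. If not, a general explanation would still help, since the code I wrote may be a bit odd, haha! :)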

import pandas as pd
import openpyxl
from openpyxl.styles import Color, PatternFill, Font, Border, Side
import xlsxwriter
from xlsxwriter.utility import xl_rowcol_to_cell

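out_path = r"C:\....\....xlsx"  # raw string so the backslashes are not read as escapes
writer1 = pd.ExcelWriter(out_path, engine="xlsxwriter")  # set_column() below is an XlsxWriter method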

def MultipleSheetAdder(cycled_data_aggregate, overwrite_sheet_name, true_false):
    #  If the function for cycled_data_aggregate returns None...
    if cycled_data_aggregate is None:
        return None

    #  The sheet's data
    cycled_data = cycled_data_aggregate[0]
    
    #  If you want to overwrite what the sheet name is called and not use the
    #  cycled_data_aggregate's returned data

    if true_false:
        sheet_name = overwrite_sheet_name
    else:
        sheet_name = cycled_data_aggregate[1]


    cycled_data.to_excel(writer1, sheet_name=sheet_name)
    for column in cycled_data:
        column_length = max(cycled_data[column].astype(str).map(len).max(), len(column)) + 3
        # +1 because to_excel() writes the index into the first worksheet column
        col_idx = cycled_data.columns.get_loc(column) + 1
        writer1.sheets[sheet_name].set_column(col_idx, col_idx, column_length)

    #  Add section here to change colors of specific rows in the first two columns depending on what
    #  values they are.
    # {INSERT CODE HERE}

    return None  # Does this function need to even return anything? 

MultipleSheetAdder(Function(raw_data), '', False)

writer1.close()  # ExcelWriter.save() was removed in pandas 2.0

One way to add the colors is with conditional formatting. Here is an example based on your data:

import pandas as pd


# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Col1': ['1.2.4', '2.2', '1.2.2', '2', '1.7.4'],
                   'Col2': [200, 100, 130, 140, 300],
                   'Col3': ['Text 1', 'Text 2', 'Text 3', 'Text 4', 'Text 5']})

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1', index=False)

# Get the xlsxwriter workbook and worksheet objects.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

format1 = workbook.add_format({'bg_color': 'green'})
format2 = workbook.add_format({'bg_color': 'yellow'})
format3 = workbook.add_format({'bg_color': 'purple'})


# Apply a conditional format to the cell range.
max_row = df.shape[0]
worksheet.conditional_format(1, 0, max_row, 0, {'type':     'formula',
                                                'criteria': '=LEN($A2)=1',
                                                'format':   format1})
worksheet.conditional_format(1, 0, max_row, 0, {'type':     'formula',
                                                'criteria': '=LEN($A2)=3',
                                                'format':   format2})
worksheet.conditional_format(1, 0, max_row, 0, {'type':     'formula',
                                                'criteria': '=LEN($A2)=5',
                                                'format':   format3})

# Close the Pandas Excel writer and output the Excel file.
writer.close()

Output: [screenshot of the resulting spreadsheet]
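
The example above formats a single sheet. Since the question asks about many sheets, here is a minimal sketch of how the same steps might be wrapped in a function and applied to one DataFrame per sheet. The names `LENGTH_COLORS`, `column_widths`, `length_criteria`, `add_sheet` and `build_workbook` are hypothetical, not part of pandas or XlsxWriter. The sketch assumes each DataFrame is written with `index=False`, so that `Col1` lands in worksheet column A, and that XlsxWriter is installed.

```python
import pandas as pd

# Hypothetical mapping of Col1 string length to fill colour, using the
# colours named in the question.
LENGTH_COLORS = {1: "green", 3: "yellow", 5: "purple"}


def column_widths(df, padding=3):
    """Return one width per column: longest cell or header, plus padding."""
    widths = []
    for column in df.columns:
        # An empty frame has no cells to measure, so fall back to 0.
        longest_cell = int(df[column].astype(str).map(len).max()) if len(df) else 0
        widths.append(max(longest_cell, len(str(column))) + padding)
    return widths


def length_criteria(length, column_letter="A", first_data_row=2):
    """Build the Excel formula that is true when a cell's text has `length` characters."""
    return f"=LEN(${column_letter}{first_data_row})={length}"


def add_sheet(writer, df, sheet_name, length_colors=LENGTH_COLORS):
    """Write df to its own sheet, widen its columns and colour column A by length."""
    df.to_excel(writer, sheet_name=sheet_name, index=False)
    worksheet = writer.sheets[sheet_name]

    for col_idx, width in enumerate(column_widths(df)):
        worksheet.set_column(col_idx, col_idx, width)

    last_row = len(df)  # row 0 is the header, data occupies rows 1..len(df)
    if last_row == 0:
        return  # nothing to colour on an empty sheet
    for length, color in length_colors.items():
        cell_format = writer.book.add_format({"bg_color": color})
        worksheet.conditional_format(
            1, 0, last_row, 0,
            {"type": "formula", "criteria": length_criteria(length), "format": cell_format},
        )


def build_workbook(frames, path):
    """Write every DataFrame in a {sheet_name: DataFrame} dict to one Excel file."""
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        for sheet_name, df in frames.items():
            add_sheet(writer, df, sheet_name)
```

With two hypothetical frames `df14` and `df15`, calling `build_workbook({'Sheet14': df14, 'Sheet15': df15}, 'out.xlsx')` would write both sheets. `add_sheet` returns nothing because it only modifies the writer it is given, and the `with` block closes the file once every sheet has been added.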