XlsxWriter charts: odd behaviour with sheet names consisting of multiple words and numbers written as text into Excel

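I stumbled across some rather odd behaviour and have not yet been able to come up with an explanation for it. Maybe someone here has one.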

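The code below produces the output shown in the image. At first glance "Chart OK" is the one I am looking for, but a closer look (in Excel) shows that the data is not referenced correctly (because the values are treated as "General" rather than "Number" format). Nevertheless, Excel still manages to draw the chart more or less correctly.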

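The actually correct approach would be the chart next to it, "Missing chart", but here Excel cannot create the chart. The only difference in the code is the ' characters around the sheet name in the "values" property passed to add_series in create_chart_missing_chart.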

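Apparently the workaround for this behaviour is to store the actual numbers in a numeric format in the DataFrame. (I guess this is best done before the data is written to Excel.)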
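A minimal sketch of that workaround, assuming every entry in the column parses as a number (the column name `values` is illustrative, not taken from the code below):

```python
import pandas as pd

# Illustrative frame; the column name "values" is an assumption for this sketch.
df = pd.DataFrame({
    "text": ["some value1", "some value2", "sum of values"],
    "values": ["2", "4", "6"],
})

# Convert the string column to a numeric dtype before writing to Excel.
# errors="raise" (the default) fails loudly if an entry is not a valid number.
df["values"] = pd.to_numeric(df["values"], errors="raise")

print(df.dtypes)
```

After the conversion, to_excel should write the cells as numbers rather than text.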

I am using:

  • MS Office Professional Plus 2019
  • Python 3.8
  • Pandas 1.2.2

Resulting charts

import pandas as pd
from xlsxwriter.utility import xl_rowcol_to_cell


def main():
    df = pd.DataFrame({"text": ["some value1", "some value2", "sum of values"], "valeus": ["2", "4", "6"]})
    with pd.ExcelWriter("test.xlsx") as writer:
        create_worksheet(df, writer, "Some name with space")
    return 0


def create_worksheet(df: pd.DataFrame, writer, sheet: str):
    df.to_excel(writer, sheet, index=False, startrow=1, startcol=1, header=False)

    wb = writer.book
    ws = writer.sheets[sheet]
    cell_format = wb.add_format({"bold": True, "border": True, "border_color": "white"})
    cell_grid = wb.add_format({"border": True, "border_color": "white"})
    ws.set_column("A:AA", 10, cell_grid)
    ws.set_column("B:B", 25, cell_format)

    chart_OK = create_chart_ok(wb, df, sheet)
    chart_missing_values = create_chart_missing_values(wb, df, sheet)
    chart_missing_chart = create_chart_missing_chart(wb, df, sheet)
    chart_missing_x = create_chart_missing_x(wb, df, sheet)

    # This creates a visible chart even if the values to display are stored in excel as string, 
    # the only odd thing is the missing cross reference between the chart and the data
    ws.insert_chart("B6", chart_OK)
    
    # This creates a chart, where instead of the values from B2 and B3 only "1" and "2" are shown in the legend
    ws.insert_chart("B22", chart_missing_values)

    # This would actually create a valid chart cross-referencing correctly to the data,
    # if the values for the chart in C2 and C3 were handled by Excel as numbers instead of "general"
    ws.insert_chart("K6", chart_missing_chart)

    # Here we have the missing chart, due to the values in C2 and C3 being seen as text
    # additionally there is also missing values for the legend
    ws.insert_chart("K22", chart_missing_x)

    return 0


def create_chart_ok(wb, df: pd.DataFrame, sheet: str):
    chart = wb.add_chart({'type': 'pie'})
    chart.set_title({"name": "Chart OK", "name_font": {"bold": False, "size": 14}})
    chart.add_series({
        # Mind the missing ' characters around the sheet name in values
        "values": "=" + str(sheet) + "!$C:" +
                  xl_rowcol_to_cell(len(df.index) - 1, 2, row_abs=True, col_abs=True),
        # Mind the ' characters around the sheet name in categories
        "categories": "='" + str(sheet) + "'!$B:" +
                      xl_rowcol_to_cell(len(df.index) - 1, 1, row_abs=True, col_abs=True),
        'points': [
            {'fill': {'color': "9bbb59"}},
            {'fill': {'color': '4f81bd'}},
        ],
    })

    return chart


def create_chart_missing_values(wb, df: pd.DataFrame, sheet: str):
    chart = wb.add_chart({'type': 'pie'})
    chart.set_title({"name": "Missing Values in legend", "name_font": {"bold": False, "size": 14}})
    chart.add_series({
        # Mind the missing ' characters around the sheet name in values
        "values": "=" + str(sheet) + "!$C:" +
                  xl_rowcol_to_cell(len(df.index) - 1, 2, row_abs=True, col_abs=True),
        # Mind the missing ' characters around the sheet name in categories
        "categories": "=" + str(sheet) + "!$B:" +
                      xl_rowcol_to_cell(len(df.index) - 1, 1, row_abs=True, col_abs=True),
        'points': [
            {'fill': {'color': "9bbb59"}},
            {'fill': {'color': '4f81bd'}},
        ],
    })

    return chart


def create_chart_missing_chart(wb, df: pd.DataFrame, sheet: str):
    chart = wb.add_chart({'type': 'pie'})
    chart.set_title({"name": "Missing chart", "name_font": {"bold": False, "size": 14}})
    chart.add_series({
        # Mind the ' characters around the sheet name in values
        "values": "='" + str(sheet) + "'!$C:" +
                  xl_rowcol_to_cell(len(df.index) - 1, 2, row_abs=True, col_abs=True),
        # Mind the ' characters around the sheet name in categories
        "categories": "='" + str(sheet) + "'!$B:" +
                      xl_rowcol_to_cell(len(df.index) - 1, 1, row_abs=True, col_abs=True),
        'points': [
            {'fill': {'color': "9bbb59"}},
            {'fill': {'color': '4f81bd'}},
        ],
    })

    return chart


def create_chart_missing_x(wb, df: pd.DataFrame, sheet: str):
    chart = wb.add_chart({'type': 'pie'})
    chart.set_title({"name": "Missing whatever", "name_font": {"bold": False, "size": 14}})
    chart.add_series({
        # Mind the ' characters around the sheet name in values
        "values": "='" + str(sheet) + "'!$C:" +
                  xl_rowcol_to_cell(len(df.index) - 1, 2, row_abs=True, col_abs=True),
        # Mind the missing ' characters around the sheet name in categories
        "categories": "=" + str(sheet) + "!$B:" +
                      xl_rowcol_to_cell(len(df.index) - 1, 1, row_abs=True, col_abs=True),
        'points': [
            {'fill': {'color': "9bbb59"}},
            {'fill': {'color': '4f81bd'}},
        ],
    })

    return chart


if __name__ == "__main__":
    main()


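The odd behaviour is really just Excel's default behaviour when you try to create charts the way you are creating them. If you replicated the same charts manually in Excel, you would get the same results. XlsxWriter is just creating the charts that you tell it to create.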

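Some of the issues in the example are:

  1. Excel requires that worksheet names containing spaces are single quoted. XlsxWriter warns about this.
  2. Chart "values" cannot be strings. They must be numbers.
  3. There is also an issue with the colour names (the hex colours lack a leading #), which XlsxWriter warns about.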

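To fix the first of these issues you can either single quote the worksheet name manually or, better, let XlsxWriter do it by using the add_series() list syntax. The second issue can be fixed by converting the string values to numbers or by using the XlsxWriter strings_to_numbers option.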
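To make the quoting rule concrete, here is a rough pure-Python sketch of the kind of range string that the list syntax saves you from building by hand. The helper names (`quote_sheet_name`, `col_letter`, `abs_range`) are hypothetical and not part of XlsxWriter, and the "quote anything that is not a plain identifier" rule is a simplifying assumption rather than Excel's exact rule:

```python
import re


def quote_sheet_name(name: str) -> str:
    """Single quote a sheet name unless it is a plain identifier.

    Simplifying assumption: names made only of letters, digits and
    underscores are left alone; everything else is quoted.
    """
    if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        return name
    # Embedded single quotes are doubled inside a quoted sheet name.
    return "'" + name.replace("'", "''") + "'"


def col_letter(col: int) -> str:
    """Convert a zero-indexed column number to Excel letters (0 -> A)."""
    letters = ""
    col += 1
    while col:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def abs_range(sheet: str, first_row: int, first_col: int,
              last_row: int, last_col: int) -> str:
    """Build an absolute range formula from zero-indexed rows and columns."""
    start = f"${col_letter(first_col)}${first_row + 1}"
    end = f"${col_letter(last_col)}${last_row + 1}"
    return f"={quote_sheet_name(sheet)}!{start}:{end}"


print(abs_range("Some name with space", 1, 2, 2, 2))
```

Note that both ends of the range are full cell references. A hand-built string is easy to get subtly wrong, which is one reason to prefer the list syntax.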

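Changing the example to use the list syntax and the strings_to_numbers option, and fixing the colour names, gives something like this: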

import pandas as pd


def main():
    df = pd.DataFrame({'text': ['some value1', 'some value2', 'sum of values'],
                       'values': ['2', '4', '6']})

    options = {'options': {'strings_to_numbers': True}}

    with pd.ExcelWriter('test.xlsx', engine='xlsxwriter',
                        engine_kwargs=options) as writer:
        create_worksheet(df, writer, 'Some name with space')

    return 0


def create_worksheet(df: pd.DataFrame, writer, sheet: str):
    df.to_excel(writer, sheet_name=sheet, index=False, startrow=1, startcol=1,
                header=False)

    wb = writer.book
    ws = writer.sheets[sheet]
    cell_format = wb.add_format({'bold': True, 'border': True,
                                 'border_color': 'white'})
    cell_grid = wb.add_format({'border': True,
                               'border_color': 'white'})
    ws.set_column('A:AA', 10, cell_grid)
    ws.set_column('B:B', 25, cell_format)

    chart_OK = create_chart_ok(wb, df, sheet)

    ws.insert_chart('B6', chart_OK)

    return 0


def create_chart_ok(wb, df: pd.DataFrame, sheet: str):
    chart = wb.add_chart({'type': 'pie'})
    chart.set_title({'name': 'Chart OK',
                     'name_font': {'bold': False, 'size': 14}})
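    # Zero-indexed rows: the data starts at row 1 (startrow=1) and the
    # final "sum of values" row is left out of the pie chart.
    first_row = 1
    last_row = first_row + df.shape[0] - 2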

    chart.add_series({
        'categories': [sheet, first_row, 1, last_row, 1],
        'values': [sheet, first_row, 2, last_row, 2],

        'points': [
            {'fill': {'color': '#9bbb59'}},
            {'fill': {'color': '#4f81bd'}},
        ],
    })

    return chart


if __name__ == '__main__':
    main()

Output:

Finally, a better way to hide the worksheet gridlines is to use the worksheet.hide_gridlines() method.
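A minimal sketch of that, using XlsxWriter directly rather than through pandas. The output path and cell contents are made up for illustration. In the pandas example above, the same call could be made on the `ws` object instead of applying white borders with set_column():

```python
import tempfile
from pathlib import Path

import xlsxwriter

# Hypothetical output location for this sketch.
out_path = Path(tempfile.gettempdir()) / "gridlines_demo.xlsx"

workbook = xlsxwriter.Workbook(str(out_path))
worksheet = workbook.add_worksheet("Some name with space")

# Option 2 hides both the on-screen and the printed gridlines.
worksheet.hide_gridlines(2)

worksheet.write_row(0, 0, ["some value1", 2])
worksheet.write_row(1, 0, ["some value2", 4])

workbook.close()
```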