How to get each char format from XLSX cell with xlrd?
I have an Excel file formatted as follows:
I read this cell with Python:
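from xlrd import open_workbook

# xlsx_path, sheet_name, i and j are defined elsewhere
wb = open_workbook(xlsx_path, formatting_info=True)
sheet = wb.sheet_by_name(sheet_name)
cell = sheet.cell(i, j)
print("cell.xf_index is", cell.xf_index)
# Look up the cell-level format record
fmt = wb.xf_list[cell.xf_index]
print("type(fmt) is", type(fmt))
print("Dumped Info:")
fmt.dump()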
But what I get is the format of the cell as a whole:
How can I get the format of each character? Thanks!
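I examined your spreadsheet and created one of my own, which you can find here:
https://drive.google.com/open?id=1WBm_tcFdlcckDgIvPdh-ezosm5pP4xXd
Here is what mine looks like:
My font sizes are 11, 22, 33 and 44 points, to make them easier to find.
I could not find an API in either xlrd or openpyxl that lets us read multiple fonts stored in a single cell. I understand that openpyxl has some support for this, but that aspect of openpyxl is not well documented.
I tried to reverse engineer the file format. Recall that an .xlsx file is a ZIP file, so I unzipped it. Here is the content of my Sheet1.xml file, pretty-printed: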
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" mc:Ignorable="x14ac xr xr2 xr3" xr:uid="{5F15C188-96B2-E44D-B28B-DB5F2AE0283E}">
<dimension ref="A1"/>
<sheetViews>
<sheetView tabSelected="1" zoomScale="352" zoomScaleNormal="352" workbookViewId="0"/>
</sheetViews>
<sheetFormatPr baseColWidth="10" defaultRowHeight="16"/>
<sheetData>
<row r="1" spans="1:1" ht="57">
<c r="A1" s="1" t="s">
<v>0</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
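A listing like the one above can be produced without unzipping by hand. This is a minimal sketch using only the standard library; the helper names `list_parts` and `pretty_print_part` are made up for illustration, and the part path in the usage note is the usual location rather than something guaranteed for every file.

```python
import zipfile
from xml.dom import minidom


def list_parts(xlsx_file):
    """Return the names of all parts stored in the .xlsx archive."""
    with zipfile.ZipFile(xlsx_file) as archive:
        return archive.namelist()


def pretty_print_part(xlsx_file, part_name):
    """Return one XML part of the archive as indented text."""
    with zipfile.ZipFile(xlsx_file) as archive:
        raw = archive.read(part_name)
    return minidom.parseString(raw).toprettyxml(indent="  ")
```

For example, `print(pretty_print_part("book.xlsx", "xl/worksheets/sheet1.xml"))`, where `book.xlsx` is a placeholder file name; `list_parts` shows which part names actually exist in a given file.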
For comparison, here is yours:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" mc:Ignorable="x14ac xr xr2 xr3" xr:uid="{6A0965C4-B0BC-435B-932C-B62A46E63DFA}">
<dimension ref="A1"/>
<sheetViews>
<sheetView tabSelected="1" workbookViewId="0"/>
</sheetViews>
<sheetFormatPr defaultRowHeight="16.5" x14ac:dyDescent="0.25"/>
<sheetData>
<row r="1" spans="1:1" ht="25.5" x14ac:dyDescent="0.25">
<c r="A1" t="s">
<v>0</v>
</c>
</row>
</sheetData>
<phoneticPr fontId="1" type="noConversion"/>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
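As you can see, the style information is not stored in Sheet1.xml. Instead, it is stored in styles.xml. Excel appears to create a named style for cells that have special formatting applied. A named style can hold any formatting a cell can have, such as font and fill.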
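To make the link between the two files concrete, here is a minimal sketch that follows a cell to its font entry. It assumes, as the SpreadsheetML schema describes, that the `s` attribute on a `<c>` element is an index into `<cellXfs>` in `xl/styles.xml`, and that `fontId` there is an index into `<fonts>`. The function name `cell_font` and the part paths are my own choices for illustration. Note that this only recovers the cell-level font, not per-character formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def cell_font(xlsx_file, sheet_part, cell_ref):
    """Return (size, name) of the cell-level font for a cell such as "A1"."""
    with zipfile.ZipFile(xlsx_file) as archive:
        sheet = ET.fromstring(archive.read(sheet_part))
        styles = ET.fromstring(archive.read("xl/styles.xml"))

    cell = sheet.find(f".//m:c[@r='{cell_ref}']", NS)
    if cell is None:
        return None
    # A missing s attribute is treated as style index 0.
    xf_index = int(cell.get("s", "0"))
    xf = styles.findall("m:cellXfs/m:xf", NS)[xf_index]
    font = styles.findall("m:fonts/m:font", NS)[int(xf.get("fontId", "0"))]
    size = font.find("m:sz", NS)
    name = font.find("m:name", NS)
    return (
        size.get("val") if size is not None else None,
        name.get("val") if name is not None else None,
    )
```

A call would look like `cell_font("book.xlsx", "xl/worksheets/sheet1.xml", "A1")`, with `book.xlsx` as a placeholder file name.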
Here is what the styles.xml file looks like for a blank Excel file:
<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:x16r2="http://schemas.microsoft.com/office/spreadsheetml/2015/02/main" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" mc:Ignorable="x14ac x16r2 xr">
<fonts count="1" x14ac:knownFonts="1">
<font>
<sz val="12"/>
<color theme="1"/>
<name val="Calibri"/>
<family val="2"/>
<scheme val="minor"/>
</font>
</fonts>
<fills count="2">
<fill>
<patternFill patternType="none"/>
</fill>
<fill>
<patternFill patternType="gray125"/>
</fill>
</fills>
<borders count="1">
<border>
<left/>
<right/>
<top/>
<bottom/>
<diagonal/>
</border>
</borders>
<cellStyleXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0"/>
</cellStyleXfs>
<cellXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0"/>
</cellXfs>
<cellStyles count="1">
<cellStyle name="Normal" xfId="0" builtinId="0"/>
</cellStyles>
<dxfs count="0"/>
<tableStyles count="0" defaultTableStyle="TableStyleMedium2" defaultPivotStyle="PivotStyleLight16"/>
<extLst>
<ext xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" uri="{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}">
<x14:slicerStyles defaultSlicerStyle="SlicerStyleLight1"/>
</ext>
<ext xmlns:x15="http://schemas.microsoft.com/office/spreadsheetml/2010/11/main" uri="{9260A510-F301-46a8-8635-F512D64BE5F5}">
<x15:timelineStyles defaultTimelineStyle="TimeSlicerStyleLight1"/>
</ext>
</extLst>
</styleSheet>
Here is my file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:x16r2="http://schemas.microsoft.com/office/spreadsheetml/2015/02/main" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" mc:Ignorable="x14ac x16r2 xr">
<fonts count="6">
<font>
<sz val="12"/>
<color theme="1"/>
<name val="Calibri"/>
<family val="2"/>
<scheme val="minor"/>
</font>
<font>
<sz val="11"/>
<color theme="1"/>
<name val="Calibri (Body)_x0000_"/>
</font>
<font>
<sz val="12"/>
<color theme="1"/>
<name val="Calibri (Body)_x0000_"/>
</font>
<font>
<sz val="44"/>
<color rgb="FFFF0000"/>
<name val="Calibri (Body)_x0000_"/>
</font>
<font>
<sz val="33"/>
<color rgb="FF0070C0"/>
<name val="Calibri (Body)_x0000_"/>
</font>
<font>
<sz val="22"/>
<color rgb="FF7030A0"/>
<name val="Calibri (Body)_x0000_"/>
</font>
</fonts>
<fills count="2">
<fill>
<patternFill patternType="none"/>
</fill>
<fill>
<patternFill patternType="gray125"/>
</fill>
</fills>
<borders count="1">
<border>
<left/>
<right/>
<top/>
<bottom/>
<diagonal/>
</border>
</borders>
<cellStyleXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0"/>
</cellStyleXfs>
<cellXfs count="2">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0"/>
<xf numFmtId="0" fontId="2" fillId="0" borderId="0" xfId="0" applyFont="1"/>
</cellXfs>
<cellStyles count="1">
<cellStyle name="Normal" xfId="0" builtinId="0"/>
</cellStyles>
<dxfs count="0"/>
<tableStyles count="0" defaultTableStyle="TableStyleMedium2" defaultPivotStyle="PivotStyleLight16"/>
<extLst>
<ext xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" uri="{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}">
<x14:slicerStyles defaultSlicerStyle="SlicerStyleLight1"/>
</ext>
<ext xmlns:x15="http://schemas.microsoft.com/office/spreadsheetml/2010/11/main" uri="{9260A510-F301-46a8-8635-F512D64BE5F5}">
<x15:timelineStyles defaultTimelineStyle="TimeSlicerStyleLight1"/>
</ext>
</extLst>
</styleSheet>
Here is yours:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:x16r2="http://schemas.microsoft.com/office/spreadsheetml/2015/02/main" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" mc:Ignorable="x14ac x16r2 xr">
<fonts count="4" x14ac:knownFonts="1">
<font>
<sz val="12"/>
<color theme="1"/>
<name val="新細明體"/>
<family val="2"/>
<charset val="136"/>
<scheme val="minor"/>
</font>
<font>
<sz val="9"/>
<name val="新細明體"/>
<family val="2"/>
<charset val="136"/>
<scheme val="minor"/>
</font>
<font>
<sz val="18"/>
<color rgb="FFFF0000"/>
<name val="新細明體"/>
<family val="1"/>
<charset val="136"/>
<scheme val="minor"/>
</font>
<font>
<b/>
<sz val="16"/>
<color theme="1"/>
<name val="新細明體"/>
<family val="1"/>
<charset val="136"/>
<scheme val="minor"/>
</font>
</fonts>
<fills count="2">
<fill>
<patternFill patternType="none"/>
</fill>
<fill>
<patternFill patternType="gray125"/>
</fill>
</fills>
<borders count="1">
<border>
<left/>
<right/>
<top/>
<bottom/>
<diagonal/>
</border>
</borders>
<cellStyleXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0">
<alignment vertical="center"/>
</xf>
</cellStyleXfs>
<cellXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0">
<alignment vertical="center"/>
</xf>
</cellXfs>
<cellStyles count="1">
<cellStyle name="一般" xfId="0" builtinId="0"/>
</cellStyles>
<dxfs count="0"/>
<tableStyles count="0" defaultTableStyle="TableStyleMedium2" defaultPivotStyle="PivotStyleLight16"/>
<extLst>
<ext xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main"
uri="{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}">
<x14:slicerStyles defaultSlicerStyle="SlicerStyleLight1"/>
</ext>
<ext xmlns:x15="http://schemas.microsoft.com/office/spreadsheetml/2010/11/main"
uri="{9260A510-F301-46a8-8635-F512D64BE5F5}">
<x15:timelineStyles defaultTimelineStyle="TimeSlicerStyleLight1"/>
</ext>
</extLst>
</styleSheet>
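So now you can see why my file is easier to work with, at least for me: the font sizes 11, 22, 33 and 44 are easy to find in the XML! (My font names are also in English, which I speak, rather than Chinese, which I don't.)
So in my example there are six fonts (I am not clear what count="6" is for), and my example uses fonts 0, 4, 5 and 3, in that order.
My problem now is that I cannot find where the list of fonts is bound to the cell, in order.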
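One place worth checking, which I have not verified against the two files above: in SpreadsheetML, `t="s"` on a cell means that `<v>` holds an index into the shared string table, and per-character formatting is normally stored there as a sequence of runs (`<r>`), each with optional run properties (`<rPr>`) and its text (`<t>`). The sketch below reads those runs with the standard library. It assumes the table lives at `xl/sharedStrings.xml`, which is the usual location but not guaranteed, and the function name `rich_text_runs` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def rich_text_runs(xlsx_file, string_index):
    """Return [(text, {property: attributes})] for one shared string."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))

    item = root.findall("m:si", NS)[string_index]
    runs = []
    for run in item.findall("m:r", NS):
        text = run.findtext("m:t", default="", namespaces=NS)
        props = {}
        rpr = run.find("m:rPr", NS)
        if rpr is not None:
            for prop in rpr:
                # Strip the "{namespace}" prefix from the tag name.
                props[prop.tag.split("}")[-1]] = dict(prop.attrib)
        runs.append((text, props))
    if not runs:
        # Plain string: a single <t> directly under <si>, no per-run format.
        runs.append((item.findtext("m:t", default="", namespaces=NS), {}))
    return runs
```

For a cell whose `<v>` is 0, `rich_text_runs("book.xlsx", 0)` would return one entry per run, with keys such as `sz`, `color` and `rFont` where the file provides them (`book.xlsx` is a placeholder). A run without `<rPr>` comes back with an empty dictionary, which I take to mean it falls back to the cell-level font.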
Oh, here is the program I wrote with openpyxl to decode the spreadsheet, though I did not get very far:
from openpyxl import Workbook, load_workbook
import sys

if __name__ == "__main__":
    wb = load_workbook(sys.argv[1])
    for ws in wb.worksheets:
        print(f"Sheet {ws} max rows: {ws.max_row} max cols: {ws.max_column}")
        for row in range(0, ws.max_row):
            for column in range(0, ws.max_column):
                # Note that openpyxl starts at 1 for rows and columns
                cell = ws.cell(row=row+1, column=column+1)
                print(cell)
                print(dir(cell))
                for attr in dir(cell):
                    if attr[0] != '_':
                        print(f"cell {attr} = {getattr(cell,attr)}")
                print("")
    print("ws['A1'].style=", ws['A1'].style)
    print(dir(wb))
    print(wb.named_styles)
    style = wb.named_styles[0]
    print(style)
Here is the output:
Sheet <Worksheet "Sheet1"> max rows: 1 max cols: 1
<Cell 'Sheet1'.A1>
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '_bind_value', '_comment', '_hyperlink', '_style', '_value', 'alignment', 'base_date', 'border', 'check_error', 'check_string', 'col_idx', 'column', 'column_letter', 'comment', 'coordinate', 'data_type', 'encoding', 'fill', 'font', 'guess_types', 'has_style', 'hyperlink', 'internal_value', 'is_date', 'number_format', 'offset', 'parent', 'pivotButton', 'protection', 'quotePrefix', 'row', 'style', 'style_id', 'value']
cell alignment = <openpyxl.styles.alignment.Alignment object>
Parameters:
horizontal=None, vertical=None, textRotation=0, wrapText=None, shrinkToFit=None, indent=0.0, relativeIndent=0.0, justifyLastLine=None, readingOrder=0.0
cell base_date = 2415018.5
cell border = <openpyxl.styles.borders.Border object>
Parameters:
outline=True, diagonalUp=False, diagonalDown=False, start=None, end=None, left=<openpyxl.styles.borders.Side object>
Parameters:
style=None, color=None, right=<openpyxl.styles.borders.Side object>
Parameters:
style=None, color=None, top=<openpyxl.styles.borders.Side object>
Parameters:
style=None, color=None, bottom=<openpyxl.styles.borders.Side object>
Parameters:
style=None, color=None, diagonal=<openpyxl.styles.borders.Side object>
Parameters:
style=None, color=None, vertical=None, horizontal=None
cell check_error = <bound method Cell.check_error of <Cell 'Sheet1'.A1>>
cell check_string = <bound method Cell.check_string of <Cell 'Sheet1'.A1>>
cell col_idx = 1
cell column = 1
cell column_letter = A
cell comment = None
cell coordinate = A1
cell data_type = s
cell encoding = utf-8
cell fill = <openpyxl.styles.fills.PatternFill object>
Parameters:
patternType=None, fgColor=<openpyxl.styles.colors.Color object>
Parameters:
rgb='00000000', indexed=None, auto=None, theme=None, tint=0.0, type='rgb', bgColor=<openpyxl.styles.colors.Color object>
Parameters:
rgb='00000000', indexed=None, auto=None, theme=None, tint=0.0, type='rgb'
cell font = <openpyxl.styles.fonts.Font object>
Parameters:
name='Calibri (Body)_x0000_', charset=None, family=None, b=False, i=False, strike=None, outline=None, shadow=None, condense=None, color=<openpyxl.styles.colors.Color object>
Parameters:
rgb=None, indexed=None, auto=None, theme=1, tint=0.0, type='theme', extend=None, sz=12.0, u=None, vertAlign=None, scheme=None
cell guess_types = False
cell has_style = True
cell hyperlink = None
cell internal_value = test
cell is_date = False
cell number_format = General
cell offset = <bound method Cell.offset of <Cell 'Sheet1'.A1>>
cell parent = <Worksheet "Sheet1">
cell pivotButton = False
cell protection = <openpyxl.styles.protection.Protection object>
Parameters:
locked=True, hidden=False
cell quotePrefix = False
cell row = 1
cell style = Normal
cell style_id = 1
cell value = test
ws['A1'].style= Normal
['_Workbook__write_only', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_active_sheet_index', '_add_sheet', '_alignments', '_borders', '_cell_styles', '_colors', '_data_only', '_date_formats', '_differential_styles', '_external_links', '_fills', '_fonts', '_named_styles', '_number_formats', '_pivots', '_protections', '_read_only', '_setup_styles', '_sheets', '_table_styles', 'active', 'add_named_range', 'add_named_style', 'calculation', 'chartsheets', 'close', 'code_name', 'copy_worksheet', 'create_chartsheet', 'create_named_range', 'create_sheet', 'data_only', 'defined_names', 'encoding', 'epoch', 'excel_base_date', 'get_active_sheet', 'get_index', 'get_named_range', 'get_named_ranges', 'get_sheet_by_name', 'get_sheet_names', 'guess_types', 'index', 'is_template', 'iso_dates', 'loaded_theme', 'mime_type', 'move_sheet', 'named_styles', 'path', 'properties', 'read_only', 'rels', 'remove', 'remove_named_range', 'remove_sheet', 'save', 'security', 'shared_strings', 'sheetnames', 'style_names', 'template', 'vba_archive', 'views', 'worksheets', 'write_only']
['Normal']
Normal
Here are some references I consulted that I thought would help:
- https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.cellstyleformats?view=openxml-2.8.1
- https://social.msdn.microsoft.com/Forums/en-US/e6fe3ff0-152e-4398-9d17-fee8476ae466/how-to-understand-the-process-of-cell-formatting-?forum=os_binaryfile
- https://wiki.openoffice.org/wiki/Cell_Style_in_Xls_module
- https://c-rex.net/projects/samples/ooxml/e1/Part4/OOXML_P4_DOCX_cellStyleXfs_topic_ID0EXX65.html
- https://social.msdn.microsoft.com/Forums/sqlserver/en-US/708978af-b598-45c4-a598-d3518a5a09f0/howwhen-is-cellstylexfs-vs-cellxfs-applied-to-a-cell?forum=os_binaryfile