Change tab spacing in Python lxml pretty_print

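I have a small script that creates an XML document and uses pretty_print=True to produce a nicely formatted file. However, the indentation is 2 spaces, and I would like to know whether there is a way to change it to 4 spaces (I think 4 spaces looks better). Is there a simple way to do this?

Code snippet: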

doc = lxml.etree.SubElement(root, 'dependencies')
for depen in dependency_list:
    dependency = lxml.etree.SubElement(doc, 'dependency')
    lxml.etree.SubElement(dependency, 'groupId').text = depen.group_id
    lxml.etree.SubElement(dependency, 'artifactId').text = depen.artifact_id
    lxml.etree.SubElement(dependency, 'version').text = depen.version
    if depen.scope == 'provided' or depen.scope == 'test':
        lxml.etree.SubElement(dependency, 'scope').text = depen.scope
    exclusions = lxml.etree.SubElement(dependency, 'exclusions')
    exclusion = lxml.etree.SubElement(exclusions, 'exclusion')
    lxml.etree.SubElement(exclusion, 'groupId').text = '*'
    lxml.etree.SubElement(exclusion, 'artifactId').text = '*'
tree.write('explicit-pom.xml', pretty_print=True)

This does not seem to be possible through the Python lxml API.

A possible workaround for the tab spacing would be:

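import lxml.etree

def prettyPrint(someRootNode):
    # Serialize with lxml's default 2-space indentation, then split into lines.
    lines = lxml.etree.tostring(someRootNode, encoding="utf-8", pretty_print=True).decode("utf-8").split("\n")
    for i in range(len(lines)):
        line = lines[i]
        outLine = ""
        # Walk the line two characters at a time: each leading pair of spaces becomes one tab.
        for j in range(0, len(line), 2):
            if line[j:j + 2] == "  ":
                outLine += "\t"  # use "    " here instead for 4-space indentation
            else:
                # First chunk that is not indentation: copy the rest of the line unchanged.
                outLine += line[j:]
                break
        lines[i] = outLine
    return "\n".join(lines)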

Note that this is not very efficient. Real efficiency would only be possible if this were implemented natively in the lxml C code.
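For the 4-space case the question asks about, the same post-processing idea can be sketched with a regular expression instead of a character loop. This is only a sketch under stated assumptions: the function name reindent and its parameters are hypothetical, the sample string stands in for what lxml.etree.tostring(..., pretty_print=True) would return after decoding, and, like the loop above, it would also touch leading spaces inside multi-line text content.

```python
import re

def reindent(xml_text, old_width=2, new_width=4):
    """Widen each line's leading indentation from old_width to new_width spaces per level."""
    def widen(match):
        # Number of nesting levels encoded by the leading spaces of this line.
        levels = len(match.group(0)) // old_width
        return " " * (levels * new_width)
    # With re.MULTILINE, ^ matches at the start of every line,
    # so only leading runs of spaces are rewritten.
    return re.sub(r"^ +", widen, xml_text, flags=re.MULTILINE)

# Hypothetical stand-in for pretty-printed lxml output (2-space indentation).
sample = "<root>\n  <a>\n    <b/>\n  </a>\n</root>\n"
print(reindent(sample))
```

Passing new_width=8 or any other value works the same way; the function only assumes that the input is indented with a consistent multiple of old_width spaces.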

If anyone is still trying to achieve this, it can be done with the etree.indent() function, available since lxml 4.5:

>>> etree.indent(root, space="    ")
>>> print(etree.tostring(root, encoding="unicode"))
<root>
    <a>
        <b/>
    </a>
</root>

https://lxml.de/tutorial.html#serialisation
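Applied to a document like the one in the question, this might look roughly as follows. It is a minimal sketch that assumes lxml 4.5 or newer is installed; the element values ("org.example", "demo") are made up for illustration and are not from the original script.

```python
from lxml import etree

# Hypothetical stand-in for the question's POM-like document.
root = etree.Element("project")
deps = etree.SubElement(root, "dependencies")
dep = etree.SubElement(deps, "dependency")
etree.SubElement(dep, "groupId").text = "org.example"
etree.SubElement(dep, "artifactId").text = "demo"

# Re-indent the tree in place with four spaces per nesting level.
etree.indent(root, space="    ")

# encoding="unicode" makes tostring() return str rather than bytes.
xml_text = etree.tostring(root, encoding="unicode")
print(xml_text)
```

Because etree.indent() stores the whitespace in the tree itself, the result can then be written out with etree.ElementTree(root).write('explicit-pom.xml') and should keep the 4-space layout.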