How to sort an XML file by a nested child element's text value using etree in Python
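I've seen variations of this question answered countless times (for example, Sorting XML in python etree), but I can't seem to adapt those answers to my problem. I am trying to sort an imported XML file by a specific child element tag, in this case the "id" tag. Here is the XML in question: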
Input:
<bookstore Location="New York">
    <Genre type="Fiction">
        <name>Fiction</name>
        <id>4</id>
        <pages>300</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fictional Fiction</name>
        <id>2</id>
        <pages>500</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrors</name>
        <id>1</id>
        <pages>450</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrendous Horror</name>
        <id>3</id>
        <pages>20</pages>
    </Genre>
    <Genre type="Comedy">
        <name>Comedic Comedy</name>
        <id>0</id>
        <pages>1</pages>
    </Genre>
</bookstore>
I want to order all the Genre elements by their "id" child element. This is my desired output:
Output:
<bookstore Location="New York">
    <Genre type="Comedy">
        <name>Comedic Comedy</name>
        <id>0</id>
        <pages>1</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrors</name>
        <id>1</id>
        <pages>450</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fictional Fiction</name>
        <id>2</id>
        <pages>500</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrendous Horror</name>
        <id>3</id>
        <pages>20</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fiction</name>
        <id>4</id>
        <pages>300</pages>
    </Genre>
</bookstore>
Here is the code I tried:
import xml.etree.ElementTree as ET

def sortchildrenby(parent):
    parent[:] = sorted(parent, key=lambda child: child.tag == 'id')

filename = "Example.xml"
tree = ET.parse(filename)
root = tree.getroot()
attr = "type"
for elements in root:
    sortchildrenby(elements)
tree.write("exampleORGANIZED.xml")
This results in the following XML:
<bookstore Location="New York">
    <Genre type="Fiction">
        <name>Fiction</name>
        <pages>300</pages>
        <id>4</id>
    </Genre>
    <Genre type="Fiction">
        <name>Fictional Fiction</name>
        <pages>500</pages>
        <id>2</id>
    </Genre>
    <Genre type="Horror">
        <name>Horrors</name>
        <pages>450</pages>
        <id>1</id>
    </Genre>
    <Genre type="Horror">
        <name>Horrendous Horror</name>
        <pages>20</pages>
        <id>3</id>
    </Genre>
    <Genre type="Comedy">
        <name>Comedic Comedy</name>
        <pages>1</pages>
        <id>0</id>
    </Genre>
</bookstore>
The id elements have moved down within each Genre, and nothing has been re-sorted in ascending order.
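To see why the attempt behaves this way, here is a minimal sketch that inlines one Genre element instead of reading Example.xml (the variable names `genre`, `keys`, and `order_after` are illustrative, not from the original code). The key `child.tag == 'id'` is False for name and pages and True for id. Because False sorts before True and `sorted` is stable, id is simply pushed to the end of each Genre, and the Genre elements themselves are never reordered.

```python
import xml.etree.ElementTree as ET

# One <Genre> from the question, inlined so the sketch is self-contained.
genre = ET.fromstring(
    "<Genre type='Fiction'><name>Fiction</name><id>4</id><pages>300</pages></Genre>"
)

# The key used in the attempt: a boolean per child, not the id's value.
keys = [(child.tag, child.tag == "id") for child in genre]
print(keys)

# False < True, and sorted() is stable, so <id> ends up last.
genre[:] = sorted(genre, key=lambda child: child.tag == "id")
order_after = [child.tag for child in genre]
print(order_after)
```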
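There is no need to iterate. Pass the whole root to the method, because you need to sort the underlying <Genre> elements themselves, not the children of each individual element. Also, adjust the method to sort by element text rather than by a boolean expression: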
import xml.etree.ElementTree as ET

def sortchildrenby(parent, attr):
    # Reorder the direct children of parent by the text of their <attr> child
    parent[:] = sorted(parent, key=lambda child: child.find(attr).text)

tree = ET.parse("Input.xml")
root = tree.getroot()
sortchildrenby(root, "id")
ET.indent(tree, space="\t", level=0)  # pretty print (added in Python 3.9)
tree.write("Output.xml")
Output
<bookstore Location="New York">
    <Genre type="Comedy">
        <name>Comedic Comedy</name>
        <id>0</id>
        <pages>1</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrors</name>
        <id>1</id>
        <pages>450</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fictional Fiction</name>
        <id>2</id>
        <pages>500</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrendous Horror</name>
        <id>3</id>
        <pages>20</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fiction</name>
        <id>4</id>
        <pages>300</pages>
    </Genre>
</bookstore>
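One caveat when sorting on `.text`: the values are compared as strings, so an id of 10 would sort before 2. That makes no difference for the ids 0 to 4 in this file, but a numeric key may be safer for larger ones. The sketch below is a variation on the answer rather than part of it. `sort_children_by_int` is a hypothetical helper name, the inline document is made up for illustration, and the code assumes every child has the given sub-element with integer text.

```python
import xml.etree.ElementTree as ET

def sort_children_by_int(parent, tag):
    """Reorder parent's direct children by the integer text of their <tag> child.

    Assumes every child has a <tag> sub-element whose text is an integer.
    """
    parent[:] = sorted(parent, key=lambda child: int(child.findtext(tag)))

# Illustrative document whose ids sort differently as strings and as numbers.
root = ET.fromstring(
    "<bookstore>"
    "<Genre><name>A</name><id>10</id></Genre>"
    "<Genre><name>B</name><id>2</id></Genre>"
    "<Genre><name>C</name><id>1</id></Genre>"
    "</bookstore>"
)

# String comparison, as in a key based on .text (does not modify root).
string_order = [g.findtext("id") for g in sorted(root, key=lambda g: g.findtext("id"))]

# Numeric comparison, applied in place.
sort_children_by_int(root, "id")
numeric_order = [g.findtext("id") for g in root]

print(string_order)
print(numeric_order)
```

If some elements might lack the sub-element, `findtext` returns None and `int` would raise a TypeError, so such files would need a default value or a filter before sorting.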