Sorting child elements with lxml based on attribute value
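I'm trying to sort some child elements in a document by an attribute value. The sort itself seems to work, but the slice assignment that should put the newly sorted elements back into the tree doesn't seem to.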
from lxml import etree


def getkey(elem):
    # Used for sorting elements by @LIN.
    # Returns a tuple of ints from the exploded @LIN value:
    # '1.0' -> (1, 0)
    # '1.0.1' -> (1, 0, 1)
    return tuple(int(x) for x in elem.get('LIN').split('.'))


xml_str = """<Interface>
  <Header></Header>
  <PurchaseOrder>
    <LineItems>
      <Line LIN="2.0"></Line>
      <Line LIN="3.0"></Line>
      <Line LIN="1.0"></Line>
    </LineItems>
  </PurchaseOrder>
</Interface>"""

root = etree.fromstring(xml_str)
lines = root.findall("PurchaseOrder/LineItems/Line")
lines[:] = sorted(lines, key=getkey)
res_lines = [x.get('LIN') for x in lines]
print(res_lines)
print(etree.tostring(root, pretty_print=True).decode())
When I run the code above, the lines list is indeed sorted correctly: it prints ['1.0', '2.0', '3.0']. However, the XML tree is not updated, as tostring() prints the following.
<Interface>
  <Header/>
  <PurchaseOrder>
    <LineItems>
      <Line LIN="2.0"/>
      <Line LIN="3.0"/>
      <Line LIN="1.0"/>
    </LineItems>
  </PurchaseOrder>
</Interface>
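I got the idea for the sort from http://effbot.org/zone/element-sort.htm, which says that slice assignment should be all I need to update the order of the elements, but that doesn't seem to be the case. I realize lxml is not 100% compatible with elementtree, so as a sanity check I replaced the lxml import with elementtree and got exactly the same result.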
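This will sort and write the output:
import xml.etree.ElementTree as ET

tree = ET.parse("in.xml")


def getkey(elem):
    # Used for sorting elements by @LIN.
    # Returns a tuple of ints from the exploded @LIN value:
    # '1.0' -> (1, 0)
    # '1.0.1' -> (1, 0, 1)
    # (float() would fail on '1.0.1' and would sort '1.10' before '1.2'.)
    return tuple(int(x) for x in elem.get('LIN').split('.'))


# Sort the children of the parent element, not a list returned by findall().
container = tree.find("PurchaseOrder/LineItems")
container[:] = sorted(container, key=getkey)
tree.write("new.xml")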
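Or, using your own code and printing the result:
from lxml import etree


def getkey(elem):
    # Used for sorting elements by @LIN.
    # Returns a tuple of ints from the exploded @LIN value:
    # '1.0' -> (1, 0)
    # '1.0.1' -> (1, 0, 1)
    return tuple(int(x) for x in elem.get('LIN').split('.'))


# xml_str is the string defined in the question.
root = etree.fromstring(xml_str)
# find() returns the LineItems element itself, so slice assignment
# on it replaces that element's children in the tree.
lines = root.find("PurchaseOrder/LineItems")
lines[:] = sorted(lines, key=getkey)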
Output:
In [12]: print(etree.tostring(root, pretty_print=True).decode())
<Interface>
  <Header/>
  <PurchaseOrder>
    <LineItems>
      <Line LIN="1.0"/>
      <Line LIN="2.0"/>
      <Line LIN="3.0"/>
    </LineItems>
  </PurchaseOrder>
</Interface>
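The key is root.find("PurchaseOrder/LineItems"): you want to find the LineItems element and sort its children. findall() returns a plain Python list of the matching elements, so assigning to lines[:] in your code only reorders that list and never touches the tree. Slice assignment on the LineItems element itself replaces its children in place.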
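To see the two behaviours side by side, here is a minimal sketch using the standard library's xml.etree.ElementTree, which the question reports behaves the same way as lxml here. The shortened document, the extra LIN value 10.0, and the helper names lin_key and order are made up for illustration and are not part of the original code.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative document (not the one from the question).
XML = """<Interface>
  <PurchaseOrder>
    <LineItems>
      <Line LIN="2.0"/>
      <Line LIN="10.0"/>
      <Line LIN="1.0"/>
    </LineItems>
  </PurchaseOrder>
</Interface>"""


def lin_key(elem):
    # '1.0.1' -> (1, 0, 1); parts are compared as integers, left to right.
    return tuple(int(part) for part in elem.get("LIN").split("."))


def order(root):
    # LIN attributes in current document order.
    return [line.get("LIN") for line in root.iter("Line")]


root = ET.fromstring(XML)

# findall() returns a new Python list; reordering it leaves the tree alone.
found = root.findall("PurchaseOrder/LineItems/Line")
found[:] = sorted(found, key=lin_key)
after_list_sort = order(root)

# Slice assignment on the parent element replaces its children in place.
container = root.find("PurchaseOrder/LineItems")
container[:] = sorted(container, key=lin_key)
after_element_sort = order(root)

print(after_list_sort)     # ['2.0', '10.0', '1.0']
print(after_element_sort)  # ['1.0', '2.0', '10.0']
```

The first print shows the tree still in its original order after sorting the findall() result; only the second sort, done on the LineItems element, changes the document.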