Use Python and lxml to parse an XML document and write elements to a text file

Question

I want to parse an XML file with the Python code below. An excerpt of the XML file follows the code. I need to extract everything after "inv: name =", which in this case is "'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)". Any ideas?

My Python code (so far):

from lxml import etree
doc = etree.parse("data.xml")
for con in doc.xpath("//specification"):
    for cons in con.xpath("./@body"):
        with open("output.txt", "w") as cons_out:
            cons_out.write(cons)
        cons_out.close()

Part of the XML file:

<ownedRule xmi:type="uml:Constraint" xmi:id="EAID_OR000004_EE68_4efa_8E1B_8DDFA8F95FB8" name="datasource roof height">
    <constrainedElement xmi:idref="EAID_94F3B0A6_EE68_4efa_8E1B_8DDFA8F95FB8"/>
    <specification xmi:type="uml:OpaqueExpression" xmi:id="EAID_COE000004_EE68_4efa_8E1B_8DDFA8F95FB8" body="inv: name = 'datasource roof height'  and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)"/>
</ownedRule>

Answer 1

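An XML parser understands attributes and elements. It does not care what is inside those attributes or elements (the text content).

To solve your problem, you need to split the string retrieved from the body attribute. This assumes that the body attribute of every element has content in the same format, i.e. "inv: name = some content".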
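The splitting step can be tried on its own before touching the XML. This is only a minimal sketch: the helper name `text_after` is hypothetical, and it uses `str.partition` instead of `split` so that a body without the marker yields `None` rather than an `IndexError`.

```python
def text_after(body, marker="inv: name ="):
    """Return the text following `marker`, or None when `marker` is absent."""
    # partition returns (before, marker, after); `found` is "" if marker is missing
    _, found, rest = body.partition(marker)
    return rest.strip() if found else None


body = "inv: name = 'datasource roof height' and (value = 1000 or value = 2000)"
print(text_after(body))         # text after the marker, whitespace trimmed
print(text_after("value > 0"))  # None, because the marker is absent
```

The same idea is applied inside the lxml loop: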

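from lxml import etree

doc = etree.parse("data.xml")
# Open the output file once; reopening it with "w" inside the loop
# would overwrite the earlier results on every iteration.
with open("output.txt", "w") as cons_out:
    for con in doc.xpath("//specification"):
        for cons in con.xpath("./@body"):
            # Keep only the text that follows the "inv: name =" prefix.
            content = cons.split("inv: name =")[1]
            cons_out.write(content.strip() + "\n")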
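If some body attributes might not start with "inv: name =", a slightly more defensive variant may help. The sketch below is self-contained and rests on assumptions: the `SAMPLE` document, its `root` element, and the `xmlns:xmi` URI are made up for illustration (the excerpt does not show the real root element), and `extract_bodies` and `write_lines` are hypothetical helper names.

```python
from lxml import etree

# Hypothetical sample mirroring the excerpt; the root element and the
# xmlns:xmi declaration are assumed, not taken from the real file.
SAMPLE = b"""<root xmlns:xmi="http://www.omg.org/XMI">
  <ownedRule xmi:type="uml:Constraint" name="datasource roof height">
    <specification xmi:type="uml:OpaqueExpression" body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000)"/>
  </ownedRule>
  <ownedRule xmi:type="uml:Constraint" name="no prefix">
    <specification xmi:type="uml:OpaqueExpression" body="value &gt; 0"/>
  </ownedRule>
</root>"""

PREFIX = "inv: name ="


def extract_bodies(xml_bytes, prefix=PREFIX):
    """Return the text after `prefix` for each specification/@body that has it."""
    root = etree.fromstring(xml_bytes)
    results = []
    for body in root.xpath("//specification/@body"):
        _, found, rest = body.partition(prefix)
        if found:  # skip bodies that lack the prefix
            results.append(rest.strip())
    return results


def write_lines(path, lines):
    """Open the file once and write one extracted value per line."""
    with open(path, "w") as out:
        for line in lines:
            out.write(line + "\n")


extracted = extract_bodies(SAMPLE)
print(extracted)
```

Calling `write_lines("output.txt", extracted)` would then produce one constraint per line. For a file on disk, `etree.parse("data.xml").getroot()` could replace the `etree.fromstring` call.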

text

lxml

xml-parsing

python-2.7