Extracting particular tags from an XML and writing them to a new XML in Python

Question

I have an XML document like this:

<Manuscript>
<model defaultValue="data.TotalResult">
    <object id="data" path="data">
        <condition>
              <comparison compare="and">
                <operand idref="Context1" type="boolean" />
                <operand idref="Context2" type="boolean" />
              </comparison>
        </condition>

    </object>
    <condition>
              <comparison compare="and">
                <operand idref="Context5" type="boolean" />
                <operand idref="Context6" type="boolean" />
              </comparison>
    </condition>
</model>
    <condition>
              <comparison compare="and">
                <operand idref="Context9" type="boolean" />
                <operand idref="Context10" type="boolean" />
              </comparison>
    </condition>

</Manuscript>

I want to extract all the tags named 'condition' and concatenate/append them together to create another XML like this:

<root>
    <condition>
          <comparison compare="and">
            <operand idref="Context1" type="boolean" />
            <operand idref="Context2" type="boolean" />
          </comparison>
    </condition>
    <condition>
              <comparison compare="and">
                <operand idref="Context5" type="boolean" />
                <operand idref="Context6" type="boolean" />
              </comparison>
    </condition>
    <condition>
              <comparison compare="and">
                <operand idref="Context9" type="boolean" />
                <operand idref="Context10" type="boolean" />
              </comparison>
    </condition>
</root>

Any idea how I can do this in Python?

Thanks in advance.

Answer 1

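Use Beautiful Soup, a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.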

Answer 2

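You can use BeautifulSoup to parse the input and build the new XML you want.

Take a look at the following script and adapt it as needed to get your expected result.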

from bs4 import BeautifulSoup as Soup

xml_str = """
<Manuscript>
<model defaultValue="data.TotalResult">
    <object id="data" path="data">
        <condition>
              <comparison compare="and">
                <operand idref="Context1" type="boolean" />
                <operand idref="Context2" type="boolean" />
              </comparison>
        </condition>

    </object>
    <condition>
              <comparison compare="and">
                <operand idref="Context5" type="boolean" />
                <operand idref="Context6" type="boolean" />
              </comparison>
    </condition>
</model>
    <condition>
              <comparison compare="and">
                <operand idref="Context9" type="boolean" />
                <operand idref="Context10" type="boolean" />
              </comparison>
    </condition>
</Manuscript>
"""

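# 'lxml' is the lenient HTML parser; bs4 also accepts 'xml' for strict XML parsing.
xml_parsed = Soup(xml_str, 'lxml')
# Start from an empty document and give it a single <root> element.
output_xml = Soup("", 'lxml')
output_xml.append(output_xml.new_tag('root'))
# find_all returns the tags in document order; append() moves each one
# out of xml_parsed and into the new <root>.
for condition in xml_parsed.find_all('condition'):
    output_xml.find('root').append(condition)
print(output_xml.prettify())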

This prints the following:

<root>
 <condition>
  <comparison compare="and">
   <operand idref="Context1" type="boolean">
   </operand>
   <operand idref="Context2" type="boolean">
   </operand>
  </comparison>
 </condition>
 <condition>
  <comparison compare="and">
   <operand idref="Context5" type="boolean">
   </operand>
   <operand idref="Context6" type="boolean">
   </operand>
  </comparison>
 </condition>
 <condition>
  <comparison compare="and">
   <operand idref="Context9" type="boolean">
   </operand>
   <operand idref="Context10" type="boolean">
   </operand>
  </comparison>
 </condition>
</root>
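
If you would rather avoid a third-party dependency, the same idea can probably be done with the standard library's xml.etree.ElementTree. The following is only a minimal sketch, not part of the answer above: it assumes the input is well-formed XML, that no 'condition' element is nested inside another one, and the output file name in the commented-out last line is made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET

xml_str = """<Manuscript>
<model defaultValue="data.TotalResult">
    <object id="data" path="data">
        <condition>
            <comparison compare="and">
                <operand idref="Context1" type="boolean" />
                <operand idref="Context2" type="boolean" />
            </comparison>
        </condition>
    </object>
    <condition>
        <comparison compare="and">
            <operand idref="Context5" type="boolean" />
            <operand idref="Context6" type="boolean" />
        </comparison>
    </condition>
</model>
    <condition>
        <comparison compare="and">
            <operand idref="Context9" type="boolean" />
            <operand idref="Context10" type="boolean" />
        </comparison>
    </condition>
</Manuscript>"""

source_root = ET.fromstring(xml_str)
new_root = ET.Element("root")

# iter() walks the whole tree in document order and matches at any depth.
for condition in source_root.iter("condition"):
    # Append a deep copy so the source tree is left unchanged.
    new_root.append(copy.deepcopy(condition))

# ET.indent only exists on Python 3.9+, so apply it only when available.
if hasattr(ET, "indent"):
    ET.indent(new_root)

result = ET.tostring(new_root, encoding="unicode")
print(result)

# To write a file instead ("conditions.xml" is a hypothetical name):
# ET.ElementTree(new_root).write("conditions.xml", encoding="utf-8", xml_declaration=True)
```

Unlike the 'lxml' HTML parser used above, ElementTree keeps attribute names case-sensitive and serializes the empty operand elements in self-closing form.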
