Extracting particular tags from an XML and writing them to a new XML in Python
I have an XML like this:
<Manuscript>
<model defaultValue="data.TotalResult">
<object id="data" path="data">
<condition>
<comparison compare="and">
<operand idref="Context1" type="boolean" />
<operand idref="Context2" type="boolean" />
</comparison>
</condition>
</object>
<condition>
<comparison compare="and">
<operand idref="Context5" type="boolean" />
<operand idref="Context6" type="boolean" />
</comparison>
</condition>
</model>
<condition>
<comparison compare="and">
<operand idref="Context9" type="boolean" />
<operand idref="Context10" type="boolean" />
</comparison>
</condition>
</Manuscript>
I want to extract all tags named 'condition' and concatenate/append them to create another XML like this:
<root>
<condition>
<comparison compare="and">
<operand idref="Context1" type="boolean" />
<operand idref="Context2" type="boolean" />
</comparison>
</condition>
<condition>
<comparison compare="and">
<operand idref="Context5" type="boolean" />
<operand idref="Context6" type="boolean" />
</comparison>
</condition>
<condition>
<comparison compare="and">
<operand idref="Context9" type="boolean" />
<operand idref="Context10" type="boolean" />
</comparison>
</condition>
</root>
Any idea how I can do this in Python?
Thanks in advance.
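Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
You can use BeautifulSoup to parse the input and build the new XML you want.
Take a look at the following script and adapt it as needed to get the result you expect.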
from bs4 import BeautifulSoup as Soup
xml_str = """
<Manuscript>
<model defaultValue="data.TotalResult">
<object id="data" path="data">
<condition>
<comparison compare="and">
<operand idref="Context1" type="boolean" />
<operand idref="Context2" type="boolean" />
</comparison>
</condition>
</object>
<condition>
<comparison compare="and">
<operand idref="Context5" type="boolean" />
<operand idref="Context6" type="boolean" />
</comparison>
</condition>
</model>
<condition>
<comparison compare="and">
<operand idref="Context9" type="boolean" />
<operand idref="Context10" type="boolean" />
</comparison>
</condition>
</Manuscript>
"""
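# 'lxml' is lxml's HTML parser; it is lenient and still finds the custom tags.
xml_parsed = Soup(xml_str, 'lxml')
# Start from an empty document and give it a <root> element.
output_xml = Soup("", 'lxml')
output_xml.append(output_xml.new_tag('root'))
# Move every <condition> tag, at any depth, under the new <root>.
for condition in xml_parsed.find_all('condition'):
    output_xml.find('root').append(condition)
print(output_xml.prettify())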
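This prints the following (because 'lxml' parses the input as HTML, the empty operand elements come out with an explicit closing tag instead of being self-closed):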
<root>
<condition>
<comparison compare="and">
<operand idref="Context1" type="boolean">
</operand>
<operand idref="Context2" type="boolean">
</operand>
</comparison>
</condition>
<condition>
<comparison compare="and">
<operand idref="Context5" type="boolean">
</operand>
<operand idref="Context6" type="boolean">
</operand>
</comparison>
</condition>
<condition>
<comparison compare="and">
<operand idref="Context9" type="boolean">
</operand>
<operand idref="Context10" type="boolean">
</operand>
</comparison>
</condition>
</root>
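If you would rather not install anything, the same idea can be sketched with the standard library's xml.etree.ElementTree. This is an alternative to the BeautifulSoup script above, not part of it. The helper names collect_conditions and write_conditions, and any output file name you pass in, are made up for illustration. The sketch assumes that condition elements are never nested inside each other; if they were, the inner ones would be copied twice.

```python
import xml.etree.ElementTree as ET

xml_str = """<Manuscript>
  <model defaultValue="data.TotalResult">
    <object id="data" path="data">
      <condition>
        <comparison compare="and">
          <operand idref="Context1" type="boolean" />
          <operand idref="Context2" type="boolean" />
        </comparison>
      </condition>
    </object>
    <condition>
      <comparison compare="and">
        <operand idref="Context5" type="boolean" />
        <operand idref="Context6" type="boolean" />
      </comparison>
    </condition>
  </model>
  <condition>
    <comparison compare="and">
      <operand idref="Context9" type="boolean" />
      <operand idref="Context10" type="boolean" />
    </comparison>
  </condition>
</Manuscript>"""


def collect_conditions(source_xml):
    """Return a new <root> element holding every <condition> in source_xml."""
    source_root = ET.fromstring(source_xml)
    new_root = ET.Element("root")
    # iter() walks the whole tree in document order, at any depth.
    for condition in source_root.iter("condition"):
        new_root.append(condition)
    return new_root


def write_conditions(source_xml, path):
    """Hypothetical helper: write the collected conditions to a file at path."""
    tree = ET.ElementTree(collect_conditions(source_xml))
    tree.write(path, encoding="utf-8", xml_declaration=True)


new_root = collect_conditions(xml_str)
if hasattr(ET, "indent"):  # ET.indent is available from Python 3.9
    ET.indent(new_root)
print(ET.tostring(new_root, encoding="unicode"))
```

Since ElementTree is an XML parser, the empty operand elements stay self-closed in the output. To save the result instead of printing it, call write_conditions(xml_str, "conditions.xml"), where the file name is only a placeholder.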