Create several XML files from one
I have an XML file in the following format:
<Main1>
    <Sub1>
        <Name>Test</Name>
        <ID>12345</ID>
        <Sub2>
            <Prop>
                <Key>A</Key>
                <Value>Apple</Value>
            </Prop>
            <Prop>
                <Key>B</Key>
                <Value>Ball</Value>
            </Prop>
        </Sub2>
        <Sub3>
            <Order>
                <OID>54321</OID>
                <ODate>2016-01-01</ODate>
            </Order>
        </Sub3>
    </Sub1>
</Main1>
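I am trying to use Python to import this XML and split it into three different files: one for the person's name and ID, one for the properties, and one for the order information. When I split it, however, I want to add the customer ID to the properties and orders files. So the orders file might end up looking like:
<Orders>
    <Order>
        <ID>12345</ID>
        <OID>54321</OID>
        <ODate>2016-01-01</ODate>
    </Order>
</Orders>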
Use lxml and element.xpath() to select the nodes you need, and append them to nodes in a new XML document as needed.
XPath is not a concept introduced by lxml but a general query language for selecting nodes from an XML document, supported by many tools that deal with XML. Think of it as something similar to CSS selectors, but more powerful (and a bit more complicated). See XPath Syntax.
So, for example, tree.xpath('/Main1/Sub1') will select the <Sub1 /> elements directly below the <Main1 /> node.
Note that for a node-selecting expression like this one, .xpath() always returns a list of the selected nodes, so take that into account if you only want one.
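To make the list behaviour concrete, here is a minimal sketch. It assumes lxml is installed and uses a cut-down copy of the document from the question; the Email element is hypothetical and only there to show what happens when nothing matches.

```python
from lxml import etree

# A cut-down version of the document from the question.
xml = b"<Main1><Sub1><Name>Test</Name><ID>12345</ID></Sub1></Main1>"
root = etree.fromstring(xml)

# A node-selecting expression yields a list, even for a single match.
matches = root.xpath('/Main1/Sub1/ID')
print(type(matches).__name__, len(matches))

# Index the list to get at the element itself.
customer_id = matches[0].text

# No match gives an empty list, so guard before indexing.
# (Email is a hypothetical element that is not in the document.)
missing = root.xpath('/Main1/Sub1/Email')
email = missing[0].text if missing else None
print(customer_id, email)
```

Indexing with [0] is fine when the element is guaranteed to exist; the guard matters for optional elements, where [0] on an empty list would raise an IndexError.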
So, something like this should work:
from copy import copy
from lxml import etree


def parse(filename):
    # Dropping whitespace-only text nodes lets pretty_print re-indent the output.
    parser = etree.XMLParser(remove_blank_text=True)
    # etree.parse() accepts a filename and takes care of opening the file.
    return etree.parse(filename, parser=parser)


def dump_to_file(root, filename_base, id_):
    customer_id = id_.text.strip()
    filename = '%s-%s.xml' % (filename_base, customer_id)
    # lxml serializes to bytes, so open the file in binary mode.
    with open(filename, 'wb') as xml_file:
        etree.ElementTree(root).write(xml_file, pretty_print=True)


def dump_orders(id_, orders):
    root = etree.XML('<Orders/>')
    for order in orders:
        # Copy the ID: an element has a single parent, so appending
        # the original would move it out of the customer.
        order.append(copy(id_))
        root.append(order)
    dump_to_file(root, 'orders', id_)


def dump_properties(id_, properties):
    root = etree.XML('<Properties/>')
    for prop in properties:
        prop.append(copy(id_))
        root.append(prop)
    dump_to_file(root, 'properties', id_)


def dump_customer(id_, name):
    root = etree.XML('<Customer/>')
    root.append(copy(id_))
    root.append(copy(name))
    dump_to_file(root, 'customer', id_)


root = parse('complete.xml')
customers = root.xpath('/Main1/Sub1')

for customer in customers:
    name = customer.xpath('./Name')[0]
    id_ = customer.xpath('./ID')[0]
    dump_customer(id_, name)

    properties = customer.xpath('./Sub2/Prop')
    dump_properties(id_, properties)

    orders = customer.xpath('./Sub3/Order')
    dump_orders(id_, orders)
This will create three files like these for each customer:

customer-12345.xml
<Customer>
    <ID>12345</ID>
    <Name>Test</Name>
</Customer>

orders-12345.xml
<Orders>
    <Order>
        <OID>54321</OID>
        <ODate>2016-01-01</ODate>
        <ID>12345</ID>
    </Order>
</Orders>

properties-12345.xml
<Properties>
    <Prop>
        <Key>A</Key>
        <Value>Apple</Value>
        <ID>12345</ID>
    </Prop>
    <Prop>
        <Key>B</Key>
        <Value>Ball</Value>
        <ID>12345</ID>
    </Prop>
</Properties>
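In the files above the <ID> ends up as the last child of each <Order> and <Prop>, whereas the example in the question has it first. If the position matters, one option is to use insert(0, ...) instead of append(). The following is only a sketch under the same assumptions (lxml installed, element names as in the question), shown for orders and building the result in memory rather than writing a file:

```python
from copy import copy
from lxml import etree

# A cut-down version of the document from the question.
xml = (b"<Main1><Sub1><ID>12345</ID><Sub3><Order>"
       b"<OID>54321</OID><ODate>2016-01-01</ODate>"
       b"</Order></Sub3></Sub1></Main1>")
root = etree.fromstring(xml)

customer = root.xpath('/Main1/Sub1')[0]
id_ = customer.xpath('./ID')[0]

orders_root = etree.Element('Orders')
for order in customer.xpath('./Sub3/Order'):
    # insert(0, ...) places the copied ID before the existing children.
    order.insert(0, copy(id_))
    # Appending moves the order out of the original tree.
    orders_root.append(order)

print(etree.tostring(orders_root, pretty_print=True).decode())
```

The same change would apply to the properties. Note that the orders are moved rather than copied, so the parsed input tree no longer contains them afterwards.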
For details on XPath syntax, see the XPath Syntax page in the W3Schools XPath Tutorial.
To get started with XPath, it is also helpful to fiddle around with your document in one of the many XPath testers.