Copy one XML to another in Python using lxml
I have the following code in Python:
from lxml import etree

offers = etree.parse(r'prices.xml')
print("offers\n")
target = offers.xpath('//offer[./vendor/text()="Qtap"]')
length = len(target)
for i in range(length):
    print(target[i])
    etree.ElementTree(target[i]).write('output.xml', encoding='utf-8', xml_declaration=True)
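I simply read the XML file, use XPath to select data from it, and want to write all of that data to another file. The length shows about 2000 elements, but the script writes only the last one. Sorry, I know my question is silly, but this is my first program in Python.
Although the accepted answer does work, I am adding a simple example of the XML file.
As input I get something like this: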
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<yml_catalog date="2022-02-16 18:16">
<shop>
<purchase_currencies>
<currency id="USD" rate="28.3"/>
<currency id="EUR" rate="32.18"/>
<currency id="UAH" rate="1"/>
</purchase_currencies>
<categories></categories>
<offers>
<offer id="SD00025386">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/9/19289.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/9/19289/19289_1.jpg</picture>
<name>Трап ANI Plast TA1612 горизонтальний з нержавіючою решіткою 150x150</name>
<available>true</available>
<oldCode>19289</oldCode>
<model>TA1612</model>
<purchase_price>405.158</purchase_price>
<currency>UAH</currency>
<retail_price>691</retail_price>
<retail_oldprice></retail_oldprice>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="1"></outlet>
<outlet id="86" name="Київ" instock="3"></outlet>
<outlet id="87" name="Україна" instock="16"></outlet>
</outlets>
<vendor>Ани Пласт</vendor>
<vendorCode>SD00025386</vendorCode>
</offer>
<offer id="SD00025387">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/9/19290.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/9/19290/19290_1.jpg</picture>
<name>Трап ANI Plast TA1712 вертикальний з нержавіючою решіткою 150х150</name>
<available>true</available>
<oldCode>19290</oldCode>
<model>TA1712</model>
<purchase_price>354.843</purchase_price>
<currency>UAH</currency>
<retail_price>605</retail_price>
<retail_oldprice></retail_oldprice>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="20"></outlet>
<outlet id="86" name="Київ" instock="3"></outlet>
<outlet id="87" name="Україна" instock="20"></outlet>
</outlets>
<vendor>Ани Пласт</vendor>
<vendorCode>SD00025387</vendorCode>
</offer>
<offer id="SD00022605">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_1.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_2.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_3.jpg</picture>
<name>Донний клапан для раковини Qtap L02 з переливом</name>
<available>true</available>
<oldCode>16508</oldCode>
<model>QTL02</model>
<purchase_price>4.635</purchase_price>
<currency>EUR</currency>
<retail_price>261</retail_price>
<retail_oldprice/>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="0"/>
<outlet id="86" name="Київ" instock="0"/>
<outlet id="87" name="Україна" instock="3"/>
</outlets>
<vendor>Qtap</vendor>
<vendorCode>SD00022605</vendorCode>
</offer>
<offer id="SD00022606">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16509.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16509/16509_1.jpg</picture>
<name>Донний клапан для раковини Qtap L01 з переливом</name>
<available>true</available>
<oldCode>16509</oldCode>
<model>QTL01</model>
<purchase_price>1.236</purchase_price>
<currency>EUR</currency>
<retail_price>70</retail_price>
<retail_oldprice/>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="0"/>
<outlet id="86" name="Київ" instock="0"/>
<outlet id="87" name="Україна" instock="1"/>
</outlets>
<vendor>Qtap</vendor>
<vendorCode>SD00022606</vendorCode>
</offer>
My task is to get all offers from the vendor Qtap. So the output would be:
<offer id="SD00022605">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_1.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_2.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16508/16508_3.jpg</picture>
<name>Донний клапан для раковини Qtap L02 з переливом</name>
<available>true</available>
<oldCode>16508</oldCode>
<model>QTL02</model>
<purchase_price>4.635</purchase_price>
<currency>EUR</currency>
<retail_price>261</retail_price>
<retail_oldprice/>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="0"/>
<outlet id="86" name="Київ" instock="0"/>
<outlet id="87" name="Україна" instock="3"/>
</outlets>
<vendor>Qtap</vendor>
<vendorCode>SD00022605</vendorCode>
</offer>
<offer id="SD00022606">
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16509.jpg</picture>
<picture>https://isw.b2b-sandi.com.ua/imagecache/full/1/6/16509/16509_1.jpg</picture>
<name>Донний клапан для раковини Qtap L01 з переливом</name>
<available>true</available>
<oldCode>16509</oldCode>
<model>QTL01</model>
<purchase_price>1.236</purchase_price>
<currency>EUR</currency>
<retail_price>70</retail_price>
<retail_oldprice/>
<retail_currency>UAH</retail_currency>
<outlets>
<outlet id="85" name="Харків" instock="0"/>
<outlet id="86" name="Київ" instock="0"/>
<outlet id="87" name="Україна" instock="1"/>
</outlets>
<vendor>Qtap</vendor>
<vendorCode>SD00022606</vendorCode>
</offer>
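Yes, the problem is the loop, which overwrites the file on every iteration.
That's because the write() method inside the loop overwrites the previously written element each time it runs. Try it like this: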
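qtaps = etree.Element('offers')  # new root element to hold the matches
targets = offers.xpath('//offer[./vendor/text()="Qtap"]')
for target in targets:
    # append() adds each match as the last child, preserving document order
    qtaps.append(target)
# write once, after the loop, so nothing gets overwritten
with open('output.xml', 'wb') as doc:
    doc.write(etree.tostring(qtaps, pretty_print=True, encoding='utf-8', xml_declaration=True))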
See if it works.
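The collect-then-write-once idea can be tried without any files. The following is only a minimal sketch: the three-offer sample document and the helper name collect_offers are made up for illustration and are not from the question. It copies each match with copy.deepcopy, because in lxml appending an element that already belongs to another tree moves it out of that tree.

```python
import copy
from lxml import etree

# Hypothetical, shortened stand-in for prices.xml
SAMPLE = b"""<yml_catalog>
  <shop>
    <offers>
      <offer id="A1"><vendor>Other</vendor></offer>
      <offer id="B2"><vendor>Qtap</vendor></offer>
      <offer id="C3"><vendor>Qtap</vendor></offer>
    </offers>
  </shop>
</yml_catalog>"""


def collect_offers(source_root, vendor):
    """Return a new <offers> element holding copies of the matching offers."""
    collected = etree.Element("offers")
    # The vendor is passed as an XPath variable instead of being pasted into the string.
    for offer in source_root.xpath("//offer[vendor = $v]", v=vendor):
        # deepcopy leaves the source tree intact; append() alone would move the node.
        collected.append(copy.deepcopy(offer))
    return collected


source_root = etree.fromstring(SAMPLE)
qtaps = collect_offers(source_root, "Qtap")

# Serialize once, after every match has been collected.
xml_bytes = etree.tostring(qtaps, pretty_print=True, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

To save the result, open the target file in 'wb' mode and write xml_bytes a single time, outside any loop.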
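Since it sounds like you only need to remove <offer> nodes from the XML, consider XSLT, the sibling of XPath and a special-purpose language designed to transform XML files. Python's lxml library can run XSLT 1.0 scripts.
Specifically, the identity template combined with an empty template can remove the targeted nodes (vendor!='Qtap') without a single for loop. The approach below keeps the original structure of the XML, just with fewer <offer> nodes.
XSLT (save as an .xsl file, which is a special kind of XML file)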
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- IDENTITY TRANSFORM -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- EMPTY TEMPLATE TO REMOVE CONTENT -->
    <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>
Python
import lxml.etree as lx
# PARSE XML AND XSLT
doc = lx.parse("input.xml")
style = lx.parse("style.xsl")
# CONFIGURE AND RUN TRANSFORMER
transformer = lx.XSLT(style)
result = transformer(doc)
# OUTPUT TO FILE
result.write_output("output.xml")
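For a quick check without creating input.xml and style.xsl, the same transform can be run on in-memory strings. This is a sketch under assumptions: the three-offer sample document is invented for illustration, and the stylesheet is the one above, embedded as bytes.

```python
from lxml import etree

# Hypothetical, shortened stand-in for input.xml
SAMPLE = b"""<yml_catalog>
  <shop>
    <offers>
      <offer id="A1"><vendor>Other</vendor></offer>
      <offer id="B2"><vendor>Qtap</vendor></offer>
      <offer id="C3"><vendor>Qtap</vendor></offer>
    </offers>
  </shop>
</yml_catalog>"""

# Identity template plus an empty template for the offers to drop
XSLT_SOURCE = b"""<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>"""

transform = etree.XSLT(etree.fromstring(XSLT_SOURCE))
result = transform(etree.fromstring(SAMPLE))

# Only the Qtap offers survive; the surrounding structure is unchanged.
remaining = result.xpath("//offer/@id")
print(remaining)
print(str(result))
```

One detail worth knowing: vendor!='Qtap' only matches offers that actually contain a <vendor> child, so an offer with no vendor element at all would be kept. If such offers should be dropped too, the pattern offer[not(vendor='Qtap')] is a possible alternative.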