Passing XML through XSLT sheet results in duplicated nodes
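I'm using Python to edit a whole XML document. The document is generated every month and then edited down by hand, and I'm trying to automate at least part of that process. The first part works just fine. However, when I run the document through an XSLT stylesheet (change()) to do the final reordering, I get the reordered elements followed by the original elements in their original order, and I have no idea why. At first I thought it was because I keep rewriting the same file over and over, but the duplicates don't appear until after change() runs. So I figure it has something to do with how I'm using XSLT, but I'm a real beginner at this. If you can help me out, I'd appreciate it.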
from __future__ import print_function
from lxml import etree
import xml.etree.ElementTree as et

def adultSmash():
    def adultGrab():  # grab all adult events
        src_tree = et.parse('quartertwo.xml')
        src_root = src_tree.getroot()
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in src_root.findall('event'):
            agerange = event.find('AgeRanges')
            if agerange is None:
                continue
            ageranges = agerange.text
            if ageranges == 'Adult':
                dest_root.append(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def clean():  # drop the book-related event types
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_root.findall('event'):
            book = event.find('EventType')
            if book is None:  # skip events without an EventType
                continue
            if book.text in ('Book Groups', 'Book Sales', 'Bookmobile Stop'):
                dest_root.remove(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def cleanNodes():  # strip the internal Notes elements
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        foos = dest_tree.findall('event')
        for event in foos:
            bars = event.findall('Notes')
            for Notes in bars:
                event.remove(Notes)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def change():  # reorder each event's children with the XSLT stylesheet
        # XSLT support lives in lxml.etree, not in xml.etree.ElementTree
        dom = etree.parse('dest_tree.xml')
        xslt = etree.parse('change.xslt')
        transform = etree.XSLT(xslt)
        newdom = transform(dom)
        with open('dest_tree.xml', 'w') as log:
            print(str(newdom), file=log)

    adultGrab()
    clean()
    cleanNodes()
    change()
Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
</events>
Here is the XSLT I'm using to change it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" />
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="event">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="title" />
      <xsl:apply-templates select="RelatedLocations" />
      <xsl:apply-templates select="Date" />
      <xsl:apply-templates select="DateYear" />
      <xsl:apply-templates select="DateMonth" />
      <xsl:apply-templates select="DateDay" />
      <xsl:apply-templates select="Body" />
      <xsl:apply-templates select="AgeRanges" />
      <xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
And finally, this is the result:
<?xml version="1.0" encoding="UTF-8"?>
<events>
<event>
<title>Blah</title>
<RelatedLocations>Derp</RelatedLocations>
<Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
<DateYear>2016</DateYear>
<DateMonth>10</DateMonth>
<DateDay>01</DateDay>
<Body>Blah</Body>
<AgeRanges>Adult</AgeRanges>
<AgeRanges>Adult</AgeRanges>
<title></title>
<RelatedLocations>Blah</RelatedLocations>
<Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
<DateYear>2016</DateYear>
<DateMonth>10</DateMonth>
<DateDay>01</DateDay>
<Body>Blah</Body>
So any help would be greatly appreciated.
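You are getting duplicate nodes in the output because you are applying templates to the same nodes twice. For example, you do:
<xsl:apply-templates select="title" />
and then:
<xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
A title element is neither Location nor EventType, so the second instruction selects it and applies templates to it again. The same is true of every other child you already selected by name.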
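One possible fix, sketched below, is to make the catch-all select also exclude every element that was already selected by name, so each child is processed exactly once. This is only an illustration under some assumptions: the stylesheet is inlined as a string rather than read from change.xslt, the names FIXED_XSLT, SAMPLE_XML and reorder are hypothetical, and the sample event has an AgeRanges element added (the sample XML in the question does not show one).

```python
from lxml import etree

# Hypothetical corrected stylesheet: the final catch-all excludes the
# elements already emitted explicitly, plus Location and EventType.
FIXED_XSLT = b"""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="event">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="title"/>
      <xsl:apply-templates select="RelatedLocations"/>
      <xsl:apply-templates select="Date"/>
      <xsl:apply-templates select="DateYear"/>
      <xsl:apply-templates select="DateMonth"/>
      <xsl:apply-templates select="DateDay"/>
      <xsl:apply-templates select="Body"/>
      <xsl:apply-templates select="AgeRanges"/>
      <xsl:apply-templates select="*[not(self::title or self::RelatedLocations
          or self::Date or self::DateYear or self::DateMonth or self::DateDay
          or self::Body or self::AgeRanges
          or self::Location or self::EventType)]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""

# Sample input modeled on the question; AgeRanges is added as an assumption.
SAMPLE_XML = b"""\
<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <AgeRanges>Adult</AgeRanges>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
</events>
"""

def reorder(xml_bytes):
    """Apply the corrected stylesheet and return the transformed root element."""
    transform = etree.XSLT(etree.XML(FIXED_XSLT))
    result = transform(etree.XML(xml_bytes))
    return result.getroot()

root = reorder(SAMPLE_XML)
# Collect the child element names of the first event, in output order.
child_tags = [child.tag for child in root.find('event')]
print(child_tags)
```

With this stylesheet each named child is emitted once in the requested order, EventType and Location are dropped, and anything not listed (here, Notes) still falls through to the catch-all at the end.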