Python lxml: how to post-process an XPath result with another XPath
I have the following XML file:
<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Unified Streaming Platform(version=1.7.8) -->
<MPD
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:mpeg:dash:schema:mpd:2011"
xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
type="static"
mediaPresentationDuration="PT1H43M36.832S"
maxSegmentDuration="PT3S"
minBufferTime="PT10S"
profiles="urn:mpeg:dash:profile:isoff-live:2011,urn:com:dashif:dash264">
<Period>
<BaseURL>dash/</BaseURL>
<AdaptationSet group="1" contentType="audio" lang="tr" minBandwidth="157405" maxBandwidth="157405"
segmentAlignment="true" audioSamplingRate="48000" mimeType="audio/mp4" codecs="mp4a.40.2">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2">
</AudioChannelConfiguration>
<Representation id="audio_tur=157405" bandwidth="157405">
</Representation>
</AdaptationSet>
<AdaptationSet group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000"
minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true"
frameRate="25" mimeType="video/mp4" startWithSAP="1">
<Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E"
scanType="progressive">
</Representation>
<Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E"
scanType="progressive">
</Representation>
<Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3"
codecs="avc1.4D401F"
scanType="progressive">
</Representation>
<Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F"
scanType="progressive">
</Representation>
<Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F"
scanType="progressive">
</Representation>
<Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F"
scanType="progressive">
</Representation>
<Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3"
codecs="avc1.4D4028" scanType="progressive">
</Representation>
<Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028"
scanType="progressive">
</Representation>
</AdaptationSet>
<AdaptationSet
group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000"
minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33"
frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
<Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
</Representation>
</AdaptationSet>
</Period>
</MPD>
and I run the following Python code:
from lxml import etree

file = "Data.xml"
namespaces = {'ns': 'urn:mpeg:dash:schema:mpd:2011'}
tree = etree.parse(file)
root = tree.getroot()
for r in root.xpath('//ns:AdaptationSet[@contentType="video"]', namespaces=namespaces):
    print(etree.tostring(r, encoding="unicode"))
    for bandwidth in r.xpath('//ns:Representation/@bandwidth', namespaces=namespaces):
        print(bandwidth)
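My problem is that the second loop does not use the result of the previous XPath; it queries the complete tree again. That is why the result also includes the audio Representation. The output looks like this: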
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000" minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true" frameRate="25" mimeType="video/mp4" startWithSAP="1">
<Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E" scanType="progressive">
</Representation>
<Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E" scanType="progressive">
</Representation>
<Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3" codecs="avc1.4D401F" scanType="progressive">
</Representation>
<Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F" scanType="progressive">
</Representation>
<Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
</Representation>
<Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
</Representation>
<Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3" codecs="avc1.4D4028" scanType="progressive">
</Representation>
<Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028" scanType="progressive">
</Representation>
</AdaptationSet>
157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000" minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33" frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
<Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
</Representation>
</AdaptationSet>
157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000
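So even though the correct AdaptationSet is found, the complete tree is processed in both iterations. I know I could build a single XPath to get the bandwidths, but I need the AdaptationSet first and would like the second loop to work only on the result of the first. How can I do that?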
You have to add a . at the beginning of the XPath to make it relative to the current context node, which in this case is referenced by the variable r:
r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces)
This behavior is described in the XPath 1.0 documentation as follows:
//para
selects all the para descendants of the document root and thus selects all para elements in the same document as the context node
.//para
selects the para element descendants of the context node
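To see the difference side by side, here is a minimal sketch. It assumes a trimmed-down, made-up stand-in for the MPD file (the `XML` string below is not the original document), parsed with `etree.fromstring` instead of reading `Data.xml`:

```python
from lxml import etree

# Hypothetical, shortened stand-in for the MPD file from the question.
XML = b"""<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation bandwidth="157405"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="501000"/>
      <Representation bandwidth="851000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

namespaces = {'ns': 'urn:mpeg:dash:schema:mpd:2011'}
root = etree.fromstring(XML)
video = root.xpath('//ns:AdaptationSet[@contentType="video"]',
                   namespaces=namespaces)[0]

# '//' is anchored at the document root, so the audio bandwidth shows up too.
absolute = video.xpath('//ns:Representation/@bandwidth', namespaces=namespaces)

# './/' is anchored at the context node, here the video AdaptationSet.
relative = video.xpath('.//ns:Representation/@bandwidth', namespaces=namespaces)

print(absolute)
print(relative)
```

With this sample, `absolute` holds three values (including the audio one), while `relative` holds only the two video bandwidths.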
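Try using a relative XPath:
for bandwidth in r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces):
The leading . makes the XPath start from the current element. Without the ., as you have seen, the XPath starts querying from the root node.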
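Putting it together, the corrected nested loop could look roughly like the sketch below. The helper name `bandwidths_per_video_set` and the inline `SAMPLE` document are illustrative assumptions, not part of the original code; with the real file you would pass `etree.parse("Data.xml").getroot()` instead.

```python
from lxml import etree

NAMESPACES = {'ns': 'urn:mpeg:dash:schema:mpd:2011'}

# Hypothetical, shortened stand-in for the MPD file from the question.
SAMPLE = b"""<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation bandwidth="157405"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="501000"/>
      <Representation bandwidth="851000"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="1901000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def bandwidths_per_video_set(root):
    """Return one list of integer bandwidths per video AdaptationSet."""
    result = []
    for adaptation_set in root.xpath(
            '//ns:AdaptationSet[@contentType="video"]', namespaces=NAMESPACES):
        # The leading '.' keeps the query inside the current AdaptationSet.
        values = adaptation_set.xpath(
            './/ns:Representation/@bandwidth', namespaces=NAMESPACES)
        result.append([int(v) for v in values])
    return result


grouped = bandwidths_per_video_set(etree.fromstring(SAMPLE))
print(grouped)
```

Each inner list now contains only the bandwidths of its own AdaptationSet, so the audio value no longer leaks into the video results.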