Find corresponding attribute in XML with Python

I have this XML:

<?xml version="1.0" encoding="utf-8" ?> 
<ArrayOfEMObject2 xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
    <EMObject2>
        <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid> 
        <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>WAV</streamtype> 
        <prefusage>BROWSE</prefusage> 
    </EMObject2>
    <EMObject2>
        <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid> 
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>MP3</streamtype> 
        <prefusage>AUX</prefusage> 
    </EMObject2>
    <EMObject2>
        <emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid> 
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>MP3</streamtype> 
        <prefusage>AUX</prefusage> 
    </EMObject2>

If the streamtype is MP3, I need the corresponding emguid and orgname.

I already have this:

from xml.etree import ElementTree
# ...
namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
for child in root.findall('.//{}streamtype'.format(namespace)):
    if child.text == 'MP3':

How should I proceed from here?

Try this.

from simplified_scrapy import SimplifiedDoc,utils

xml = '''
<?xml version="1.0" encoding="utf-8" ?> 
<ArrayOfEMObject2 xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
    <EMObject2>
        <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid> 
        <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>WAV</streamtype> 
        <prefusage>BROWSE</prefusage> 
    </EMObject2>
    <EMObject2>
        <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid> 
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>MP3</streamtype> 
        <prefusage>AUX</prefusage> 
    </EMObject2>
    <EMObject2>
        <emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid> 
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname> 
        <streamclass>AUDIO</streamclass> 
        <streamtype>MP3</streamtype> 
        <prefusage>AUX</prefusage> 
    </EMObject2>
'''
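doc = SimplifiedDoc(xml)
# Select the streamtype elements containing 'MP3', then step up to their EMObject2 parents
lst = doc.selects('streamtype').contains('MP3').parent
print([(l.emguid.text, l.orgname.text) for l in lst])

# Or iterate over every EMObject2 and test its streamtype
lst = doc.selects('EMObject2')
for l in lst:
    if l.streamtype.text == 'MP3':
        print(l.emguid.text, l.orgname.text)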

Result:

[('e866abef-7571-45a7-84be-85f2ffc35b31', '201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3'), ('f02ab3db-93c8-4cbf-82b8-5fb06704a4ea', '201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3')]
e866abef-7571-45a7-84be-85f2ffc35b31 201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3
f02ab3db-93c8-4cbf-82b8-5fb06704a4ea 201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3

Here I look for the EMObject2 elements and check their children.

namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
tags = {'{}{}'.format(namespace, tag): tag
        for tag in ('orgname', 'streamtype', 'emguid')}
for node in root.findall('.//{}EMObject2'.format(namespace)):
    match = dict()
    for child in node:
        if child.tag in tags:
            match[tags[child.tag]] = child.text
    try:
        if match['streamtype'] == 'MP3':
            print(match['orgname'], match['emguid'])
    except KeyError:
        pass

(I had to fix your XML by adding the closing tag to get it to run.)
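The snippets in this thread assume that root already exists. As a rough, self-contained sketch of the same idea, the following parses a trimmed copy of the sample (closing tag added, XML declaration and some child elements dropped for brevity) and collects the matches into a list. The helper name find_by_streamtype and the constants XML and NAMESPACE are made up for illustration and are not part of the original posts.

```python
from xml.etree import ElementTree

# Trimmed copy of the sample from the question, with the closing root tag added.
XML = """<ArrayOfEMObject2 xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
    <EMObject2>
        <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
        <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
        <streamtype>WAV</streamtype>
    </EMObject2>
    <EMObject2>
        <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
        <streamtype>MP3</streamtype>
    </EMObject2>
</ArrayOfEMObject2>"""

NAMESPACE = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'


def find_by_streamtype(root, wanted):
    """Return (emguid, orgname) pairs for EMObject2 nodes whose streamtype equals `wanted`.

    Hypothetical helper, written only for this illustration.
    """
    pairs = []
    for node in root.findall('.//{}EMObject2'.format(NAMESPACE)):
        # findtext returns None when the child is missing, so no try/except is needed
        if node.findtext('{}streamtype'.format(NAMESPACE)) == wanted:
            pairs.append((node.findtext('{}emguid'.format(NAMESPACE)),
                          node.findtext('{}orgname'.format(NAMESPACE))))
    return pairs


root = ElementTree.fromstring(XML)
print(find_by_streamtype(root, 'MP3'))
```

Using findtext instead of find(...).text means an EMObject2 without a streamtype child is simply skipped rather than raising an AttributeError.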

You can find the EMObject2 elements, check each one's streamtype tag, and then retrieve the other information like this:

from xml.etree import ElementTree
# ...
namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
for child in root.findall('.//{}EMObject2'.format(namespace)):
    if child.find('{}streamtype'.format(namespace)).text == 'MP3':
        print(child.find('{}emguid'.format(namespace)).text)
        print(child.find('{}orgname'.format(namespace)).text)
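A possible variation on the loop above, assuming the standard library's limited XPath support is acceptable: ElementTree accepts a [tag='text'] predicate, so the MP3 filter can be pushed into the findall path, and a prefix-to-URI mapping avoids repeating the namespace in braces. The prefix em and the names XML, NS and matches are arbitrary choices for this sketch, and the sample is again a trimmed copy of the question's XML with the closing tag added.

```python
from xml.etree import ElementTree

# Trimmed copy of the sample from the question, with the closing root tag added.
XML = """<ArrayOfEMObject2 xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
    <EMObject2>
        <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
        <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
        <streamtype>WAV</streamtype>
    </EMObject2>
    <EMObject2>
        <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
        <streamtype>MP3</streamtype>
    </EMObject2>
    <EMObject2>
        <emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid>
        <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
        <streamtype>MP3</streamtype>
    </EMObject2>
</ArrayOfEMObject2>"""

# The prefix 'em' is arbitrary; only the URI has to match the document's default namespace.
NS = {'em': 'http://www.blue-order.com/ma/essencemanagerws/EssenceManager'}

root = ElementTree.fromstring(XML)

# Keep only the EMObject2 elements that have a streamtype child whose text is 'MP3'.
matches = root.findall(".//em:EMObject2[em:streamtype='MP3']", NS)

for node in matches:
    print(node.findtext('em:emguid', namespaces=NS),
          node.findtext('em:orgname', namespaces=NS))
```

The predicate compares the child's complete text, so it would not match a value with surrounding whitespace such as ' MP3 '.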