Extracting etree data from XML with an odd tree structure

Here is part of the XML data, before I go on:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xmeml>
<xmeml version="5">
<sequence id="episode1">

    <media>
        <video> 
            <track>

                <generatoritem id="Gen Subtitle1">

                    <effect>
                        <name>Gen Subtitle</name>
                        <effectid>Gen Subtitle</effectid>
                        <effectcategory>Text</effectcategory>
                        <effecttype>generator</effecttype>
                        <mediatype>video</mediatype>
                        <parameter>
                            <parameterid>part1</parameterid>
                            <name>Text Settings</name>
                            <value/>
                        </parameter>
                        <parameter>
                            <parameterid>str</parameterid>
                            <name>Text</name>
                            <value>You're a coward for picking on people&#13;who are weaker than you.</value>
                        </parameter>
                        <parameter>
                            <parameterid>font</parameterid>
                            <name>Font</name>
                            <value>Arial</value>
                        </parameter>

                    </effect>

                </generatoritem>

            </track>
        </video>
    </media>
</sequence>
</xmeml>

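As you can see, the tree starts at <effect> and contains multiple <parameter> elements, but I only want to extract the <value> from the <parameter> that also contains

<parameterid>str</parameterid>
<name>Text</name>

so that I get the output "That child is so cute. And he's smart."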

Here is my code:

lst = tree.findall('xmeml/sequence/media/video/track/generatoritem/effect/parameter/value')
counts = tree.findall('.//value')

for each in counts:
    print(each.text)

And this is what I get:

And he's smart.
Arial

See below:

import xml.etree.ElementTree as ET

xml = '''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xmeml>
<xmeml version="5">
<sequence id="episode1">


                    <effect>
                        <name>Gen Subtitle</name>
                        <effectid>Gen Subtitle</effectid>
                        <effectcategory>Text</effectcategory>
                        <effecttype>generator</effecttype>
                        <mediatype>video</mediatype>
                        <parameter>
                            <parameterid>part1</parameterid>
                            <name>Text Settings</name>
                            <value/>
                        </parameter>
                        <parameter>
                            <parameterid>str</parameterid>
                            <name>Text</name>
                            <value>That child is so cute. And he's smart</value>
                        </parameter>
                        <parameter>
                            <parameterid>font</parameterid>
                            <name>Font</name>
                            <value>Arial</value>
                        </parameter>

                    </effect>
</sequence>
</xmeml>'''

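root = ET.fromstring(xml)

# Select every <parameter> that has a <parameterid> child with the text "str".
str_params = root.findall('.//parameter[parameterid="str"]')
for param in str_params:
    # Keep only the parameter whose <name> is "Text", then stop at the first hit.
    if param.find('./name').text == 'Text':
        print('The text: {}'.format(param.find('./value').text))
        break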

Output:

The text: That child is so cute. And he's smart
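
If the real file holds several generator items, the same filter can probably be written as one path with two predicates, so that every matching <value> is collected rather than only the first. The following is a minimal sketch, not a drop-in solution: the SAMPLE document, its "First line" / "Second line" values, and the helper name subtitle_texts are all made up for illustration, and it assumes each subtitle parameter carries both <parameterid>str</parameterid> and <name>Text</name>.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two generator items, trimmed to the relevant tags.
SAMPLE = """<xmeml version="5">
<sequence id="episode1">
  <media><video><track>
    <generatoritem id="Gen Subtitle1">
      <effect>
        <parameter><parameterid>str</parameterid><name>Text</name><value>First line</value></parameter>
        <parameter><parameterid>font</parameterid><name>Font</name><value>Arial</value></parameter>
      </effect>
    </generatoritem>
    <generatoritem id="Gen Subtitle2">
      <effect>
        <parameter><parameterid>part1</parameterid><name>Text Settings</name><value/></parameter>
        <parameter><parameterid>str</parameterid><name>Text</name><value>Second line</value></parameter>
      </effect>
    </generatoritem>
  </track></video></media>
</sequence>
</xmeml>"""


def subtitle_texts(root):
    """Return the <value> text of each parameter with id 'str' and name 'Text'."""
    # Two chained predicates: both child elements must match on the same <parameter>.
    path = ".//parameter[parameterid='str'][name='Text']/value"
    # An empty <value/> has text None, so fall back to an empty string.
    return [value.text or "" for value in root.findall(path)]


root = ET.fromstring(SAMPLE)
texts = subtitle_texts(root)
for text in texts:
    print(text)
```

This also suggests why the code in the question printed "Arial": './/value' matches every <value> in the document, whatever its sibling tags are. The longer path in the question most likely returns nothing because paths are evaluated relative to the root element, which is already <xmeml>.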