How to export .xml element attributes to another existing .xml?

I have 2 XML files in each movie directory, one called mymovies.xml and the other called moviename.nfo (both are XML files).

What I want to do is "extract" the child attributes Language, Type and Channels:

<AudioTracks>
    <AudioTrack Language="German" Type="DTS-HD Master" Channels="7.1" />
    <AudioTrack Language="German" Type="Dolby Digital" Channels="2.0" />
    <AudioTrack Language="English" Type="DTS-HD Master" Channels="7.1" />
</AudioTracks>

and "import" them into moviename.nfo in this format:

<fileinfo>
    <streamdetails>
        <audio>
            <codec>dtshdmaster</codec>
            <language>ger</language>
            <channels>8</channels>
        </audio>
        <audio>
            <codec>dolbydigital</codec>
            <language>ger</language>
            <channels>2</channels>
        </audio>
        <audio>
            <codec>dtshdmaster</codec>
            <language>eng</language>
            <channels>8</channels>
        </audio>
    </streamdetails>
</fileinfo>

Example moviename.nfo:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
  <title>Barry Lyndon</title>
  <originaltitle>Barry Lyndon</originaltitle>
  <sorttitle>Barry Lyndon</sorttitle>
  <set>
  </set>
  <rating>8</rating>
  <year>1975</year>
  <top250>
  </top250>
  <votes>
  </votes> 
  <tagline>
  </tagline>
  <runtime>185</runtime>
  <thumb>
  </thumb>
  <mpaa>Rated PG-13</mpaa>
  <playcount>0</playcount>
  <watched>false</watched>
  <id>tt0072684</id>
  <filenameandpath>
  </filenameandpath>
  <country>Germany</country>
  <trailer>
  </trailer>
  <certification>Germany:FSK ab 12 freigegeben</certification>
  <genre>War</genre>
  <genre>Drama</genre>
  <genre>Romance</genre>
  <studio>Peregrine</studio>
  <credits>Stanley Kubrick, William Makepeace Thackeray</credits>
  <director>Stanley Kubrick</director>
  <createdby>My Movies</createdby>
</movie>

Expected output:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
  <title>Barry Lyndon</title>
  <originaltitle>Barry Lyndon</originaltitle>
  <sorttitle>Barry Lyndon</sorttitle>
  <set>
  </set>
  <rating>8</rating>
  <year>1975</year>
  <top250>
  </top250>
  <votes>
  </votes>
  <tagline>
  </tagline>
  <runtime>185</runtime>
  <thumb>
  </thumb>
  <mpaa>Rated PG-13</mpaa>
  <playcount>0</playcount>
  <watched>false</watched>
  <id>tt0072684</id>
  <filenameandpath>
  </filenameandpath>
  <country>Germany</country>
  <trailer>
  </trailer>
  <fileinfo>
    <streamdetails>
      <audio>
        <codec>dtshdmaster</codec>
        <language>ger</language>
        <channels>8</channels>
      </audio>
      <audio>
        <codec>dolbydigital</codec>
        <language>ger</language>
        <channels>2</channels>
      </audio>
      <audio>
        <codec>dtshdmaster</codec>
        <language>eng</language>
        <channels>8</channels>
      </audio>
    </streamdetails>
  </fileinfo>
  <certification>Germany:FSK ab 12 freigegeben</certification>
  <genre>War</genre>
  <genre>Drama</genre>
  <genre>Romance</genre>
  <studio>Peregrine</studio>
  <credits>Stanley Kubrick, William Makepeace Thackeray</credits>
  <director>Stanley Kubrick</director>
  <createdby>My Movies</createdby>
</movie>

So far I have:

import xml.etree.ElementTree as ET

root_node = ET.parse('mymovies.xml').getroot()

# print the three attributes of every audio track
for tag in root_node.findall('AudioTracks/AudioTrack'):
    print(tag.attrib['Language'])
    print(tag.attrib['Type'])
    print(tag.attrib['Channels'])

The output is:

English
DTS-HD Master
5.1
English
Dolby Digital
2.0
French
Dolby Digital
5.1
Spanish
Dolby Digital
5.1
Portuguese
Dolby Digital
5.1

What I would like to know now is:

See if this works for you. I made some assumptions about how to translate the values (language, codec, channels).

import xml.etree.ElementTree as ET
import re


# https://web.archive.org/web/20120301034645/http://effbot.org/zone/element-lib.htm#prettyprint
# in-place prettyprint formatter
def indent(elem, level=0):
    i = "\n" + level * "  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for elem in elem:
            indent(elem, level + 1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


pattern = re.compile(r'[^A-Za-z]')

source = '''\
<AudioTracks>
    <AudioTrack Language="German" Type="DTS-HD Master" Channels="7.1" />
    <AudioTrack Language="German" Type="Dolby Digital" Channels="2.0" />
    <AudioTrack Language="English" Type="DTS-HD Master" Channels="7.1" />
</AudioTracks>
'''

template = '''\
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
  <title>Barry Lyndon</title>
  <originaltitle>Barry Lyndon</originaltitle>
  <sorttitle>Barry Lyndon</sorttitle>
  <set>
  </set>
  <rating>8</rating>
  <year>1975</year>
  <top250>
  </top250>
  <votes>
  </votes> 
  <tagline>
  </tagline>
  <runtime>185</runtime>
  <thumb>
  </thumb>
  <mpaa>Rated PG-13</mpaa>
  <playcount>0</playcount>
  <watched>false</watched>
  <id>tt0072684</id>
  <filenameandpath>
  </filenameandpath>
  <country>Germany</country>
  <trailer>
  </trailer>
  <certification>Germany:FSK ab 12 freigegeben</certification>
  <genre>War</genre>
  <genre>Drama</genre>
  <genre>Romance</genre>
  <studio>Peregrine</studio>
  <credits>Stanley Kubrick, William Makepeace Thackeray</credits>
  <director>Stanley Kubrick</director>
  <createdby>My Movies</createdby>
</movie>
'''

fileinfo = ET.Element('fileinfo')
e = ET.Element('streamdetails')
fileinfo.append(e)  # wrap in fileinfo

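st = ET.fromstring(source)
for at in st.findall('./AudioTrack'):
    # "DTS-HD Master" -> "dtshdmaster": drop non-letters, then lowercase
    codec = pattern.sub('', at.attrib['Type']).lower()
    # "7.1" -> "8": add the main channels and the LFE channel
    channels = str(sum(map(int, at.attrib['Channels'].split('.'))))
    # "German" -> "ger": assumes the first three letters give the wanted code
    language = at.attrib['Language'][:3].lower()
    # the .nfo format expects child elements, not attributes on <audio>
    audio = ET.SubElement(e, 'audio')
    ET.SubElement(audio, 'codec').text = codec
    ET.SubElement(audio, 'language').text = language
    ET.SubElement(audio, 'channels').text = channels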

out = ET.fromstring(template)
for i, c in enumerate(out):
    if c.tag == 'certification':
        out.insert(i, fileinfo)
        break

indent(out)
print(ET.tostring(out).decode('utf8'))
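
The script above works on inline strings. For the real files, something along these lines might do; it is only a sketch under a few assumptions. The function names `build_fileinfo` and `merge_audio_info` and the `LANGUAGE_CODES` table are made up for illustration, the table only covers the languages that appear in the question, and the `AudioTrack` elements are assumed to sit somewhere below the root of mymovies.xml.

```python
import re
import xml.etree.ElementTree as ET

# Assumed mapping from language names to three-letter codes (hypothetical
# table, extend as needed). Slicing the name alone would give e.g. "jap"
# for Japanese instead of "jpn".
LANGUAGE_CODES = {
    "German": "ger",
    "English": "eng",
    "French": "fre",
    "Spanish": "spa",
    "Portuguese": "por",
}


def build_fileinfo(audio_tracks):
    """Build a <fileinfo> element from an iterable of <AudioTrack> elements."""
    fileinfo = ET.Element("fileinfo")
    details = ET.SubElement(fileinfo, "streamdetails")
    for track in audio_tracks:
        audio = ET.SubElement(details, "audio")
        # "DTS-HD Master" -> "dtshdmaster"
        codec = re.sub(r"[^A-Za-z]", "", track.get("Type", "")).lower()
        ET.SubElement(audio, "codec").text = codec
        # Fall back to the first three letters for names not in the table.
        name = track.get("Language", "")
        code = LANGUAGE_CODES.get(name, name[:3].lower())
        ET.SubElement(audio, "language").text = code
        # "7.1" -> "8"
        parts = track.get("Channels", "0").split(".")
        ET.SubElement(audio, "channels").text = str(sum(int(p) for p in parts))
    return fileinfo


def merge_audio_info(mymovies_path, nfo_path):
    """Insert the audio details from mymovies.xml into the .nfo file, in place."""
    tracks = ET.parse(mymovies_path).getroot().findall(".//AudioTrack")
    tree = ET.parse(nfo_path)
    movie = tree.getroot()
    # Remove an existing <fileinfo> so that re-running does not duplicate it.
    for old in movie.findall("fileinfo"):
        movie.remove(old)
    # Insert before <certification> if present, otherwise append at the end.
    children = list(movie)
    index = next(
        (i for i, child in enumerate(children) if child.tag == "certification"),
        len(children),
    )
    movie.insert(index, build_fileinfo(tracks))
    if hasattr(ET, "indent"):  # ET.indent is available from Python 3.9
        ET.indent(tree, space="  ")
    tree.write(nfo_path, encoding="UTF-8", xml_declaration=True)
```

It would be called once per movie directory, e.g. `merge_audio_info('mymovies.xml', 'moviename.nfo')`. Note that ElementTree writes its own XML declaration, so the `standalone="yes"` part of the original header is not preserved; try it on a copy of the files first.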