How to export .xml element attributes to another existing .xml?
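I have 2 XML files in each movie directory: one is called mymovies.xml and the other moviename.nfo (both are XML files).
What I want to do is "extract" the attributes Language, Type and Channels of each child element: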
<AudioTracks>
<AudioTrack Language="German" Type="DTS-HD Master" Channels="7.1" />
<AudioTrack Language="German" Type="Dolby Digital" Channels="2.0" />
<AudioTrack Language="English" Type="DTS-HD Master" Channels="7.1" />
</AudioTracks>
and "import" them into moviename.nfo in this format:
<fileinfo>
<streamdetails>
<audio>
<codec>dtshdmaster</codec>
<language>ger</language>
<channels>8</channels>
</audio>
<audio>
<codec>dolbydigital</codec>
<language>ger</language>
<channels>2</channels>
</audio>
<audio>
<codec>dtshdmaster</codec>
<language>eng</language>
<channels>8</channels>
</audio>
</streamdetails>
</fileinfo>
Example moviename.nfo:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
<title>Barry Lyndon</title>
<originaltitle>Barry Lyndon</originaltitle>
<sorttitle>Barry Lyndon</sorttitle>
<set>
</set>
<rating>8</rating>
<year>1975</year>
<top250>
</top250>
<votes>
</votes>
<tagline>
</tagline>
<runtime>185</runtime>
<thumb>
</thumb>
<mpaa>Rated PG-13</mpaa>
<playcount>0</playcount>
<watched>false</watched>
<id>tt0072684</id>
<filenameandpath>
</filenameandpath>
<country>Germany</country>
<trailer>
</trailer>
<certification>Germany:FSK ab 12 freigegeben</certification>
<genre>War</genre>
<genre>Drama</genre>
<genre>Romance</genre>
<studio>Peregrine</studio>
<credits>Stanley Kubrick, William Makepeace Thackeray</credits>
<director>Stanley Kubrick</director>
<createdby>My Movies</createdby>
</movie>
Expected output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
<title>Barry Lyndon</title>
<originaltitle>Barry Lyndon</originaltitle>
<sorttitle>Barry Lyndon</sorttitle>
<set>
</set>
<rating>8</rating>
<year>1975</year>
<top250>
</top250>
<votes>
</votes>
<tagline>
</tagline>
<runtime>185</runtime>
<thumb>
</thumb>
<mpaa>Rated PG-13</mpaa>
<playcount>0</playcount>
<watched>false</watched>
<id>tt0072684</id>
<filenameandpath>
</filenameandpath>
<country>Germany</country>
<trailer>
</trailer>
<fileinfo>
<streamdetails>
<audio>
<codec>dtshdmaster</codec>
<language>ger</language>
<channels>8</channels>
</audio>
<audio>
<codec>dolbydigital</codec>
<language>ger</language>
<channels>2</channels>
</audio>
<audio>
<codec>dtshdmaster</codec>
<language>eng</language>
<channels>8</channels>
</audio>
</streamdetails>
</fileinfo>
<certification>Germany:FSK ab 12 freigegeben</certification>
<genre>War</genre>
<genre>Drama</genre>
<genre>Romance</genre>
<studio>Peregrine</studio>
<credits>Stanley Kubrick, William Makepeace Thackeray</credits>
<director>Stanley Kubrick</director>
<createdby>My Movies</createdby>
</movie>
So far I have:
import xml.etree.ElementTree as ET

root_node = ET.parse('mymovies.xml').getroot()
for tag in root_node.findall('AudioTracks/AudioTrack'):
    value = tag.attrib['Language']
    print(value)
    value = tag.attrib['Type']
    print(value)
    value = tag.attrib['Channels']
    print(value)
The output is:
English
DTS-HD Master
5.1
English
Dolby Digital
2.0
French
Dolby Digital
5.1
Spanish
Dolby Digital
5.1
Portuguese
Dolby Digital
5.1
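What I would like to know now is:
- How do I work with 2 ElementTrees at once?
- How do I write specific parsed information to another file?
- How do I get the attributes into exactly the level and form I need?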
See if you can use this. I made some assumptions about how to translate the values (language, codec, channels).
import xml.etree.ElementTree as ET
import re


# https://web.archive.org/web/20120301034645/http://effbot.org/zone/element-lib.htm#prettyprint
# in-place prettyprint formatter
def indent(elem, level=0):
    i = "\n" + level * "  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            indent(child, level + 1)
        # the last child's tail closes the parent, so it uses the parent's indent
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


# removes everything except ASCII letters, e.g. "DTS-HD Master" -> "DTSHDMaster"
pattern = re.compile(r'[^A-Za-z]')
source = '''\
<AudioTracks>
<AudioTrack Language="German" Type="DTS-HD Master" Channels="7.1" />
<AudioTrack Language="German" Type="Dolby Digital" Channels="2.0" />
<AudioTrack Language="English" Type="DTS-HD Master" Channels="7.1" />
</AudioTracks>
'''
template = '''\
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
<title>Barry Lyndon</title>
<originaltitle>Barry Lyndon</originaltitle>
<sorttitle>Barry Lyndon</sorttitle>
<set>
</set>
<rating>8</rating>
<year>1975</year>
<top250>
</top250>
<votes>
</votes>
<tagline>
</tagline>
<runtime>185</runtime>
<thumb>
</thumb>
<mpaa>Rated PG-13</mpaa>
<playcount>0</playcount>
<watched>false</watched>
<id>tt0072684</id>
<filenameandpath>
</filenameandpath>
<country>Germany</country>
<trailer>
</trailer>
<certification>Germany:FSK ab 12 freigegeben</certification>
<genre>War</genre>
<genre>Drama</genre>
<genre>Romance</genre>
<studio>Peregrine</studio>
<credits>Stanley Kubrick, William Makepeace Thackeray</credits>
<director>Stanley Kubrick</director>
<createdby>My Movies</createdby>
</movie>
'''
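fileinfo = ET.Element('fileinfo')
e = ET.Element('streamdetails')
fileinfo.append(e)  # wrap in fileinfo
st = ET.fromstring(source)
for at in st.findall('./AudioTrack'):
    # "DTS-HD Master" -> "dtshdmaster"
    codec = pattern.sub('', at.attrib['Type']).lower()
    # "7.1" -> 7 + 1 -> "8"
    channels = str(sum(map(int, at.attrib['Channels'].split('.'))))
    # "German" -> "ger" (assumes the first three letters give the wanted code)
    language = at.attrib['Language'][:3].lower()
    # child elements rather than attributes, to match the expected output
    audio = ET.SubElement(e, 'audio')
    ET.SubElement(audio, 'codec').text = codec
    ET.SubElement(audio, 'language').text = language
    ET.SubElement(audio, 'channels').text = channels
out = ET.fromstring(template)
# insert <fileinfo> right before the first <certification> element
for i, c in enumerate(out):
    if c.tag == 'certification':
        out.insert(i, fileinfo)
        break
indent(out)
print(ET.tostring(out).decode('utf8'))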
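The snippet above works on string literals and only prints the result. As a rough sketch of the file-based version the question asks about, the same steps could be wrapped as follows. The names `LANGUAGE_CODES`, `build_fileinfo` and `add_audio_streams` are made up for illustration, the lookup table is an assumption that would need extending for a real collection, and appending `<fileinfo>` at the end when no `<certification>` element exists is a guess at a sensible fallback.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lookup table (an assumption, not part of either file format).
# Slicing the first three letters breaks for names such as "Japanese".
LANGUAGE_CODES = {"German": "ger", "English": "eng", "Japanese": "jpn"}


def build_fileinfo(audio_tracks):
    """Build a <fileinfo> element from an iterable of <AudioTrack> elements."""
    fileinfo = ET.Element("fileinfo")
    streamdetails = ET.SubElement(fileinfo, "streamdetails")
    for track in audio_tracks:
        audio = ET.SubElement(streamdetails, "audio")
        # "DTS-HD Master" -> "dtshdmaster"
        codec = re.sub(r"[^A-Za-z]", "", track.get("Type", "")).lower()
        ET.SubElement(audio, "codec").text = codec
        # Use the table first, fall back to the first three letters.
        name = track.get("Language", "")
        ET.SubElement(audio, "language").text = LANGUAGE_CODES.get(name, name[:3].lower())
        # "7.1" -> 7 + 1 -> "8"
        parts = track.get("Channels", "0").split(".")
        ET.SubElement(audio, "channels").text = str(sum(int(p) for p in parts))
    return fileinfo


def add_audio_streams(mymovies_path, nfo_path):
    """Read the audio tracks from mymovies_path and write them into nfo_path."""
    source_root = ET.parse(mymovies_path).getroot()
    nfo_tree = ET.parse(nfo_path)
    movie = nfo_tree.getroot()
    # iter() finds <AudioTrack> at any depth below the root.
    fileinfo = build_fileinfo(source_root.iter("AudioTrack"))
    # Insert before <certification> if present, otherwise append at the end.
    children = list(movie)
    index = next(
        (i for i, c in enumerate(children) if c.tag == "certification"),
        len(children),
    )
    movie.insert(index, fileinfo)
    # Overwrites the .nfo file in place; keep a backup while testing.
    nfo_tree.write(nfo_path, encoding="UTF-8", xml_declaration=True)
```

Calling `add_audio_streams('mymovies.xml', 'moviename.nfo')` in each movie directory would then update the .nfo file. Note that `ElementTree.write` emits its own XML declaration, so the `standalone="yes"` attribute of the original is not preserved. For pretty-printing before writing, the `indent` helper above can be applied to `movie`, or `ET.indent(movie)` on Python 3.9 and later.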