How to parse values from the same tag using ElementTree in Python?
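I am using Python to parse an XML file, but I have run into a problem. I get the values as a dictionary, but if two or more values are the same, they are not repeated. I'm sure there is a way to solve this, but I am new to Python and to parsing XML...
Here is an example of the XML: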
<Root>
  <Child1>
  </Child1>
  <Child2>
    <Data DId = "1">
      <Group ID = "">
        <Sport Name="Cricket" Team="6" />
        <Sport Name="Football" Team="6" />
        <Sport Name="Hockey" Team="5" />
      </Group>
    </Data>
    <Data DId = "2">
      <Group ID = "">
        <Sport Name="Rugby" Team="6" />
        <Sport Name="Baseball" Team="10" />
        <Sport Name="Swimming" Team="6" />
      </Group>
    </Data>
  </Child2>
</Root>
I want to get the Sport tag values, grouped by Data.
The code I tried is:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
dict1 = {}
for i in root.iter('Sport'):
    dict1[i.attrib['Name']] = [j.text for j in i]
    dict1[i.attrib['Team']] = [k.text for k in i]
print(dict1)
But I can't get the Team value for each sport.
Try the simplified_scrapy library.
from simplified_scrapy import SimplifiedDoc, utils
xml = '''
<Root>
  <Child1>
  </Child1>
  <Child2>
    <Data DId = "1">
      <Group ID = "">
        <Sport Name="Cricket" Team="6" />
        <Sport Name="Football" Team="6" />
        <Sport Name="Hockey" Team="5" />
      </Group>
    </Data>
    <Data DId = "2">
      <Group ID = "">
        <Sport Name="Rugby" Team="6" />
        <Sport Name="Baseball" Team="10" />
        <Sport Name="Swimming" Team="6" />
      </Group>
    </Data>
  </Child2>
</Root>
'''
# xml = utils.getFileContent('test.xml')
dict1 = {}
doc = SimplifiedDoc(xml)
datas = doc.selects('Data')
for i in datas:
    dic = {}
    for j in i.selects('Sport'):
        dic[j['Name']] = j['Team']
    dict1[i['DId']] = dic
print(dict1)
Result:
{'1': {'Cricket': '6', 'Football': '6', 'Hockey': '5'}, '2': {'Rugby': '6', 'Baseball': '10', 'Swimming': '6'}}
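Since the question asks about ElementTree specifically, here is a minimal sketch of how the same nested dictionary could be built with the standard library alone. It is not taken from the answer above. The attempted code has two problems: the Sport elements have no child elements, so `[j.text for j in i]` is always an empty list, and using the Team value as a dictionary key means that repeated values such as "6" overwrite each other. Reading the attributes directly and grouping by each Data element avoids both. The variable names `xml_text` and `result` are chosen here for illustration, and the XML is inlined so the example runs on its own; with a file, `ET.parse('test.xml').getroot()` would take the place of `ET.fromstring`.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
xml_text = """<Root>
  <Child1>
  </Child1>
  <Child2>
    <Data DId = "1">
      <Group ID = "">
        <Sport Name="Cricket" Team="6" />
        <Sport Name="Football" Team="6" />
        <Sport Name="Hockey" Team="5" />
      </Group>
    </Data>
    <Data DId = "2">
      <Group ID = "">
        <Sport Name="Rugby" Team="6" />
        <Sport Name="Baseball" Team="10" />
        <Sport Name="Swimming" Team="6" />
      </Group>
    </Data>
  </Child2>
</Root>"""

root = ET.fromstring(xml_text)

result = {}
# One outer entry per <Data> element, keyed by its DId attribute.
for data in root.iter('Data'):
    # Map each sport's Name attribute to its Team attribute.
    # Name is the key, so repeated Team values are kept.
    result[data.get('DId')] = {
        sport.get('Name'): sport.get('Team')
        for sport in data.iter('Sport')
    }

print(result)
```

The attribute values stay as strings; wrapping `sport.get('Team')` in `int(...)` would convert them if numeric team counts are needed.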