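How to get xml elements which have children with a certain tag and attribute
I want to find xml elements which have certain child elements. The child elements need to have a given tag and an attribute set to a specific value.
To give a concrete example (based on the official documentation): I want to find all country elements which have a child element neighbor with attribute name="Austria":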
import xml.etree.ElementTree as ET
data = """<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
<partner name="Austria"/>
</country>
<country name="Panama">
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>
"""
root = ET.fromstring(data)
What I have tried without success:
countries1 = root.findall('.//country[neighbor@name="Austria"]')
countries2 = root.findall('.//country[neighbor][@name="Austria"]')
countries3 = root.findall('.//country[neighbor[@name="Austria"]]')
All of them give:
SyntaxError: invalid predicate
The following solutions are obviously wrong, as they find too many elements:
countries4 = root.findall('.//country/*[@name="Austria"]')
countries5 = root.findall('.//country/[neighbor]')
Here countries4 contains all elements having an attribute name="Austria", including the partner element, and countries5 contains all country elements which have any neighbor element as a child.
import xml.etree.ElementTree as ET
data = """<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
<partner name="Austria"/>
</country>
<country name="Panama">
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
<country name="Liechtenstein">
<neighbor name="Austria" direction="dummy"/>
<neighbor name="Switzerland" direction="W"/>
</country>
</data>
"""
root = ET.fromstring(data)
# Select the neighbor elements (not the countries) whose name is "Austria".
for x in root.findall(".//country/neighbor[@name='Austria']"):
    print(x.attrib)
Output:
{'name': 'Austria', 'direction': 'E'}
{'name': 'Austria', 'direction': 'dummy'}
From the ElementTree documentation on supported XPath syntax:
// : Selects all subelements, on all levels beneath the current element. For example, .//egg selects all egg elements in the entire tree.
[@attrib='value'] : Selects all elements for which the given attribute has the given value. The value cannot contain quotes.
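Building on the two rules quoted above, one way to get the country elements themselves, rather than the matching neighbor elements, is to append a `..` parent step, which ElementTree's limited XPath also supports. This is a minimal sketch using the data from the question; the variable names `countries` and `names` are illustrative only.

```python
import xml.etree.ElementTree as ET

data = """<data>
<country name="Liechtenstein">
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
<partner name="Austria"/>
</country>
<country name="Panama">
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>"""

root = ET.fromstring(data)

# Match the neighbor elements first, then step up to their parents with "..".
countries = root.findall(".//country/neighbor[@name='Austria']/..")
names = [c.get("name") for c in countries]
print(names)
```

Singapore is not selected, because its child with name="Austria" is a partner element and not a neighbor.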
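# Iterate over the children of the root element (the countries).
for x in root.find('.'):
    # Note: only the first child of each country is checked here.
    if x[0].attrib['name'] == 'Austria':
        print(x.attrib['name'])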
Output:
Liechtenstein
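The loop above inspects only the first child of each country, so it depends on element order and does not check the child's tag. A possible variant, sketched here under the assumption that any direct neighbor child should count, asks each country whether `find()` returns a match. The sample data is shortened and reordered on purpose (Austria is listed second), and the name `matches` is illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed sample: same shape as the question's data, with Austria moved
# to the second position to show that child order does not matter.
data = """<data>
<country name="Liechtenstein">
<neighbor name="Switzerland" direction="W"/>
<neighbor name="Austria" direction="E"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
<partner name="Austria"/>
</country>
</data>"""

root = ET.fromstring(data)

matches = []
for country in root.iter("country"):
    # find() returns None when no direct neighbor child has name="Austria".
    if country.find("neighbor[@name='Austria']") is not None:
        matches.append(country.get("name"))
print(matches)
```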
I want to find all country elements which have a child element neighbor with attribute name="Austria"
See below:
import xml.etree.ElementTree as ET
data = """<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
<partner name="Austria"/>
</country>
<country name="Panama">
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>
"""
root = ET.fromstring(data)
countries_with_austria_as_neighbor = [c.attrib['name'] for c in root.findall('.//country') if
'Austria' in [n.attrib['name'] for n in c.findall('neighbor')]]
print(countries_with_austria_as_neighbor)
Output:
['Liechtenstein']