Python 3: parse an XML file with ElementTree
Help! I have the following XML file that I am trying to read and extract data from. Below is an excerpt from the file:
<Variable name="Inboard_ED_mm" state="Output" type="double[]">17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154<Properties><Property name="index">25</Property><Property name="description"></Property><Property name="upperBound">0</Property><Property name="hasUpperBound">false</Property><Property name="lowerBound">0</Property><Property name="hasLowerBound">false</Property><Property name="units"></Property><Property name="enumeratedValues"></Property><Property name="enumeratedAliases"></Property><Property name="validity">true</Property><Property name="autoSize">true</Property><Property name="userSlices"></Property></Properties></Variable>
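I am trying to extract the following: 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154
I have worked through the examples in the xml.etree.ElementTree documentation ("The ElementTree XML API") and I can get them to run, but when I adapt the code to the XML above, it returns nothing!
Here is my code: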
import xml.etree.ElementTree as ET
work_dir = r"C:\Temp\APROCONE\Python"
with open(model.xml, 'rt') as f:
    tree = ET.parse(f)
    root = tree.getroot()
for Variable in root.findall('Variable'):
    type = Variable.find('type').text
    name = Variable.get('name')
    print(name, type)
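Any ideas? Thanks in advance for any help.
Edit:
Thanks to everyone who commented. Following your advice, I experimented and searched, and ended up with the following code: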
with open(os.path.join(work_dir, "output.txt"), "w") as f:
    for child1Tag in root.getchildren():
        for child2Tag in child1Tag.getchildren():
            for child3Tag in child2Tag.getchildren():
                for child4Tag in child3Tag.getchildren():
                    for child5Tag in child4Tag.getchildren():
                        name = child5Tag.get('name')
                        if name == "Inboard_ED_mm":
                            print(child5Tag.attrib, file=f)
                            print(name, file=f)
                            print(child5Tag.text, file=f)
which returns the following:
{'name': 'Inboard_ED_mm', 'state': 'Output', 'type': 'double[]'}
Inboard_ED_mm
17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154
I know this is not the best code in the world, so any ideas on how to simplify it would be very welcome.
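Your root node is already the Variable tag, so findall with the Variable tag finds nothing: it only searches child nodes. You should simply output the root node's text attribute: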
print(root.text)
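To make this concrete, here is a minimal sketch that assumes the excerpt really is the whole document, so that Variable is the root. The shortened XML string is a made-up stand-in for the real file. It also shows that "type" is an attribute, so it is read with get() rather than find():

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the excerpt, with Variable as the root.
src = (
    '<Variable name="Inboard_ED_mm" state="Output" type="double[]">'
    '17.154, 17.154'
    '<Properties><Property name="index">25</Property></Properties>'
    '</Variable>'
)
root = ET.fromstring(src)

print(root.tag)                  # Variable
print(root.findall('Variable'))  # [] because findall only looks at children
print(root.get('type'))          # double[] (an attribute, so get() not find())
print(root.text)                 # 17.154, 17.154
```

If the real file has Variable nested deeper, root.text will not be the values and a path-based search is needed instead.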
You say the above is an "extract" of the XML file. The structure of the XML matters. Is the above located directly inside the root node?
for Variable in root.findall('Variable'):
    print(Variable.get('name'), Variable.text)
Or does it sit deeper in the XML tree, at some known level?
for Variable in root.findall('Path/To/Variable'):
    print(Variable.get('name'), Variable.text)
Or does it sit at some unspecified deeper level of the XML tree?
for Variable in root.findall('.//Variable'):
    print(Variable.get('name'), Variable.text)
Demonstrating the last two:
>>> import xml.etree.ElementTree as ET
>>> src = """
... <root>
...  <SubNode>
...   <Variable name='x'>17.154, ..., 17.154<Properties>...</Properties></Variable>
...   <Variable name='y'>14.174, ..., 15.471<Properties>...</Properties></Variable>
...  </SubNode>
... </root>"""
>>> root = ET.fromstring(src)
>>> for Variable in root.findall('SubNode/Variable'):
...     print(Variable.get('name'), Variable.text)
...
x 17.154, ..., 17.154
y 14.174, ..., 15.471
>>>
>>> for Variable in root.findall('.//Variable'):
...     print(Variable.get('name'), Variable.text)
...
x 17.154, ..., 17.154
y 14.174, ..., 15.471
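One related option, not mentioned above, is Element.iter(). The sketch below uses invented sample data and suggests it may be handy when you do not know whether Variable is the root or a descendant, because iter() also yields the starting element itself while './/Variable' only matches descendants:

```python
import xml.etree.ElementTree as ET

# Invented sample data: Variable nested below the root.
nested = ET.fromstring("""
<root>
  <SubNode>
    <Variable name='x'>1.0, 2.0<Properties/></Variable>
  </SubNode>
</root>""")

# Invented sample data: Variable is itself the root.
single = ET.fromstring("<Variable name='solo'>3.0</Variable>")

# iter() walks the start element and then all of its descendants.
nested_names = [v.get('name') for v in nested.iter('Variable')]
single_iter = [v.get('name') for v in single.iter('Variable')]

# findall('.//Variable') only searches below the start element.
single_findall = [v.get('name') for v in single.findall('.//Variable')]

print(nested_names)    # ['x']
print(single_iter)     # ['solo']
print(single_findall)  # []
```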
Update
Based on your new, clearer, updated question, you are looking for:
for child in root.findall("*/*/*/*/Variable[@name='Inboard_ED_mm']"):
    print(child.attrib, file=f)
    print(child.get('name'), file=f)
    print(child.text, file=f)
or
for child in root.findall(".//Variable[@name='Inboard_ED_mm']"):
    print(child.attrib, file=f)
    print(child.get('name'), file=f)
    print(child.text, file=f)
With the exact tag names of levels 1 to 4, we could give you a more precise XPath instead of relying on */*/*/*/.
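As a self-contained sketch of the approach above: the wrapper tag names (Model, Assembly, Component, Group, Variables) are invented, since the real names of levels 1 to 4 are not known, and the value list is shortened. It also turns the comma-separated text into floats, which is an assumption about what you want to do with the data. Note that it avoids getchildren(), which was removed in Python 3.9.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the four wrapper levels below the root are made up.
src = """
<Model>
  <Assembly>
    <Component>
      <Group>
        <Variables>
          <Variable name="Inboard_ED_mm" state="Output" type="double[]">17.154, 17.154, 17.154<Properties><Property name="index">25</Property></Properties></Variable>
          <Variable name="Other" state="Output" type="double[]">1.0, 2.0<Properties/></Variable>
        </Variables>
      </Group>
    </Component>
  </Assembly>
</Model>"""
root = ET.fromstring(src)

# Both forms of the path select the same single element here.
by_depth = root.findall("*/*/*/*/Variable[@name='Inboard_ED_mm']")
anywhere = root.findall(".//Variable[@name='Inboard_ED_mm']")

# find() returns the first match, or None if nothing matches.
var = root.find(".//Variable[@name='Inboard_ED_mm']")
values = []
if var is not None and var.text:
    # .text holds only the text before the first child (<Properties>).
    values = [float(v) for v in var.text.split(',') if v.strip()]

print(values)  # [17.154, 17.154, 17.154]
```

For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring(src).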