Parsing XML: How can I get all the information from lines with the same name but different text in an XML file using Python?

I am trying to parse an ICD10 XML file, but I am having trouble extracting the information.

<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
  <name>A00.0</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
  <desc>Cholera, unspecified</desc>
</diag>
</diag>

I am using this:

from xml.etree import ElementTree as ET
root = ET.parse('cut.xml')
diag = root.find(".//*[name='A00.0']")
inclusionTerm = diag.find('inclusionTerm')
if inclusionTerm is not None:
    print('Inclusion Term: '+diag.find('inclusionTerm').find('note').text)

This code prints only the first note inside the 'inclusionTerm' of the A00.0 entry. How can I write the code so that it gets all the 'note' elements in 'inclusionTerm'?

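find() returns only the first matching element, which is why you see just one note. You can instead pass an XPath expression to findall() to select all the note elements: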

from xml.etree import ElementTree as ET

xml = '''<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
  <name>A00.0</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
  <desc>Cholera, unspecified</desc>
</diag>
</diag>'''

root = ET.fromstring(xml)

notes = root.findall('.//diag[name="A00.0"]/inclusionTerm/note')

for note in notes:
  print(note.text)
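
If you need the notes for every code rather than just A00.0, one possible approach is to loop over all diag elements and collect the notes of each into a dictionary. This is only a sketch: the name notes_by_code is made up for illustration, and the XML is a shortened copy of the sample above. Note that root.iter('diag') also yields the outermost diag element itself, and entries without an inclusionTerm end up with an empty list.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the sample XML from the question
xml = '''<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
  <name>A00.0</name>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
</diag>
</diag>'''

root = ET.fromstring(xml)

# Hypothetical helper mapping: code -> list of note texts
notes_by_code = {}
for diag in root.iter('diag'):  # includes the root <diag> as well
    code = diag.findtext('name')
    # Relative path: only notes in this diag's own inclusionTerm
    notes_by_code[code] = [n.text for n in diag.findall('inclusionTerm/note')]

for code, notes in notes_by_code.items():
    print(code, notes)
```

The relative path 'inclusionTerm/note' matches only direct children of each diag, so the notes of nested entries are not attributed to their parent.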