Parsing XML: How can I get all the information from lines with the same name but different text in an XML file using Python?
I am trying to parse an ICD10 XML file, but I am having some trouble extracting the information.
<diag>
  <name>A00</name>
  <desc>Cholera</desc>
  <diag>
    <name>A00.0</name>
    <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
    <inclusionTerm>
      <note>Classical cholera</note>
      <note>Classical cholera again</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.1</name>
    <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
    <inclusionTerm>
      <note>Cholera eltor</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.9</name>
    <desc>Cholera, unspecified</desc>
  </diag>
</diag>
Using this:
from xml.etree import ElementTree as ET
root = ET.parse('cut.xml')
diag = root.find(".//*[name='A00.0']")
inclusionTerm = diag.find('inclusionTerm')
if inclusionTerm is not None:
    print('Inclusion Term: '+diag.find('inclusionTerm').find('note').text)
This code prints only the first note inside the 'inclusionTerm' of the A00.0 entry. How can I write code that gets all the 'notes' in 'inclusionTerm'?
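find() returns only the first matching element, whereas findall() returns a list of every match. You can write an XPath expression that selects all of the note elements: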
from xml.etree import ElementTree as ET
xml = '''<diag>
  <name>A00</name>
  <desc>Cholera</desc>
  <diag>
    <name>A00.0</name>
    <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
    <inclusionTerm>
      <note>Classical cholera</note>
      <note>Classical cholera again</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.1</name>
    <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
    <inclusionTerm>
      <note>Cholera eltor</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.9</name>
    <desc>Cholera, unspecified</desc>
  </diag>
</diag>'''
root = ET.fromstring(xml)
notes = root.findall('.//diag[name="A00.0"]/inclusionTerm/note')
for note in notes:
    print(note.text)
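If you need the notes for every code rather than only A00.0, one possible approach is to walk all diag elements and collect their notes into a dictionary. This is only a sketch that goes beyond the answer above: the helper name notes_by_code and the shortened SAMPLE document are made up for illustration, and it assumes every diag has a name child as in the excerpt.

```python
from xml.etree import ElementTree as ET

# Shortened sample in the same shape as the ICD10 excerpt (illustrative only).
SAMPLE = """<diag>
  <name>A00</name>
  <diag>
    <name>A00.0</name>
    <inclusionTerm>
      <note>Classical cholera</note>
      <note>Classical cholera again</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.1</name>
    <inclusionTerm>
      <note>Cholera eltor</note>
    </inclusionTerm>
  </diag>
  <diag>
    <name>A00.9</name>
  </diag>
</diag>"""


def notes_by_code(root):
    """Map each diag code to the list of its own inclusionTerm note texts."""
    result = {}
    # iter('diag') yields the root itself (if it is a diag) and every nested diag.
    for diag in root.iter('diag'):
        code = diag.findtext('name')
        # The path is relative to this diag, so notes of nested diags are not mixed in.
        # findall() returns an empty list when there is no inclusionTerm.
        result[code] = [note.text for note in diag.findall('inclusionTerm/note')]
    return result


root = ET.fromstring(SAMPLE)
codes = notes_by_code(root)
print(codes['A00.0'])
```

Because findall() returns an empty list when nothing matches, codes without an inclusionTerm (such as A00.9) simply map to an empty list, so no explicit None check is needed.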