How to load a specific paragraph of an XML file in Python?

Question

I have an XML file with the following structure:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<book>
    <toc>
        <tocdiv pagenum="564">
            <title>9thmemo</title>
            <tocdiv pagenum="588">
                <title>b</title>
            </tocdiv>
        </tocdiv>
    </toc>
    <chapter>
        <title>9thmemo</title>
        <para>...</para>
        <para>...</para>
    </chapter>
    <chapter>...</chapter>
    <chapter>...</chapter>
</book>

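There are several chapters inside <book>...</book>, and each chapter has a title. I only want to read the full content of the chapter titled "9thmemo" (not the others). I tried to read it with the following code: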

from xml.dom import minidom

filename = "result.xml"
file = minidom.parse(filename)
chapters = file.getElementsByTagName('chapter')
for i in range(10):
    print(chapters[i])

I only get the address of each chapter object. If I try to access a sub-element, for example chapters[i].title, it reports that the attribute cannot be found.

Answer 1

I only want to read all content of this chapter, "9thmemo" (not others)

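The problem with your code is that it never tries to locate the specific 'chapter'. The code in this answer uses an XPath expression to locate it.

Try the following: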

import xml.etree.ElementTree as ET


xml = '''<?xml version="1.0" encoding="UTF-8"?>
<book>
   <toc>
      <tocdiv pagenum="564">
         <title>9thmemo</title>
         <tocdiv pagenum="588">
            <title>b</title>
         </tocdiv>
      </tocdiv>
   </toc>
   <chapter>
      <title>9thmemo</title>
      <para>A</para>
      <para>B</para>
   </chapter>
   <chapter>...</chapter>
   <chapter>...</chapter>
</book>'''

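root = ET.fromstring(xml)
# Select the <chapter> whose <title> child has the text "9thmemo"
chapter = root.find('.//chapter[title="9thmemo"]')
# Join the text of every <para> directly under that chapter
para_data = ','.join(p.text for p in chapter.findall('para'))
print(para_data)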

Output

A,B
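
If you would rather stay with minidom, as in the question, the same lookup can be sketched as below. DOM elements do not expose children as attributes, which is why chapters[i].title fails; you have to call getElementsByTagName on the chapter and read its text nodes. This is only a minimal sketch: the helper names node_text and chapter_paragraphs are made up for illustration, and the sample string (including the second chapter titled "other") is invented to mirror the structure in the question.

```python
from xml.dom import minidom

# Sample document mirroring the structure in the question.
# The paragraph texts and the second chapter are made up for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<book>
   <toc>
      <tocdiv pagenum="564"><title>9thmemo</title></tocdiv>
   </toc>
   <chapter>
      <title>9thmemo</title>
      <para>A</para>
      <para>B</para>
   </chapter>
   <chapter>
      <title>other</title>
      <para>C</para>
   </chapter>
</book>"""


def node_text(node):
    """Concatenate the text nodes found anywhere below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(node_text(child))
    return "".join(parts)


def chapter_paragraphs(doc, wanted_title):
    """Return the <para> texts of the first chapter whose title matches."""
    for chapter in doc.getElementsByTagName("chapter"):
        titles = chapter.getElementsByTagName("title")
        if titles and node_text(titles[0]).strip() == wanted_title:
            return [node_text(p) for p in chapter.getElementsByTagName("para")]
    return []  # no chapter with that title


# For a file on disk, minidom.parse("result.xml") would be used instead.
doc = minidom.parseString(SAMPLE)
paragraphs = chapter_paragraphs(doc, "9thmemo")
print(",".join(paragraphs))
```

With the sample string above this prints A,B, the same result as the ElementTree version. Returning an empty list when no title matches avoids the error you would otherwise get from indexing into a missing chapter.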

python

xml

minidom

xml-parsing