Python minidom - Parse XML file and write to CSV

I am trying to parse an XML file and then write selected contents of the retrieved objects to a CSV file.

This is my basic XML file:

<?xml version="1.0"?>
<library owner="John Q. Reader">
    <book>
        <title>Sandman Volume 1: Preludes and Nocturnes</title>
        <author>Neil Gaiman</author>
    </book>
    <book>
        <title>Good Omens</title>
        <author>Neil Gamain</author>
        <author>Terry Pratchett</author>
    </book>
    <book>
        <title>"Repent, Harlequin!" Said the Tick-Tock Man</title>
        <author>Harlan Ellison</author>
    </book>
</library>

I wrote a basic script using Python 2.7 and minidom. Here it is:


# Test Parser

from xml.dom.minidom import parse
import xml.dom.minidom

def printLibrary(myLibrary):
    books = myLibrary.getElementsByTagName("book")
    for book in books:
        print "*****Book*****"
        print "Title: %s" % book.getElementsByTagName("title")[0].childNodes[0].data
        for author in book.getElementsByTagName("author"):
            print "Author: %s" % author.childNodes[0].data
            # The CSV output should be written here; this is the part I can't get working

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]

# Get book elements in library
books = myLibrary.getElementsByTagName("book")

# Print each book's title and author(s)
printLibrary(myLibrary)

So far, this script, run from the command line on Win7, displays each book's title and author(s).

I want to output these results to a CSV file, so that it looks like this:

Title, Author
Title, Author
Title, Author
etc.

However, I can't get it to work. I'm new to Python; I work in IT with SQL, and I'm comfortable with basic programming.

Any help would be greatly appreciated!!

Use the csv module.

# Test Parser

from xml.dom.minidom import parse
import csv 


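def writeToCSV(myLibrary):
    # Python 2: open in binary mode ('wb') so the csv module controls line
    # endings; text mode on Windows inserts a blank line after every row.
    # The with block closes the file once all rows are written.
    with open('output.csv', 'wb') as csvfile:
        fieldnames = ['title', 'author']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        books = myLibrary.getElementsByTagName("book")
        for book in books:
            titleValue = book.getElementsByTagName("title")[0].childNodes[0].data
            # One row per author, so a co-authored book appears on several rows
            for author in book.getElementsByTagName("author"):
                authorValue = author.childNodes[0].data
                writer.writerow({'title': titleValue, 'author': authorValue})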

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]

# Write each (title, author) pair to output.csv
writeToCSV(myLibrary)
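
If you are on Python 3 rather than 2.7, the same approach might look like the sketch below. It is only an illustration, not a drop-in replacement: the XML is embedded as a string (a well-formed copy of the sample from the question) so the snippet runs on its own, the rows go to an in-memory buffer instead of output.csv, and the names node_text, library_rows and write_csv are made up for this example.

```python
import csv
import io
from xml.dom.minidom import parseString

# Well-formed copy of the sample data from the question.
LIBRARY_XML = """<?xml version="1.0"?>
<library owner="John Q. Reader">
    <book>
        <title>Sandman Volume 1: Preludes and Nocturnes</title>
        <author>Neil Gaiman</author>
    </book>
    <book>
        <title>Good Omens</title>
        <author>Neil Gamain</author>
        <author>Terry Pratchett</author>
    </book>
    <book>
        <title>"Repent, Harlequin!" Said the Tick-Tock Man</title>
        <author>Harlan Ellison</author>
    </book>
</library>
"""


def node_text(element):
    """Hypothetical helper: join the text nodes directly under element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def library_rows(library):
    """Yield one (title, author) tuple per author of each book."""
    for book in library.getElementsByTagName("book"):
        title = node_text(book.getElementsByTagName("title")[0])
        for author in book.getElementsByTagName("author"):
            yield title, node_text(author)


def write_csv(library, stream):
    """Write a header row followed by every (title, author) pair."""
    writer = csv.writer(stream)
    writer.writerow(["title", "author"])
    writer.writerows(library_rows(library))


doc = parseString(LIBRARY_XML)
library = doc.getElementsByTagName("library")[0]

# For a real file on Python 3, use: open("output.csv", "w", newline="")
buffer = io.StringIO(newline="")
write_csv(library, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The main differences from the Python 2 script are that print is a function and that the file should be opened in text mode with newline="" instead of binary mode, so that the csv module still controls the line endings.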

Output file:

title,author
Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman
Good Omens,Neil Gamain
Good Omens,Terry Pratchett
"""Repent, Harlequin!"" Said the Tick-Tock Man",Harlan Ellison
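
The doubled quotes in the last row are not a bug. The title contains both a comma and double quotes, so the csv module wraps the field in quotes and escapes each embedded quote by doubling it. A small round trip (Python 3, using an in-memory buffer purely for illustration) suggests that reading the row back restores the original title:

```python
import csv
import io

title = '"Repent, Harlequin!" Said the Tick-Tock Man'

# Write one row; the title contains both a comma and double quotes.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow([title, "Harlan Ellison"])
encoded = buffer.getvalue()
print(encoded)

# Read the row back: csv.reader undoes the quoting and the doubled quotes.
buffer.seek(0)
decoded = next(csv.reader(buffer))
print(decoded)
```

Spreadsheet programs and most CSV readers apply the same rule, so the title should show up with its original single pair of quotes when the file is opened.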