Python minidom - Parse XML file and write to CSV
I am trying to parse an XML file and then write selected parts of the retrieved objects to a CSV file.
Here is my basic XML file:
<?xml version="1.0"?>
<library owner="John Q. Reader">
  <book>
    <title>Sandman Volume 1: Preludes and Nocturnes</title>
    <author>Neil Gaiman</author>
  </book>
  <book>
    <title>Good Omens</title>
    <author>Neil Gamain</author>
    <author>Terry Pratchett</author>
  </book>
  <book>
    <title>"Repent, Harlequin!" Said the Tick-Tock Man</title>
    <author>Harlan Ellison</author>
  </book>
</library>
I wrote a basic script using Python 2.7 and minidom. Here it is:
# Test Parser
from xml.dom.minidom import parse
import xml.dom.minidom

def printLibrary(myLibrary):
    books = myLibrary.getElementsByTagName("book")
    for book in books:
        print "*****Book*****"
        print "Title: %s" % book.getElementsByTagName("title")[0].childNodes[0].data
        a = for author in book.getElementsByTagName("author"):
            print "Author: %s" % author.childNodes[0].data
        a.csv.writer()

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]
# Get book elements in library
books = myLibrary.getElementsByTagName("book")
# Print each book's title
printLibrary(myLibrary)
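So far, this script, run from the command line on Win7, displays the title and author of each book.
I want to output these results to a CSV file so that it looks like this:
Title, Author
Book title, Author
Book title, Author
Book title, Author
Book title, Author
etc.
However, I can't get it to work. I'm new to Python; I work in IT with SQL, and I'm good at basic programming.
Any help would be greatly appreciated!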
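Use the csv module: open the output file, create a csv.DictWriter, and write one row per (title, author) pair.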
# Test Parser
from xml.dom.minidom import parse
import csv

def writeToCSV(myLibrary):
    # Python 2.7: open in binary mode so the csv module controls line endings
    # (on Python 3, use open('output.csv', 'w', newline='') instead).
    with open('output.csv', 'wb') as csvfile:
        fieldnames = ['title', 'author']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        books = myLibrary.getElementsByTagName("book")
        for book in books:
            titleValue = book.getElementsByTagName("title")[0].childNodes[0].data
            # A book may have several authors: write one row per author
            for author in book.getElementsByTagName("author"):
                authorValue = author.childNodes[0].data
                writer.writerow({'title': titleValue, 'author': authorValue})

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]
# Write one title,author row per author to output.csv
writeToCSV(myLibrary)
Output file:
title,author
Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman
Good Omens,Neil Gamain
Good Omens,Terry Pratchett
"""Repent, Harlequin!"" Said the Tick-Tock Man",Harlan Ellison