如何将多个 XML 文件解析为多个 CSV 文件?

How to parse several XML file into several CSV file?

我使用此代码解析了 XML 文件,该代码适用于单个 xml 输入到单个 csv 输出。我尝试使用 glob 处理多个输入以及多个 csv 输出,但我知道这是不正确的。

import glob
import xml.etree.ElementTree as et
import csv

for file in glob.glob('./*.xml'):
    with open(file) as f:
        tree = et.parse(f)
        nodes = tree.getroot()

        with open(f'{f[:-4]}edited.csv', 'w') as ff:
            cols = ['dateTime','x','y','z','motion','isMoving','stepCount','groupAreaId','commit']
            nodewriter = csv.writer(ff)
            nodewriter.writerow(cols)
            for node in nodes:
                values = [ node.attrib.get(kk, '') for kk in cols]
                nodewriter.writerow(values)

我应该如何更改以获得多个 csv 输出?

您可以创建文件名列表,然后在其中写入 xml 文件。如果输出文件已经在目录中,则使用 glob 可以获得名称。如果文件不存在,下面的代码将使用给定的文件名

创建
csvFileNames = ['outputfile1.csv', 'outputfile2.csv']
for file in csvFileNames:
    with open(file, 'w') as f:
        wtr = csv.writer(f)
        wtr.writerows( [[1, 2], [2, 3], [4, 5]]) # write what you want

要从目录中获取 XML 文件名,您可以尝试以下代码:

from os import listdir
filenames = listdir('.') # here dot is used because script and csv files are in the same directory, if XML files are in other directory then set the path inside listdir
xmlFileNames = [ filename for filename in filenames if filename.endswith( ".xml" ) ]

# get xml file names like this, xmlFileNames = ["abc.xml", "ef.xml"]
resultCsvFileNameList = [fname.replace(".xml", ".csv") for fname in xmlFileNames ]

您的代码当前正在使用文件句柄来形成输出文件名。使用 file 代替 f,如下所示:

import glob
import xml.etree.ElementTree as et
import csv

for file in glob.glob('./*.xml'):
    with open(file) as f:
        tree = et.parse(f)
        nodes = tree.getroot()

        with open(f'{file[:-4]}edited.csv', 'w') as ff:
            cols = ['dateTime','x','y','z','motion','isMoving','stepCount','groupAreaId','commit']
            nodewriter = csv.writer(ff)
            nodewriter.writerow(cols)
            for node in nodes:
                values = [ node.attrib.get(kk, '') for kk in cols]
                nodewriter.writerow(values)