Attempt to populate list with multiple entries in a for loop only returns a single entry

Question

I have a csv file containing URLs that I want to extract data from, but my script currently only manages to append the last entry. This is the script:

import os
import glob
import time
from urllib.request import urlopen
import pandas as pd
import xml.etree.ElementTree as ET
count=0
files=glob.glob('./extract/isbnlist/Reihe*_isbn-dnb2.csv',recursive=True) #searches all files in folder
print(files)

for file in files:
    if count==0:
        csvfile = pd.read_csv(file, sep='\t', encoding='utf-8')
        for row in csvfile['URL']:
            print('row: ' + row)
            with urlopen(str(row)) as response:
                doc = ET.parse(response)  
                root = doc.getroot()
                namespaces = {  # Manually extracted from the XML file, but there could be code written to automatically do that.
            "zs": "http://www.loc.gov/zing/srw/",
            "": "http://www.loc.gov/MARC21/slim",
                }
            datafield_nodes_path = "./zs:records/zs:record/zs:recordData/record/datafield"  # XPath
            datafield_attribute_filters = [ #which fields to extract
            {
            "tag": "100", #author
            "ind1": "1",
            "ind2": " ",
            }]
            #datafield_attribute_filters = []  # Uncomment this line to clear filters (and process each datafield node)
            aut = []
            for datafield_node in root.iterfind(datafield_nodes_path, namespaces=namespaces):
                if datafield_attribute_filters:
                    skip_node = True
                    for attr_dict in datafield_attribute_filters:
                        for k, v in attr_dict.items():
                            if datafield_node.get(k) != v:
                                break
                        else:
                            skip_node = False
                            break
                    if skip_node:
                        continue
                for subfield_node in datafield_node.iterfind("./subfield[@code='a']", namespaces=namespaces):
                    aut.append(subfield_node.text) #this gets the author name and title
                    
            print(aut)
        count+=1

This is the csv file:

    URL
0   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783960382850&recordSchema=MARC21-xml
1   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783963622106&recordSchema=MARC21-xml
2   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D-&recordSchema=MARC21-xml
3   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783806241280&recordSchema=MARC21-xml
4   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783890296005&recordSchema=MARC21-xml
5   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783110699111&recordSchema=MARC21-xml
6   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783110698930&recordSchema=MARC21-xml
7   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783110699104&recordSchema=MARC21-xml
8   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783963621093&recordSchema=MARC21-xml
9   http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9783451716034&recordSchema=MARC21-xml
10  http://services.dnb.de/sru/dnb?version=1.1&operation=searchRetrieve&query=ISBN%3D9788791953514&recordSchema=MARC21-xml

When I run the script, the output is:

['Schmidt, Horst']

But I need the other results as well. How can I do that? Any help is appreciated.

Edit: link to the complete csv file on Pastebin; the file name is: Reihe-21A51.csv_extract.csv_isbn-dnb2.csv

Answer 1

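As @Tranbi pointed out, I had to move aut = [] out of the loop over the rows. Because the list was re-created for every URL, only the result for the last URL survived. The code now reads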

for file in files:
    if count==0: #to only go through the first file, instead of all files in the folder
        csvfile = pd.read_csv(file, sep='\t', encoding='utf-8')
        aut = []

instead of

aut = []
            for datafield_node in root.iterfind(datafield_nodes_path, namespaces=namespaces):
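
To make the effect of that move easier to see, here is a minimal sketch that runs without network access. It is not the original script: the XML strings are hypothetical, heavily trimmed stand-ins for the SRU responses, the author names are made up, the helper extract_authors and the "marc" prefix are names chosen for illustration, and the filter checks only tag="100" rather than also ind1 and ind2.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for one SRU response (not real DNB data).
TEMPLATE = """<zs:searchRetrieveResponse xmlns:zs="http://www.loc.gov/zing/srw/">
  <zs:records><zs:record><zs:recordData>
    <record xmlns="http://www.loc.gov/MARC21/slim">
      <datafield tag="100" ind1="1" ind2=" "><subfield code="a">{author}</subfield></datafield>
      <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Some title</subfield></datafield>
    </record>
  </zs:recordData></zs:record></zs:records>
</zs:searchRetrieveResponse>"""

# "marc" is an illustrative prefix; the question's script maps "" to the same URI.
NAMESPACES = {
    "zs": "http://www.loc.gov/zing/srw/",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Simplified filter: only tag 100 (author), subfield code "a".
AUTHOR_PATH = (
    "./zs:records/zs:record/zs:recordData/marc:record"
    "/marc:datafield[@tag='100']/marc:subfield[@code='a']"
)


def extract_authors(root):
    """Return the text of every author subfield found under root."""
    return [node.text for node in root.iterfind(AUTHOR_PATH, namespaces=NAMESPACES)]


# Three fake responses, one per "URL".
responses = [
    TEMPLATE.format(author=name)
    for name in ("Author, One", "Author, Two", "Author, Three")
]

aut = []  # created once, before the loop, so earlier results are kept
for xml_text in responses:
    root = ET.fromstring(xml_text)
    aut.extend(extract_authors(root))

print(aut)
```

If aut = [] is moved back inside the for loop, the list is emptied on every pass and ends up holding only the name from the last response, which matches the behaviour described in the question.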


csv

python-3.x

pandas