Python list size changes once it exits the for loop

MFCC_coeffs = 12
train_data = []
current_block = []
MAX_ROWS = 29
row_counter = 0

for line in f:
    element = line.split(' ')
    if(len(element) == MFCC_coeffs+1):
        row_counter = row_counter + 1
        element = element[:-1]
        element = [float(i) for i in element]
        current_block.append(element)
        # print("HERE")
        # print(f"element = {element}, length = {len(element)}")

    elif(len(element) == 1):
        if row_counter<MAX_ROWS:
            padding = MAX_ROWS-row_counter
            while(padding):
                pad_row = [0]*MFCC_coeffs
                current_block.append(pad_row)
                padding = padding-1
            
        row_counter = 0    
        # print(f"element = {element}, length = {len(element)}")
        # print(f"current_block = {current_block}, shape = {np.shape(current_block)}")
        train_data.append(current_block)
        # print(f"train_data shape = {np.shape(train_data)}") ## PRINTS CORRECT SIZE AT THE END OF THE FILE. E.G. (370,29,12)
        current_block.clear()
        continue
    else:
        assert("Wrong Data")
    
print(f"train_data = {train_data}, shape = {np.shape(train_data)}")    ## SIZE TO (370,0)

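In the code block above, I am reading a text file and storing its contents in a train_data variable. As I go through the file line by line, blocks are appended to train_data until it reaches shape (370, 29, 12), which is correct. However, as soon as I leave the file-reading loop, the final shape of train_data resets to (370, 0). I have marked the correct and the incorrect output with comments in capital letters.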

The problem is that you append a reference to the same current_block list to train_data every time, not a new list. Consider this example:

current_block = []
train_data = []

current_block = [1]
train_data.append(current_block)  # train_data = [[1]]
current_block.clear()  # train_data = [[]]

current_block.append(2) # train_data = [[2]]
train_data.append(current_block)  # train_data = [[2], [2]]
current_block.clear()  # train_data = [[], []]

As you can see, current_block.clear() empties the list, and because every entry in train_data refers to that same list, all of them become empty too.
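One way to check this yourself is with the `is` operator, which compares object identity instead of contents. This is only a small sketch reusing the variable names from the example. The `all_same` name is introduced here for illustration.

```python
current_block = []
train_data = []

for value in (1, 2):
    current_block.append(value)
    train_data.append(current_block)  # stores a reference, not a copy
    current_block.clear()             # empties the one shared list

# Every entry of train_data is the very same object as current_block.
all_same = all(block is current_block for block in train_data)
print(all_same, train_data)  # True [[], []]
```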

The solution is to append a copy to train_data:

train_data.append(current_block[:])

This way, the next current_block.clear() only empties the working list and does not touch any data already stored in train_data.
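For reference, here is a rough sketch of how the whole reading loop might look with the fix applied. It uses an equivalent alternative to copying: current_block is rebound to a fresh list after each block, so the stored block is never cleared. The function name `read_blocks` and the `sample` input are hypothetical. The sketch also assumes each data line ends with a space before the newline, as the `element[:-1]` in the question suggests, and it raises an exception where the question used `assert("Wrong Data")`, because asserting a non-empty string never fails.

```python
MFCC_COEFFS = 12
MAX_ROWS = 29

def read_blocks(lines):
    """Hypothetical helper: group rows into zero-padded blocks."""
    train_data = []
    current_block = []
    for line in lines:
        element = line.split(' ')
        if len(element) == MFCC_COEFFS + 1:
            # Drop the trailing field (the newline) and convert the rest.
            current_block.append([float(x) for x in element[:-1]])
        elif len(element) == 1:
            # Block separator: pad up to MAX_ROWS rows, then store the block.
            while len(current_block) < MAX_ROWS:
                current_block.append([0.0] * MFCC_COEFFS)
            train_data.append(current_block)
            current_block = []  # new list; the stored block stays intact
        else:
            raise ValueError(f"Wrong Data: {line!r}")
    return train_data

# Made-up input: two blocks of one row each, separated by blank lines.
sample = ["1 2 3 4 5 6 7 8 9 10 11 12 \n", "\n"] * 2
blocks = read_blocks(sample)
print(len(blocks), len(blocks[0]), len(blocks[0][0]))  # 2 29 12
```

With a real file, the same function could be called as read_blocks(f) inside a with open(...) block, since a file object yields its lines when iterated.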