Code not outputting to correct folder Python

I have some code that opens a text file containing a list of file paths, like this:

C:/Users/User/Desktop/mini_mouse/1980

C:/Users/User/Desktop/mini_mouse/1982

C:/Users/User/Desktop/mini_mouse/1984

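It then opens each of those files in turn, line by line, and applies some filtering to the contents. I want it to write the results to a completely different folder: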

output_location = 'C:/Users/User/Desktop/test2/'

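As it stands, my code writes the results to the location the original file was opened from, i.e. if it opens the file C:/Users/User/Desktop/mini_mouse/1980, the output ends up in that same location under the name '1980_filtered'. However, I want the output to go into output_location. Can anyone see where I am going wrong? Any help would be greatly appreciated! Here is my code: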

import os

def main():
    stop_words_path = 'C:/Users/User/Desktop/NLTK-stop-word-list.txt'
    stopwords = get_stop_words_list(stop_words_path)
    output_location = 'C:/Users/User/Desktop/test2/'

    list_file = 'C:/Users/User/Desktop/list_of_files.txt'

    with open(list_file, 'r') as f:
        for file_name in f:
            #print(file_name)
            if file_name.endswith('\n'):
                file_name = file_name[:-1]
            #print(file_name)
            file_path = os.path.join(file_name)  # joins the new path of the file to the current file in order to access the file

            filestring = ''  # file string which will take all the lines in the file and add them to itself
            with open(file_path, 'r') as f2:  # open the file
                print('just opened ' + file_name)
                print('\n')
                for line in f2:  # read file line by line
                    x = remove_stop_words(line, stopwords)  # remove stop words from line
                    filestring += x  # add newly filtered line to the file string
                    filestring += '\n'  # Create new line

            new_file_path = os.path.join(output_location, file_name) + '_filtered'  # creates a new file of the file that is currently being filtered of stopwords
            with open(new_file_path, 'a') as output_file:  # opens output file
                output_file.write(filestring)


if __name__ == "__main__":
    main()

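Assuming you are on Windows (since you have a normal Windows file system), the native separator in path names is the backslash. Note that this applies to Windows only. I know it's annoying, so I changed it for you (you're welcome :)). You also have to write each backslash twice, because otherwise Python tries to treat it as an escape character.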

import os

def main():
    # Backslashes are doubled so Python does not read them as escape sequences
    stop_words_path = 'C:\\Users\\User\\Desktop\\NLTK-stop-word-list.txt'
    stopwords = get_stop_words_list(stop_words_path)
    output_location = 'C:\\Users\\User\\Desktop\\test2\\'

    list_file = 'C:\\Users\\User\\Desktop\\list_of_files.txt'

    with open(list_file, 'r') as f:
        for file_name in f:
            #print(file_name)
            if file_name.endswith('\n'):
                file_name = file_name[:-1]
            #print(file_name)
            file_path = os.path.join(file_name)  # joins the new path of the file to the current file in order to access the file

            filestring = ''  # file string which will take all the lines in the file and add them to itself
            with open(file_path, 'r') as f2:  # open the file
                print('just opened ' + file_name)
                print('\n')
                for line in f2:  # read file line by line
                    x = remove_stop_words(line, stopwords)  # remove stop words from line
                    filestring += x  # add newly filtered line to the file string
                    filestring += '\n'  # Create new line

            new_file_path = os.path.join(output_location, file_name) + '_filtered'  # creates a new file of the file that is currently being filtered of stopwords
            with open(new_file_path, 'a') as output_file:  # opens output file
                output_file.write(filestring)


if __name__ == "__main__":
    main()
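
To make the escape-character point concrete, here is a small sketch comparing three ways of spelling the same Windows folder in Python source. The variable names are only illustrative. It shows that doubled backslashes and a raw string produce the same text, and that the forward-slash form used in the question needs no escaping at all.

```python
# Three spellings of the same Windows folder.
doubled = 'C:\\Users\\User\\Desktop\\test2\\'  # every backslash escaped
raw = r'C:\Users\User\Desktop\test2' + '\\'     # raw string; it cannot end in a lone backslash
forward = 'C:/Users/User/Desktop/test2/'        # forward slashes need no escaping

# The first two are character-for-character identical.
assert doubled == raw
# The forward-slash form differs only in the separator character.
assert forward.replace('/', '\\') == doubled

print(doubled)
```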

Judging by your code, the problem appears to be in this line:

new_file_path = os.path.join(output_location, file_name) + '_filtered'

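In Python's os.path.join(), any argument that is an absolute path (or, on Windows, starts with a drive letter) discards everything before it and restarts the join from that new absolute path (or drive letter). Since you take file_name straight from list_of_files.txt, and every path in that file is written out in full from the C: drive, each call to os.path.join() throws away output_location and resets to the original file path.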
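
The effect can be reproduced in isolation. The sketch below uses ntpath, which is the module os.path resolves to on Windows, so that it gives the same result on any platform; the two path values are the ones from the question.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

output_location = 'C:/Users/User/Desktop/test2/'
file_name = 'C:/Users/User/Desktop/mini_mouse/1980'  # one line from list_of_files.txt

# file_name is absolute, so output_location is discarded by the join.
joined = ntpath.join(output_location, file_name) + '_filtered'
print(joined)  # C:/Users/User/Desktop/mini_mouse/1980_filtered

# With a bare file name, the join keeps output_location as intended.
bare = ntpath.join(output_location, '1980') + '_filtered'
print(bare)  # C:/Users/User/Desktop/test2/1980_filtered
```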

For a better explanation of this behavior, see "Why doesn't os.path.join() work in this case?"

When building the output path, you can strip the file name, e.g. "1980", from the path "C:/Users/User/Desktop/mini_mouse/1980", and then join the output_location variable with that isolated file name.
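
A minimal sketch of that approach follows. The helper name build_output_path and its suffix parameter are hypothetical, introduced only for illustration; ntpath is used so the example behaves the same on any platform, and on Windows os.path.basename and os.path.join do the same thing.

```python
import ntpath  # what os.path resolves to on Windows


def build_output_path(output_location, source_path, suffix='_filtered'):
    """Hypothetical helper: path of the filtered copy inside output_location."""
    # Isolate the last path component, e.g. '.../mini_mouse/1980' -> '1980'.
    base_name = ntpath.basename(source_path)
    # base_name is relative, so output_location survives the join.
    return ntpath.join(output_location, base_name + suffix)


new_file_path = build_output_path(
    'C:/Users/User/Desktop/test2/',
    'C:/Users/User/Desktop/mini_mouse/1980',
)
print(new_file_path)  # C:/Users/User/Desktop/test2/1980_filtered
```

In the loop from the question, this would replace the line that assigns new_file_path. Note that two source files with the same name in different folders would map to the same output file under this scheme.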