Code not outputting to correct folder Python
I have some code that opens a text file containing a list of file paths, like so:
C:/Users/User/Desktop/mini_mouse/1980
C:/Users/User/Desktop/mini_mouse/1982
C:/Users/User/Desktop/mini_mouse/1984
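It then opens each of these files individually, reads it line by line, and does some filtering on it. I then want it to write the results to an entirely different folder, given by: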
output_location = 'C:/Users/User/Desktop/test2/'
As it stands, my code writes the results to the location the original file was opened from, i.e. if it opens the file C:/Users/User/Desktop/mini_mouse/1980, the output ends up in the same folder under the name '1980_filtered'. However, I would like the output to go into output_location. Can anyone see where I am going wrong? Any help would be greatly appreciated! Here is my code:

import os

def main():
    stop_words_path = 'C:/Users/User/Desktop/NLTK-stop-word-list.txt'
    stopwords = get_stop_words_list(stop_words_path)
    output_location = 'C:/Users/User/Desktop/test2/'
    list_file = 'C:/Users/User/Desktop/list_of_files.txt'

    with open(list_file, 'r') as f:
        for file_name in f:
            #print(file_name)
            if file_name.endswith('\n'):
                file_name = file_name[:-1]
            #print(file_name)
            file_path = os.path.join(file_name)  # joins the new path of the file to the current file in order to access the file
            filestring = ''  # file string which will take all the lines in the file and add them to itself
            with open(file_path, 'r') as f2:  # open the file
                print('just opened ' + file_name)
                print('\n')
                for line in f2:  # read file line by line
                    x = remove_stop_words(line, stopwords)  # remove stop words from line
                    filestring += x  # add newly filtered line to the file string
                    filestring += '\n'  # create new line
            new_file_path = os.path.join(output_location, file_name) + '_filtered'  # creates a new file for the file that is currently being filtered of stop words
            with open(new_file_path, 'a') as output_file:  # opens output file
                output_file.write(filestring)

if __name__ == "__main__":
    main()
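Assuming you are on Windows (since you have a normal Windows file system), the native path separator is the backslash, so I changed the paths for you (you're welcome :)). Note that this only applies to Windows, and that Python generally accepts forward slashes there as well. Each backslash has to be doubled, because a single backslash in a string literal is treated as the start of an escape sequence.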

import os

def main():
    # Backslashes are doubled so that sequences such as \U, \t and \' are not read as escapes
    stop_words_path = 'C:\\Users\\User\\Desktop\\NLTK-stop-word-list.txt'
    stopwords = get_stop_words_list(stop_words_path)
    output_location = 'C:\\Users\\User\\Desktop\\test2\\'
    list_file = 'C:\\Users\\User\\Desktop\\list_of_files.txt'

    with open(list_file, 'r') as f:
        for file_name in f:
            #print(file_name)
            if file_name.endswith('\n'):
                file_name = file_name[:-1]
            #print(file_name)
            file_path = os.path.join(file_name)  # joins the new path of the file to the current file in order to access the file
            filestring = ''  # file string which will take all the lines in the file and add them to itself
            with open(file_path, 'r') as f2:  # open the file
                print('just opened ' + file_name)
                print('\n')
                for line in f2:  # read file line by line
                    x = remove_stop_words(line, stopwords)  # remove stop words from line
                    filestring += x  # add newly filtered line to the file string
                    filestring += '\n'  # create new line
            new_file_path = os.path.join(output_location, file_name) + '_filtered'  # creates a new file for the file that is currently being filtered of stop words
            with open(new_file_path, 'a') as output_file:  # opens output file
                output_file.write(filestring)

if __name__ == "__main__":
    main()
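
To make the escaping point concrete, here is a minimal sketch. The variable names are illustrative only and not part of the original code; ntpath is the module that os.path refers to on Windows, used here so the snippet behaves the same on any platform.

```python
import ntpath  # os.path is this module when running on Windows

# Three ways of spelling the same Windows folder.
escaped = 'C:\\Users\\User\\Desktop\\test2\\'         # doubled backslashes
raw = r'C:\Users\User\Desktop\test2' + '\\'           # raw string; a raw literal cannot end in a single backslash
forward = 'C:/Users/User/Desktop/test2/'              # forward slashes, as in the question

print(escaped == raw)                                          # True
print(ntpath.normpath(forward) == ntpath.normpath(escaped))    # True

# What goes wrong with a single backslash: \t is read as a tab character.
tab_hazard = 'Desktop\test2'
print('\t' in tab_hazard)                                      # True
```

Since the forward-slash spelling normalizes to the same path, the separators in the question are probably not what sends the output to the wrong folder.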
Based on your code, the problem appears to be in this line:
new_file_path = os.path.join(output_location, file_name) + '_filtered'
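In Python's os.path.join(), any argument that is an absolute path (or that starts with a drive letter on Windows) discards everything before it and restarts the join from that absolute path (or drive letter). Since you take file_name directly from list_of_files.txt, and every path in that file is an absolute path on the C: drive, each call to os.path.join() drops output_location and resets to the original file path.
For a better explanation of this behavior, see "Why doesn't os.path.join() work in this case?".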
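
The reset behavior can be reproduced in a few lines. This sketch uses ntpath, the module that os.path refers to on Windows, so that it gives the same result on any platform; the paths are the ones from the question.

```python
import ntpath  # os.path is this module when running on Windows

output_location = 'C:/Users/User/Desktop/test2/'
file_name = 'C:/Users/User/Desktop/mini_mouse/1980'  # absolute path read from the list file

# The second argument is absolute, so the first argument is discarded.
joined = ntpath.join(output_location, file_name)
print(joined)                # C:/Users/User/Desktop/mini_mouse/1980
print(joined + '_filtered')  # C:/Users/User/Desktop/mini_mouse/1980_filtered
```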
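When building the output path, you could isolate the file name, "1980" for example, from the path "C:/Users/User/Desktop/mini_mouse/1980", and then join the output_location variable with that isolated file name.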
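
One way this suggestion might look in code is sketched below. The helper name build_output_path and its suffix parameter are hypothetical, not from the original post. ntpath is used so the example runs identically on any platform; on Windows, os.path.basename and os.path.join are the same functions.

```python
import ntpath  # os.path is this module when running on Windows


def build_output_path(output_location, source_path, suffix='_filtered'):
    """Return a path inside output_location named after the source file (hypothetical helper)."""
    # Drop the trailing newline left over from reading the list file,
    # then keep only the final path component, e.g. '1980'.
    name = ntpath.basename(source_path.rstrip('\n'))
    # Neither argument after the first is absolute, so output_location is kept.
    return ntpath.join(output_location, name + suffix)


new_file_path = build_output_path(
    'C:/Users/User/Desktop/test2/',
    'C:/Users/User/Desktop/mini_mouse/1980\n',
)
print(new_file_path)  # C:/Users/User/Desktop/test2/1980_filtered
```

In the question's loop, this would replace the line that assigns new_file_path, assuming the file names in the list are unique across source folders.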