Iterating through directory error: FileNotFoundError: [Errno 2] No such file or directory
I wrote a script to iterate over several text files in a directory and count, for each file, how many of the words it contains also appear in a dictionary file. I wrote and tested the script with two files in the directory and it ran perfectly, printing one accurate integer per file. However, as soon as I add a new file to the directory, I get a FileNotFoundError. The file is definitely there! Can anyone tell me what in my code is causing this? I've gone through various similar posts on Stack Overflow with no success. The newly added file has all the same properties as the existing two.
Code (word_count_from_dictionary_iterating.py):
import os
import sys
import nltk
nltk.download()
from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
import io
files_path = sys.argv[1]
textfile_dictionary = sys.argv[2]
for filename in os.listdir(files_path):
    if filename.endswith(".txt"):
        #accessing file for processing
        file = open(filename, "rt")
        text = file.read()
        #tokenize text file
        tokens = word_tokenize(text)
        #remove non-alphabetical characters
        words = []
        for word in tokens:
            if word.isalpha():
                words.append(word)
        #remove stopwords
        stop_words = stopwords.words("english")
        words_without_stops = []
        for w in words:
            if not w in stop_words:
                words_without_stops.append(w)
        #lemmatize remaining tokens and print
        lemmatizer = WordNetLemmatizer()
        lemmas = []
        for x in words_without_stops:
            lemmatizer.lemmatize(x)
            lemmas.append(x)
        #turn dictionary held in text file into a list of tokens
        file = io.open(textfile_dictionary, mode="r", encoding="utf8")
        dictionaryread = file.read()
        dictionary = dictionaryread.split()
        #count instances of each word in dictionary in the novel and add them up
        word_count = 0
        for element in dictionary:
            for lemma in lemmas:
                if lemma == element:
                    word_count = word_count + 1
        print(word_count)
Command-line results with only the two test files in the directory:
c@Computer:~/Dropbox/programming/first_project$ python3 word_count_from_dictionary_iterating.py directoryaddress dictionary.txt
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
241
229
Results after adding the new file (newfile.txt) to the directory:
c@Computer:~/Dropbox/programming/first_project$ python3 word_count_from_dictionary_iterating.py directoryaddress happy_words.txt
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
241
229
Traceback (most recent call last):
File "word_count_from_dictionary_iterating.py", line 17, in <module>
file = open(filename, "rt")
FileNotFoundError: [Errno 2] No such file or directory: 'newfile.txt'
If I run ls on the directory, the file shows up. And if I run the non-iterating version of the script directly on newfile.txt, it works. It just fails when looping through the directory.
Any help is appreciated, I'm new to programming.
The problem is that when you run file = open(filename, "rt"), Python looks for filename in the directory you started Python from (~/Dropbox/programming/first_project/), but you want it to read from ~/Dropbox/programming/first_project/directoryaddress.
To make sure you read the correct file, either pass its full path as filename, or, if you know it will always live in a particular subdirectory, prepend that path to the filename before opening it: file = open(files_path + "/" + filename, "rt") (there are cleaner ways to combine paths, e.g. the standard library pathlib).
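For illustration, a minimal sketch of the corrected loop using os.path.join to build the full path; the per-file processing is abbreviated here, and everything else would stay as in the question's script:
import os
import sys

files_path = sys.argv[1]  # directory given on the command line

for filename in os.listdir(files_path):
    if filename.endswith(".txt"):
        # Join the directory and the bare filename so open() finds the
        # file regardless of the current working directory.
        full_path = os.path.join(files_path, filename)
        with open(full_path, "rt") as file:
            text = file.read()
        # ...tokenize, remove stopwords, lemmatize and count as before...
The same idea with pathlib, which also takes care of the ".txt" filtering:
from pathlib import Path
import sys

for path in Path(sys.argv[1]).glob("*.txt"):
    text = path.read_text(encoding="utf8")
    # ...process text as before...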