Permission denied error while reading the GoogleNews-vectors-negative300.bin file

Question

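I am trying to read different word embedding models, such as GloVe, fastText, and word2vec, in order to detect sarcasm, but I am unable to read Google's embedding file. It gives a permission denied error. What should I do?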

I tried different encodings and granted all permissions on the file, but it still does not work.

EMBEDDING_FILE = 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'
def get_coefs(word, *arr): return word, np.asarray(arr, dtype='float32')
embeddings_index = dict(get_coefs(*o.rstrip().rsplit(' ')) for o in open(EMBEDDING_FILE,encoding="ISO-8859-1"))
embed_size = 300
word_index = tokenizer.word_index
nb_words = min(max_features, len(word_index))
embedding_matrix = np.zeros((nb_words, embed_size))
for word, i in word_index.items():
    if i >= max_features: continue
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None: embedding_matrix[i] = embedding_vector


PermissionError                           Traceback (most recent call last)
<ipython-input-10-5d122ae40ef0> in <module>
      1 EMBEDDING_FILE = 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'
      2 def get_coefs(word, *arr): return word, np.asarray(arr, dtype='float32')
----> 3 embeddings_index = dict(get_coefs(*o.rstrip().rsplit(' ')) for o in open(EMBEDDING_FILE,encoding="ISO-8859-1"))
      4 embed_size = 300
      5 word_index = tokenizer.word_index

PermissionError: [Errno 13] Permission denied: 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'

Answer 1

You would likely get the same IO-related error no matter how, or for what purpose, you tried to open the file, so this is not really about nlp or word2vec, or even jupyter-notebook.

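Note that errors we would normally think of as other problems are sometimes reported as "permission" problems, because in a sense you are not allowed to do that operation on that kind of path or file.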

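You have specified the file path as 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/', and a trailing / usually indicates that something is a directory. That may be the problem, so try removing it.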
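A minimal sketch of how you might check this, assuming nothing beyond the standard library. The helper names strip_trailing_separators and describe_path are hypothetical, made up for illustration, and are not part of any library.

```python
import os


def strip_trailing_separators(path):
    """Return path without trailing '/' or '\\' characters.

    Hypothetical helper for file paths; it is not meant for drive
    roots such as 'C:/'. A path made only of separators is returned
    unchanged.
    """
    stripped = path.rstrip("/\\")
    return stripped if stripped else path


def describe_path(path):
    """Report whether path is a 'directory', a 'file', or 'missing'."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing"


if __name__ == "__main__":
    # Path copied from the question, including the suspicious trailing slash.
    original = "C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/"
    cleaned = strip_trailing_separators(original)
    print(cleaned, "->", describe_path(cleaned))
```

If describe_path reports "file" for the cleaned path, pass the cleaned path to open() instead of the original one. If it reports "directory", the download was probably unpacked into a folder with that name and the real file sits inside it.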

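Also, I believe this particular file is typically more than 3 GB in size. Some DOS-descended filesystems, or Python interpreters that are only 32-bit, may have problems handling files over a certain size, such as 2 GB or 4 GB.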
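To rule out the size issue, you could check whether your interpreter is 32-bit or 64-bit and compare the file size against those thresholds. This is only a sketch: interpreter_bits and exceeds_limit are hypothetical helper names, and the 2 GB and 4 GB limits are the rough figures mentioned above, not exact values for any particular filesystem.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running Python interpreter in bits."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


def exceeds_limit(size_bytes, limit_gib):
    """Return True if size_bytes is larger than limit_gib gibibytes."""
    return size_bytes > limit_gib * 1024 ** 3


if __name__ == "__main__":
    print("Python build:", interpreter_bits(), "bit")
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    print("sys.maxsize > 2**32:", sys.maxsize > 2 ** 32)
```

You can get the size to test from os.path.getsize(path) once the path itself is correct, for example exceeds_limit(os.path.getsize(path), 2). A 32-bit build combined with a file over the limit would point to this explanation rather than to permissions.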

