gensim Getting Started Error: No such file or directory: 'text8'
I'm learning the word2vec and GloVe models in Python, so I'm going through the material available here.
After running this code step by step in IDLE 3:
>>> from gensim.models import word2vec
>>> import logging
>>> logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
>>> sentences = word2vec.Text8Corpus('text8')
>>> model = word2vec.Word2Vec(sentences, size=200)
I get this error:
2017-01-13 11:15:41,471 : INFO : collecting all words and their counts
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
model = word2vec.Word2Vec(sentences, size=200)
File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 469, in __init__
self.build_vocab(sentences, trim_rule=trim_rule)
File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 533, in build_vocab
self.scan_vocab(sentences, progress_per=progress_per, trim_rule=trim_rule) # initial survey
File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 545, in scan_vocab
for sentence_no, sentence in enumerate(sentences):
File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 1536, in __iter__
with utils.smart_open(self.fname) as fin:
File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 127, in smart_open
return file_smart_open(parsed_uri.uri_path, mode)
File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 558, in file_smart_open
return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'text8'
How can I fix this?
Thanks in advance for your help.
It seems you're missing the file used here. Specifically, it's trying to open text8 but can't find it (hence the FileNotFoundError).
You can download the file itself from here, as stated in the documentation for Text8Corpus:
Docstring:
Iterate over sentences from the "text8" corpus, unzipped from http://mattmahoney.net/dc/text8.zip .
Make it available by extracting it, then pass its path as the argument to Text8Corpus:
sentences = word2vec.Text8Corpus('/path/to/text8')
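If you'd rather do the download and extraction from Python, here is a minimal sketch using only the standard library. The helper name fetch_text8 and its parameters are made up for illustration, and it assumes the archive at the docstring's URL contains a single member named text8. It returns an absolute path, because a bare 'text8' is looked up relative to whatever directory the interpreter was started in, which is likely why IDLE could not find it.

```python
import os
import urllib.request
import zipfile

# URL quoted in the Text8Corpus docstring.
TEXT8_URL = "http://mattmahoney.net/dc/text8.zip"


def fetch_text8(target_dir=".", url=TEXT8_URL):
    """Hypothetical helper: make sure target_dir/text8 exists, return its absolute path."""
    target_dir = os.path.abspath(target_dir)
    corpus_path = os.path.join(target_dir, "text8")

    # Nothing to do if the corpus was already extracted.
    if os.path.isfile(corpus_path):
        return corpus_path

    os.makedirs(target_dir, exist_ok=True)
    zip_path = os.path.join(target_dir, "text8.zip")

    # Download the archive only if it is not already on disk.
    if not os.path.isfile(zip_path):
        urllib.request.urlretrieve(url, zip_path)

    # Assumption: the archive holds one member called "text8".
    with zipfile.ZipFile(zip_path) as archive:
        archive.extract("text8", target_dir)

    return corpus_path
```

You would then call path = fetch_text8() once and pass that path to word2vec.Text8Corpus(path) instead of the bare 'text8'.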