Returns an empty list instead of bigrams
The code below returns the expected output:
[('the', 23135851162), ('of', 13151942776), ('and', 12997637966),
('to', 12136980858), ('a', 9081174698)]
from itertools import islice
import pkg_resources
from symspellpy import SymSpell
sym_spell = SymSpell()
dictionary_path = pkg_resources.resource_filename(
"symspellpy", "frequency_dictionary_en_82_765.txt")
sym_spell.load_dictionary(dictionary_path, 0, 1)
# Print out first 5 elements to demonstrate that dictionary is
# successfully loaded
print(list(islice(sym_spell.words.items(), 5)))
However, the next piece of code returns an empty list.
from itertools import islice
import pkg_resources
from symspellpy import SymSpell
sym_spell = SymSpell()
dictionary_path = pkg_resources.resource_filename(
"symspellpy", "frequency_dictionary_en_82_765.txt")
sym_spell.load_bigram_dictionary(dictionary_path, 0, 2)
# Print out first 5 elements to demonstrate that dictionary is
# successfully loaded
print(list(islice(sym_spell.bigrams.items(), 5)))
Expected output:
[('abcs of', 10956800), ('aaron and', 10721728), ('abbott and',
7861376), ('abbreviations and', 13518272), ('aberdeen and', 7347776)]
This is according to the following page:
https://symspellpy.readthedocs.io/en/latest/examples/dictionary.html
I would like to know what mistake I am making in the second piece of code.
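The second example, both on the linked page and in your question, refers to the wrong data file. You need to point it at the bigram data file that ships with the package.
The documentation that explains the examples shows the expected data format for each one, and the two formats differ. Yet both examples reference the same data file, so one of them has to be wrong. The error is in the second example, which should reference the bigram data file.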
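To illustrate why the wrong file gives an empty result rather than an error, here is a minimal sketch. It is a simplified stand-in, not symspellpy's actual loader: `parse_bigram_lines` is a hypothetical helper, and it assumes whitespace-separated lines where a unigram entry is `term count` and a bigram entry is `word1 word2 count`. The sample lines are taken from the outputs shown in the question.

```python
def parse_bigram_lines(lines, term_index=0, count_index=2):
    """Hypothetical, simplified bigram parser (not symspellpy's code).

    Assumes each bigram line looks like "word1 word2 count".
    """
    bigrams = {}
    for line in lines:
        parts = line.split()
        # A bigram entry needs two words plus a count; shorter lines are skipped.
        if len(parts) < 3:
            continue
        try:
            count = int(parts[count_index])
        except ValueError:
            # Skip lines whose count column is not an integer.
            continue
        key = f"{parts[term_index]} {parts[term_index + 1]}"
        bigrams[key] = count
    return bigrams


# Lines in the unigram format ("term count"): only two columns each.
unigram_lines = ["the 23135851162", "of 13151942776"]
# Lines in the bigram format ("word1 word2 count"): three columns each.
bigram_lines = ["abcs of 10956800", "aaron and 10721728"]

unigram_result = parse_bigram_lines(unigram_lines)
bigram_result = parse_bigram_lines(bigram_lines)

print(unigram_result)  # {}
print(bigram_result)   # {'abcs of': 10956800, 'aaron and': 10721728}
```

Under these assumptions every line of the unigram file is too short to be a bigram entry, so each one is silently skipped and the resulting dictionary stays empty.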
Here is the complete working code:
from itertools import islice
import pkg_resources
from symspellpy import SymSpell
sym_spell = SymSpell()
dictionary_path = pkg_resources.resource_filename(
"symspellpy", "frequency_bigramdictionary_en_243_342.txt")  # fixed: refer to the bigram data file
sym_spell.load_bigram_dictionary(dictionary_path, 0, 2)
# Print out first 5 elements to demonstrate that dictionary is
# successfully loaded
print(list(islice(sym_spell.bigrams.items(), 5)))
Result:
[('abcs of', 10956800), ('aaron and', 10721728), ('abbott and', 7861376), ('abbreviations and', 13518272), ('aberdeen and', 7347776)]