TypeError: "hypothesis" expects pre-tokenized hypothesis (Iterable[str]):

Question

I am trying to compute the METEOR score for the following:

print (nltk.translate.meteor_score.meteor_score(
    ["this is an apple", "that is an apple"], "an apple on this tree"))

But I get this error every time, and I don't know how to fix it.

TypeError: "hypothesis" expects pre-tokenized hypothesis (Iterable[str]): an apple on this tree

I also tried putting "an apple on this tree" in a list:

from nltk.translate.meteor_score import meteor_score
import nltk
print (nltk.translate.meteor_score.meteor_score(
    ["this is an apple", "that is an apple"], ["an apple on this tree"]))

But that gave me this error instead.

TypeError: "reference" expects pre-tokenized reference (Iterable[str]): this is an apple

Answer 1

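Looking at the library code, the hypothesis is expected to be an iterable of tokens rather than a plain string (see https://www.nltk.org/_modules/nltk/translate/meteor_score.html). The error comes from: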

if isinstance(hypothesis, str):
    raise TypeError(
        f'"hypothesis" expects pre-tokenized hypothesis (Iterable[str]): {hypothesis}'
    )

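Try passing "an apple on this tree" as a list of tokens, one word per element. Judging by the second error in the question, each reference sentence has to be a list of tokens as well.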
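To illustrate the difference between wrapping a sentence in a list and actually tokenizing it, here is a minimal sketch. It uses plain `str.split()` as a stand-in tokenizer, which is an assumption made for brevity; the variable names are only illustrative.

```python
sentence = "an apple on this tree"

# Wrapping the sentence in a list gives a single "token":
# the whole sentence as one string.
wrapped = [sentence]

# Splitting on whitespace gives one token per word.
tokens = sentence.split()

print(len(wrapped), wrapped)
print(len(tokens), tokens)
```

The wrapped form gets past the `isinstance(hypothesis, str)` check, but it is scored as a one-token sentence, which is probably not what was intended.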

Answer 2

Actually, I think the correct answer to the question is to tokenize the sentences before calling the function. For example:

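    from nltk.tokenize import word_tokenize
    from nltk.translate.meteor_score import meteor_score

    m_score = 0.0  # running sum of the per-sentence METEOR scores
    for ref_sentence, hypo_sentence in zip(refs, hypos):
        ref = word_tokenize(ref_sentence)    # reference as a list of tokens
        hypo = word_tokenize(hypo_sentence)  # hypothesis as a list of tokens
        m_score += meteor_score([ref], hypo)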

Here refs and hypos are lists of sentence strings, one sentence per element.
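Applied to the sentences from the question, the same idea might look like the sketch below. The helper names `tokenize` and `meteor_for` are hypothetical, and `str.split()` is used instead of `word_tokenize` only to avoid needing the extra tokenizer data; the NLTK import is kept inside the function because `meteor_score` needs NLTK and its WordNet corpus to be installed.

```python
def tokenize(sentence):
    """Whitespace tokenization; a simple stand-in for nltk's word_tokenize."""
    return sentence.split()


def meteor_for(references, hypothesis):
    """Tokenize raw sentences, then score them with NLTK's meteor_score.

    Assumes NLTK is installed and the WordNet data has been downloaded,
    e.g. via nltk.download("wordnet").
    """
    from nltk.translate.meteor_score import meteor_score

    tokenized_refs = [tokenize(ref) for ref in references]  # list of token lists
    tokenized_hyp = tokenize(hypothesis)                    # single token list
    return meteor_score(tokenized_refs, tokenized_hyp)


references = ["this is an apple", "that is an apple"]
hypothesis = "an apple on this tree"

# These are the shapes that meteor_score expects to receive.
tokenized_refs = [tokenize(ref) for ref in references]
tokenized_hyp = tokenize(hypothesis)
print(tokenized_refs)
print(tokenized_hyp)
```

With NLTK and WordNet available, calling `meteor_for(references, hypothesis)` should return a float instead of raising either TypeError from the question.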

python

metrics

nlp

nltk