Model for measuring grammatical text quality

Question

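I generate text with a transformer model and am looking for a way to measure its grammatical quality. For example, the text "Today is a good day. I slept well and got up good in the morning." should be rated higher than "Yesterday I went into bed and. got Breakfast son."

Is there a model that does this which I have not found yet, or is there some other way to measure the grammatical quality of generated text?

I found that spaCy has an option to show whether a text contains grammatical errors, but I am more interested in a score that takes into account both the length of the text and the number of errors it contains. I also looked into the readability measures in NLTK, but those target how understandable a text is, which depends on more than grammar.

Thanks!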

Answer 1

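So I found what I was looking for: in this paper the researchers tested different measures for their ability to detect grammar mistakes in text without references (unlike the GLEU score, which needs references). They also tested python-language-tool, the tool that is also used for spell checking in OpenOffice. It can count the grammatical errors in a text. For my purposes, I divide the number of errors by the number of words in the text, which gives an error measure.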

Maybe this helps someone who runs into the same problem. Here is some example code, based on the PyPI page:

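import language_tool_python

# Start a LanguageTool instance for US English
tool = language_tool_python.LanguageTool('en-US')
text = "this is a test tsentence, to check if all erors are found"
# Each match is one detected spelling or grammar issue
matches = tool.check(text)
print(len(matches))  # I get 3 for this sentence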
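The error measure described above (number of errors divided by number of words) could be sketched roughly as follows. The helper names `count_words`, `error_rate` and `score_with_languagetool` are made up for illustration, and the regex-based word count is an assumption; the answer does not say how words were counted. Only `LanguageTool(...)`, `check` and `close` come from language_tool_python.

```python
import re
from typing import Callable, Dict, Iterable, Sequence


def count_words(text: str) -> int:
    """Count word tokens with a simple regex (an assumption, not a real tokenizer)."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def error_rate(text: str, check: Callable[[str], Sequence]) -> float:
    """Return detected errors per word; 0.0 if the text has no words.

    `check` is any callable returning one item per detected error,
    for example the `check` method of a LanguageTool instance.
    """
    n_words = count_words(text)
    if n_words == 0:
        return 0.0
    return len(check(text)) / n_words


def score_with_languagetool(texts: Iterable[str]) -> Dict[str, float]:
    """Score several texts with LanguageTool (hypothetical helper).

    Imported lazily because language_tool_python needs Java and
    fetches LanguageTool the first time it is used.
    """
    import language_tool_python

    tool = language_tool_python.LanguageTool('en-US')
    try:
        return {t: error_rate(t, tool.check) for t in texts}
    finally:
        tool.close()  # shut down the background LanguageTool server
```

Calling `score_with_languagetool` on the two sentences from the question returns one rate per sentence. A lower value means fewer detected errors per word, so if a "higher is better" score is wanted, something like `1 - rate` could be used instead. Very short texts give noisy rates, since a single match changes the value a lot.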

Tags: python, evaluation, nlp, nlg, tensorflow