Why does the URLGetRelations API mislabel some sentences as "future" tense?

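I want to use the URLGetRelations API to help identify future-tense sentences in text. However, I've found inaccuracies in the sentences the API identifies as future tense. The examples below were all labeled "future", but arguably none of them are. I also see some garbled text in the API responses below (for example, "revivehim" in the first article); perhaps that is what causes the mislabeling? However, if you look at the URLs I pointed the API at, the garbling does not appear in the original source text.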

One from: http://www.reuters.com/article/new-york-police-idUSL2N15R02C

{ "sentence": " \"Oh my God,someone's hit,\" a tearful Liang recalled saying upon finding a bleedingGurley lying on a landing, as his girlfriend frantically tried to revivehim.", "subject": { "text": "his girlfriend"}, "action": { "text": "tried to revive","lemmatized": "try to revive", "verb": {"text": "revive", "tense": "future" } },"object": { "text": "him","sentimentFromSubject": { "type": "negative","score": "-0.70197" } } },

Two from: http://www.cnn.com/2016/02/11/us/nypd-officer-trial/

{ "sentence": " On Thursdayevening, about an hour before the verdict, the jury asked Justice Danny Chun toread them the charges and legal definitions, the second time this week.","subject": { "text": "Justice Danny Chun" },"action": { "text": "to read","lemmatized": "to read", "verb": {"text": "read", "tense": "future" } },"object": { "text": "the charges and legaldefinitions", "sentiment": { "type":"negative", "score": "-0.597878" } } },

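I think the word "to" is what causes the confusion. It's common to see future-tense verb phrases that contain "to", such as "I am going to eat that later" and "We are planning to fly tonight." You'll also see headline-style phrasing like "Joe to appear on TV tonight", which implies future tense even though it isn't 100% grammatically correct. In the cases you shared, "to" marks an infinitive, but because the structure closely resembles those future-tense verb phrases, the relations are classified as future. The second example makes it easy to see how this happens: the subject is "Justice Danny Chun", the action is "to read", and the object is "the charges...". The system treats this as a sentence that reads "Justice Danny Chun to read the charges", which gives us future tense.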
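
If you need a workaround on your side, one option is to post-filter the relations and flag "future" labels whose action text looks like a bare infinitive. The following is only a rough sketch, not anything the API provides: the function name `is_suspect_future` and the list of future cues are my own assumptions, and the only thing taken from the API is the field layout shown in your responses (`action.text`, `action.verb.tense`).

```python
import re

# Illustrative (assumed) list of cues that usually signal a real future reading.
FUTURE_CUES = re.compile(
    r"\b(will|shall|going to|planning to|about to)\b", re.IGNORECASE
)
# "to" followed by a word, e.g. "to read" or "tried to revive".
INFINITIVE = re.compile(r"\bto\s+\w+", re.IGNORECASE)


def is_suspect_future(relation):
    """Return True if a relation is tagged "future" but its action text
    contains an infinitive and none of the explicit future cues."""
    action = relation.get("action", {})
    if action.get("verb", {}).get("tense") != "future":
        return False
    text = action.get("text", "")
    has_infinitive = INFINITIVE.search(text) is not None
    has_cue = FUTURE_CUES.search(text) is not None
    return has_infinitive and not has_cue


# Trimmed-down versions of the two relations from the question.
reuters = {"action": {"text": "tried to revive",
                      "verb": {"text": "revive", "tense": "future"}}}
cnn = {"action": {"text": "to read",
                  "verb": {"text": "read", "tense": "future"}}}

for rel in (reuters, cnn):
    print(rel["action"]["text"], is_suspect_future(rel))
```

Both of your examples are flagged by this check. It is a heuristic, so it will also flag headline-style phrasing such as "Joe to appear on TV tonight", where the future reading is arguably intended; treat the flag as a prompt for review rather than a correction.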