Wordnet query to return example sentences

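I have a use case where I need to know the following:

  1. The synonyms of the word (just the synonyms are enough)
  2. All senses of the word, where each sense contains: the synonyms that match the word in that sense, example sentences for that sense (if any), and the part of speech for that sense.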

Example: this query link for the word carry (screenshot):

For each 'sense', we have the part of speech (such as V), the synonyms matching that sense (such as transport in the first sense; pack, take in the second sense, etc.), and example sentences containing the word in that sense (This train is carrying nuclear waste, carry the suitcase to the car, etc. in the first sense; I always carry money, etc. in the second sense).

How can I do this from the WordNet MySQL database? I ran this query, which returns the list of definitions for the word:

SELECT a.lemma, c.definition FROM words a INNER JOIN senses b ON a.wordid = b.wordid INNER JOIN synsets c ON b.synsetid = c.synsetid WHERE a.lemma = 'carry';

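How do I get, for each sense, the synonyms specific to that sense, the example sentences, and the part of speech? I queried the vframesentences and vframesentencemaps tables, saw example sentences with placeholders such as %s, and tried to match them against the words table on the wordid column, but got very wrong results.

Edit:

For the word carry, if I run these queries, I correctly get the synonyms and the senses: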

1. select * from words where lemma='carry'; -- yields wordid 21354
2. select * from senses where wordid=21354; -- yields 41 synsetids, e.g. 201062889
3. select * from synsets where synsetid=201062889; -- yields the definition "serve as a means for expressing something"
4. select * from senses where synsetid=201062889; -- yields all matching synonyms for that sense as wordids, including "carry" itself: 21354, 29630, 45011
5. select * from words where wordid=29630; -- yields 'convey'

So what I need now is a way to find the example sentences for the word carry in each of its 41 senses. How can I do that?

You can get the sentences from the samples table. For example:

SELECT sample FROM samples WHERE synsetid = 201062889;

Yields:

The painting of Mary carries motherly love

His voice carried a lot of anger

So you can extend your query as follows:

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    (SELECT 
            GROUP_CONCAT(a1.lemma)
        FROM
            words a1
                INNER JOIN
            senses b1 ON a1.wordid = b1.wordid
        WHERE
            b1.synsetid = b.synsetid
                AND a1.lemma <> a.lemma
        GROUP BY b.synsetid) AS `synonyms`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;

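Note: the subselect with GROUP_CONCAT returns the synonyms for each sense as a comma-separated list in a single row, to cut down the number of rows. If you prefer, you could return these in a separate query (or as part of this query, but with everything else duplicated).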
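For the "separate queries" option mentioned in the note, here is a minimal sketch in Python of how the pieces could be merged per sense in application code. It runs against an in-memory SQLite database rather than the real WordNet MySQL database, so the table layout is a toy stand-in: only the IDs for carry, convey and synset 201062889 come from the question, while the second sense, the wordid for transport, the sampleid column and the helper name `lookup` are assumptions made up for illustration. One side effect of this approach is that senses with no row in samples are kept, whereas an inner join on samples drops them.

```python
import sqlite3

# Toy stand-in for the WordNet tables, with only the columns used here.
# IDs for 'carry', 'convey' and synset 201062889 come from the question;
# the second synset and the 'transport' wordid are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sampleid INTEGER, sample TEXT);
INSERT INTO words VALUES (21354, 'carry'), (29630, 'convey'), (1, 'transport');
INSERT INTO synsets VALUES
  (201062889, 'v', 'serve as a means for expressing something'),
  (900000001, 'v', 'toy sense without example sentences (invented)');
INSERT INTO senses VALUES (21354, 201062889), (29630, 201062889),
                          (21354, 900000001), (1, 900000001);
INSERT INTO samples VALUES
  (201062889, 1, 'The painting of Mary carries motherly love'),
  (201062889, 2, 'His voice carried a lot of anger');
""")


def lookup(conn, lemma):
    """Return {synsetid: {pos, definition, synonyms, examples}} for a lemma."""
    senses = {}
    # Query 1: one row per sense of the word.
    for synsetid, pos, definition in conn.execute(
        """SELECT c.synsetid, c.pos, c.definition
           FROM words a
           JOIN senses b ON a.wordid = b.wordid
           JOIN synsets c ON b.synsetid = c.synsetid
           WHERE a.lemma = ?""",
        (lemma,),
    ):
        senses[synsetid] = {"pos": pos, "definition": definition,
                            "synonyms": [], "examples": []}
    if not senses:
        return senses

    ids = list(senses)
    placeholders = ",".join("?" * len(ids))

    # Query 2: the other lemmas that share each of those synsets.
    for synsetid, synonym in conn.execute(
        f"""SELECT b.synsetid, a.lemma
            FROM senses b
            JOIN words a ON a.wordid = b.wordid
            WHERE b.synsetid IN ({placeholders}) AND a.lemma <> ?
            ORDER BY a.lemma""",
        ids + [lemma],
    ):
        senses[synsetid]["synonyms"].append(synonym)

    # Query 3: example sentences; senses without any simply keep an empty list.
    for synsetid, sample in conn.execute(
        f"""SELECT synsetid, sample FROM samples
            WHERE synsetid IN ({placeholders})
            ORDER BY synsetid, sampleid""",
        ids,
    ):
        senses[synsetid]["examples"].append(sample)
    return senses


result = lookup(conn, "carry")
for synsetid, info in result.items():
    print(synsetid, info)
```

Each sense then appears exactly once, with its synonyms and example sentences as lists, so nothing is duplicated across rows.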

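Update: If you really need the synonyms as rows in the result, the following will do it, but I wouldn't recommend it: synonyms and example sentences both belong to a particular definition, so the set of synonyms is repeated for each example sentence. E.g. if a particular definition has 4 example sentences and 5 synonyms, the result will contain 4 x 5 = 20 rows for that definition alone.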

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    subq.lemma AS `synonym`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
        LEFT JOIN
    (SELECT 
        a1.lemma, b1.synsetid
    FROM
        senses b1
    INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = b.synsetid
        AND subq.lemma <> a.lemma
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;
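The row multiplication described in the update can be checked on toy data. The sketch below is only an illustration: it uses an in-memory SQLite database instead of the WordNet MySQL database, and the single synset, the lemmas syn1 to syn5 and the four sample sentences are all invented. It runs a query of the same shape as the one above (synonyms as rows) and, for comparison, a GROUP_CONCAT variant that keeps one row per example sentence.

```python
import sqlite3

# Toy data (all invented): one synset containing 'carry' plus 5 synonyms,
# with 4 example sentences attached to it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sample TEXT);
INSERT INTO synsets VALUES (1, 'v', 'toy definition');
""")
lemmas = ["carry"] + [f"syn{i}" for i in range(1, 6)]
for wordid, lemma in enumerate(lemmas, start=1):
    conn.execute("INSERT INTO words VALUES (?, ?)", (wordid, lemma))
    conn.execute("INSERT INTO senses VALUES (?, 1)", (wordid,))
for i in range(1, 5):
    conn.execute("INSERT INTO samples VALUES (1, ?)", (f"toy example {i}",))

# Synonyms as rows: every example sentence is paired with every synonym.
rows = conn.execute("""
    SELECT a.lemma, c.definition, c.pos, d.sample, subq.lemma
    FROM words a
    JOIN senses b ON a.wordid = b.wordid
    JOIN synsets c ON b.synsetid = c.synsetid
    JOIN samples d ON b.synsetid = d.synsetid
    LEFT JOIN (SELECT a1.lemma, b1.synsetid
               FROM senses b1
               JOIN words a1 ON a1.wordid = b1.wordid) subq
           ON subq.synsetid = b.synsetid AND subq.lemma <> a.lemma
    WHERE a.lemma = 'carry'
""").fetchall()

# Synonyms collapsed into one comma-separated column: one row per sentence.
compact_rows = conn.execute("""
    SELECT d.sample,
           (SELECT GROUP_CONCAT(a1.lemma)
            FROM words a1
            JOIN senses b1 ON a1.wordid = b1.wordid
            WHERE b1.synsetid = b.synsetid AND a1.lemma <> a.lemma) AS synonyms
    FROM words a
    JOIN senses b ON a.wordid = b.wordid
    JOIN samples d ON b.synsetid = d.synsetid
    WHERE a.lemma = 'carry'
""").fetchall()

print(len(rows), len(compact_rows))
```

With 4 sentences and 5 synonyms, the first query produces 20 rows and the second produces 4, which is why the comma-separated form is usually easier to work with.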