How to create a custom AnalyzerFactory in GraphDB full-text search?

Question

(Using GraphDB Free 8.1.) http://graphdb.ontotext.com/documentation/free/full-text-search.html says I can enable a custom AnalyzerFactory for GraphDB full-text search through the luc:analyzer parameter, by implementing the interface com.ontotext.trree.plugin.lucene.AnalyzerFactory. However, I cannot find this interface anywhere. It is not in the jar graphdb-free-runtime-8.1.0.jar.

I checked the feature matrix at http://ontotext.com/products/graphdb/editions/#feature-comparison-table, and it seems that the "Connectors Lucene" feature is available in the free edition of GraphDB.

Which jar contains the com.ontotext.trree.plugin.lucene.AnalyzerFactory interface? What do I need to import in my project to implement it?

Does GraphDB ship with any pre-existing AnalyzerFactories for using other Lucene analyzers? (I am interested in using FrenchAnalyzer.)

Thanks!

Answer 1

GraphDB offers two distinct Lucene-based plugins.

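The Lucene FTS plugin indexes RDF molecules. The correct documentation link is: http://graphdb.ontotext.com/documentation/free/full-text-search.html
The Lucene Connector performs online synchronization between the RDF and Lucene document models, using a sequence of configurations that map patterns such as ?subject propertyPath ?object to id | field value. The correct documentation link is: http://graphdb.ontotext.com/documentation/free/lucene-graphdb-connector.html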

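I encourage you to use the Lucene Connector unless you have a special case that requires RDF molecules. Below is a simple example of how to configure the connector with the French analyzer and index all values of the rdfs:label predicate for resources of type urn:MyClass. Select a repository and execute the following from the SPARQL query view: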

  PREFIX :<http://www.ontotext.com/connectors/lucene#>
  PREFIX inst:<http://www.ontotext.com/connectors/lucene/instance#>
  INSERT DATA {
    inst:labelFR :createConnector '''
  {
    "fields": [
      {
        "indexed": true,
        "stored": true,
        "analyzed": true,
        "multivalued": true,
        "fieldName": "label",
        "propertyChain": [
          "http://www.w3.org/2000/01/rdf-schema#label"
        ],
        "facet": true
      }
    ],
    "types": [
      "urn:MyClass"
    ],
    "stripMarkup": false,
    "analyzer": "org.apache.lucene.analysis.fr.FrenchAnalyzer"
  }
  ''' .
  }
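The only language-specific part of the configuration above is the "analyzer" value, so pointing the connector at a different analyzer comes down to swapping one class name. As a rough sketch, the update can be generated from Java; the class `ConnectorConfig` and the method `createConnectorUpdate` are hypothetical helpers made up for this illustration and are not part of GraphDB.

```java
public class ConnectorConfig {

    /**
     * Builds the SPARQL update that creates a Lucene connector indexing
     * rdfs:label for resources of the given type. Hypothetical helper.
     */
    public static String createConnectorUpdate(String connectorName,
                                               String typeIri,
                                               String analyzerClass) {
        // Same JSON options as in the example above; only the type and
        // the analyzer class name are parameterized.
        String json = String.join("\n",
            "{",
            "  \"fields\": [",
            "    {",
            "      \"indexed\": true,",
            "      \"stored\": true,",
            "      \"analyzed\": true,",
            "      \"multivalued\": true,",
            "      \"fieldName\": \"label\",",
            "      \"propertyChain\": [",
            "        \"http://www.w3.org/2000/01/rdf-schema#label\"",
            "      ],",
            "      \"facet\": true",
            "    }",
            "  ],",
            "  \"types\": [",
            "    \"" + typeIri + "\"",
            "  ],",
            "  \"stripMarkup\": false,",
            "  \"analyzer\": \"" + analyzerClass + "\"",
            "}");

        return String.join("\n",
            "PREFIX : <http://www.ontotext.com/connectors/lucene#>",
            "PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>",
            "INSERT DATA {",
            "  inst:" + connectorName + " :createConnector '''",
            json,
            "''' .",
            "}");
    }

    public static void main(String[] args) {
        // Prints an update equivalent to the one shown above.
        System.out.println(createConnectorUpdate(
            "labelFR",
            "urn:MyClass",
            "org.apache.lucene.analysis.fr.FrenchAnalyzer"));
    }
}
```

The printed text can be pasted into the SPARQL query view as is. Note that the connector name chosen here (labelFR) is the name that search queries must refer to later.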

Then manually add some sample test data from Import > Text area:

<urn:instance:test>  <http://www.w3.org/2000/01/rdf-schema#label> "C'est un exemple".
<urn:instance:test> a <urn:MyClass>.

Once the transaction is committed, the connector updates the Lucene index. You can now run search queries such as the following:

PREFIX : <http://www.ontotext.com/connectors/lucene#>
PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
SELECT ?entity ?snippetField ?snippetText {
    ?search a inst:labelFR ;
            :query "label:*" ;
            :entities ?entity .
    ?entity :snippets _:s .
    _:s :snippetField ?snippetField ;
        :snippetText ?snippetText .
}
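The same kind of search can also be issued from application code through the repository's SPARQL endpoint. The following is a minimal sketch using only the JDK, assuming the standard SPARQL protocol (a form-encoded `query` parameter sent by POST). The class name `ConnectorSearch` and the endpoint URL mentioned in the comment (`http://localhost:7200/repositories/myrepo`) are assumptions for illustration; adjust them to your installation. The query is reduced to the matching entities to keep the example short.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ConnectorSearch {

    // Connector search from the example above, without the snippet part.
    static final String QUERY = String.join("\n",
        "PREFIX : <http://www.ontotext.com/connectors/lucene#>",
        "PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>",
        "SELECT ?entity {",
        "  ?search a inst:labelFR ;",
        "          :query \"label:*\" ;",
        "          :entities ?entity .",
        "}");

    /** Encodes one field as application/x-www-form-urlencoded. */
    public static String formField(String name, String value) {
        try {
            return name + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    /** POSTs the query to the endpoint and returns the raw JSON response. */
    public static String runQuery(String endpoint, String sparql) throws Exception {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        con.setRequestProperty("Accept", "application/sparql-results+json");
        try (OutputStream out = con.getOutputStream()) {
            out.write(formField("query", sparql).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = con.getInputStream();
             Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            // e.g. http://localhost:7200/repositories/myrepo (assumed URL)
            System.out.println(runQuery(args[0], QUERY));
        } else {
            // Without an endpoint, just show the request body that would be sent.
            System.out.println(formField("query", QUERY));
        }
    }
}
```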

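To create a custom analyzer, follow the instructions in the documentation and extend the org.apache.lucene.analysis.Analyzer class. Place the JAR containing the custom analyzer in the lib/plugins/lucene-connector/ path.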
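A custom analyzer along these lines might look like the sketch below. It is not taken from the GraphDB documentation: the package and class name (`com.example.MyFrenchAnalyzer`) are hypothetical, and the code assumes the Lucene 5.x/6.x API, where `createComponents` takes only the field name and `LowerCaseFilter` lives in `org.apache.lucene.analysis.core` (in Lucene 7 and later it moved to `org.apache.lucene.analysis`). Check which Lucene version your GraphDB release bundles and compile against those jars.

```java
package com.example; // hypothetical package

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.fr.FrenchLightStemFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

/**
 * Minimal custom analyzer: standard tokenization, lowercasing,
 * then light French stemming. The implicit public no-argument
 * constructor is kept on the assumption that the connector
 * instantiates the class by name.
 */
public class MyFrenchAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        // Split the text into word tokens.
        Tokenizer source = new StandardTokenizer();
        // Normalize case before stemming.
        TokenStream result = new LowerCaseFilter(source);
        // Reduce French word forms to a common stem.
        result = new FrenchLightStemFilter(result);
        return new TokenStreamComponents(source, result);
    }
}
```

Once the JAR is in place, the connector configuration would reference the class by its fully qualified name, for example "analyzer": "com.example.MyFrenchAnalyzer", in the same way the example above references org.apache.lucene.analysis.fr.FrenchAnalyzer.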


java

lucene

sparql

graphdb