Is there any way to optimize SPARQL queries?

Question

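I have unmanaged triples stored as part of the individual documents in my content database. Essentially, each document represents a person, and the triple it defines specifies the document URI of that person's manager. I am trying to use SPARQL to determine the length of the path between a manager and every person below them in the hierarchy.

The triples in the documents look like this: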

<sem:triple xmlns:sem="http://marklogic.com/semantics">
    <sem:subject>http://rdf.abbvienet.com/infrastructure/person/10740024</sem:subject>
    <sem:predicate>http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager</sem:predicate>
    <sem:object>http://rdf.abbvienet.com/infrastructure/person/10206242</sem:object>
</sem:triple>

I found the following SPARQL query, which returns a manager, each person below them in the hierarchy, and the number of nodes that separate the two.

select  ?manager ?leaf (count(?mid) as ?distance) { 
  BIND(<http://rdf.abbvienet.com/infrastructure/person/10025613> as ?manager)
  ?leaf <http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager>* ?mid .
  ?mid <http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager>+ ?manager .
}
group by ?manager ?leaf 
order by ?manager ?leaf

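This works, but it is very slow: around 15 seconds, even when the hierarchy tree I am looking at is only one or two levels deep. I have 63,139 manager triples of this type in the database.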
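
To make explicit what the query counts: if every person has at most one manager and there are no cycles (an assumption, not something the data guarantees), then for a given ?leaf the matching ?mid nodes are the leaf itself plus every ancestor strictly below ?manager, so count(?mid) equals the number of manager hops between the two. The Python sketch below computes the same distances on a toy hierarchy. The person IDs (p1 to p9) and the function name distances_below are hypothetical and only serve as a way to sanity-check query results.

```python
from collections import defaultdict


def distances_below(manager_of, root):
    """Return {person: hops} for everyone below `root`.

    `manager_of` maps each person to their direct manager, i.e. one
    entry per manager triple. Assumes a tree: one manager per person
    and no cycles.
    """
    # Invert person -> manager into manager -> direct reports.
    reports = defaultdict(list)
    for person, boss in manager_of.items():
        reports[boss].append(person)

    distances = {}
    frontier = [root]
    depth = 0
    # Breadth-first walk down the hierarchy, one level per iteration.
    while frontier:
        depth += 1
        next_frontier = []
        for boss in frontier:
            for person in reports.get(boss, []):
                if person not in distances:
                    distances[person] = depth
                    next_frontier.append(person)
        frontier = next_frontier
    return distances


if __name__ == "__main__":
    # Toy data: p1 manages p2 and p3, p2 manages p4, p4 manages p5.
    # p9 reports to p8, which is outside p1's subtree.
    toy = {"p2": "p1", "p3": "p1", "p4": "p2", "p5": "p4", "p9": "p8"}
    print(sorted(distances_below(toy, "p1").items()))
    # prints [('p2', 1), ('p3', 1), ('p4', 2), ('p5', 3)]
```

A direct report gets distance 1, which matches the query: the only ?mid that satisfies both patterns is the leaf itself.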

Answer 1

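I think the biggest problem is the BIND(): MarkLogic 8 doesn't optimize the pattern you're using well. Can you try substituting the constant into the places where you use the ?manager variable and see whether that makes a big difference? That is: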

select  ?leaf (count(?mid) as ?distance) { 
  ?leaf <http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager>* ?mid .
  ?mid <http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager>+
    <http://rdf.abbvienet.com/infrastructure/person/10025613> .
}
group by ?leaf 
order by ?leaf
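
If the manager URI changes from call to call, one way to keep the constant inlined is to generate the query text per manager. This is only a minimal Python sketch: the function name build_distance_query is hypothetical, the check covers just the characters SPARQL forbids inside an IRI reference, and whether the inlined form is actually faster on your data is something to measure rather than assume.

```python
MANAGER_PREDICATE = (
    "http://schemas.abbvienet.com/ontologies/infrastructure.owl#manager"
)

# Characters the SPARQL grammar does not allow inside <...> IRI references.
_FORBIDDEN = set('<>"{}|^`\\')


def build_distance_query(manager_iri):
    """Return the query text with `manager_iri` inlined as a constant."""
    # Refuse anything that could break out of the <...> delimiters.
    if any(ch in _FORBIDDEN or ord(ch) <= 0x20 for ch in manager_iri):
        raise ValueError("not a valid IRI reference: %r" % manager_iri)

    pred = "<" + MANAGER_PREDICATE + ">"
    return (
        "select ?leaf (count(?mid) as ?distance) {\n"
        f"  ?leaf {pred}* ?mid .\n"
        f"  ?mid {pred}+ <{manager_iri}> .\n"
        "}\n"
        "group by ?leaf\n"
        "order by ?leaf"
    )


if __name__ == "__main__":
    print(build_distance_query(
        "http://rdf.abbvienet.com/infrastructure/person/10025613"
    ))
```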

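Stack Overflow isn't really a great place to answer performance questions like this, because that takes a conversation in which we work with you to help you out. Maybe you could try contacting support or the MarkLogic developer mailing list for this kind of question?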

sparql

marklogic

marklogic-8