Order of expressions in a SPARQL query

Question

Is there any difference between the two queries below?

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select distinct ?i
where {
    ?i rdf:type <http://foo/bar#A>.
    FILTER EXISTS {
        ?i <http://foo/bar#hasB> ?b.
        ?b rdf:type <http://foo/bar#B1>.
    }
}


PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select distinct ?i
where {
    FILTER EXISTS {
        ?i <http://foo/bar#hasB> ?b.
        ?b rdf:type <http://foo/bar#B1>.
    }
    ?i rdf:type <http://foo/bar#A>.
}

Is there any difference in performance or results?

Answer 1

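First of all, you don't need FILTER EXISTS here. You can rewrite your query using a basic graph pattern (a set of ordinary triple patterns). But let's suppose you are using FILTER NOT EXISTS or something similar.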
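
As a sketch of that rewrite, the query from the question could be expressed as a single basic graph pattern. This assumes that only ?i is projected and that DISTINCT is kept, so that several matching ?b values for one ?i do not produce duplicate rows:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Same IRIs as in the question; no FILTER EXISTS needed.
SELECT DISTINCT ?i
WHERE {
    ?i rdf:type <http://foo/bar#A> .
    ?i <http://foo/bar#hasB> ?b .
    ?b rdf:type <http://foo/bar#B1> .
}
```

Without DISTINCT, this form could return the same ?i more than once, whereas the FILTER EXISTS form would not.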

Results

In general, order matters.

However, top-down evaluation semantics mostly come into play with OPTIONAL, which is not your case. Hence, the results should be the same.
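
To illustrate a case where order does change the results, here is a minimal sketch with OPTIONAL. The ex: prefix and the predicates ex:p and ex:q are hypothetical and are not part of the question's data:

```sparql
PREFIX ex: <http://example.org/>

# Query 1: required pattern first, then OPTIONAL.
# Every ?x with an ex:p value is returned; ?z is bound where available.
SELECT * WHERE {
    ?x ex:p ?y .
    OPTIONAL { ?x ex:q ?z }
}
```

```sparql
PREFIX ex: <http://example.org/>

# Query 2: OPTIONAL first, then the required pattern.
# The OPTIONAL is left-joined against the empty group pattern,
# and its result is then joined with the ex:p pattern.
SELECT * WHERE {
    OPTIONAL { ?x ex:q ?z }
    ?x ex:p ?y .
}
```

If the data contains at least one ex:q triple, the second query keeps only those ?x that also have an ex:q value, so subjects that have ex:p but no ex:q are dropped. The first query would still return them with ?z unbound.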

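Top-down evaluation semantics can be overridden by bottom-up evaluation semantics. Fortunately, bottom-up semantics don't prescribe evaluating FILTER first logically, though that is possible in the case of FILTER EXISTS and FILTER NOT EXISTS.

The SPARQL algebra representation is the same for both queries: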

(prefix ((rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
         (foobar: <http://foo/bar#>))
  (distinct
    (project (?i)
      (filter (exists
                 (bgp
                   (triple ?i foobar:hasB ?b)
                   (triple ?b rdf:type foobar:B1)
                 ))
        (bgp (triple ?i rdf:type foobar:A))))))

Performance

Naively following top-down semantics, an engine should evaluate ?i a foobar:A first.

You are lucky if there is only one binding for ?i.
You are less lucky if there are millions of bindings for ?i and the subpattern is more selective.

Fortunately, optimizers try to reorder patterns according to their selectivity. However, their predictions can be wrong.

By the way, the rdf:type predicate is said to be a performance killer in Virtuoso.

Results and performance

Results can differ if an endpoint has a query execution time limit and flushes partial results when the timeout is reached.
