sparql-gremlin does not use indexes
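I am using sparql-gremlin 3.4.0 with JanusGraph 0.3.1. After creating an index on the vertex property 'iri', the Gremlin query returns its result immediately. The same query written in SPARQL, however, does not use any index.
In the example below, I use the force-index option to prevent full-scan queries.
Any suggestions?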
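There are two possible issues to consider: (1) TinkerPop does not optimize the query into a form for which JanusGraph can easily use an index, or (2) JanusGraph does not optimize some aspect of the query to use an index. For the latter, JanusGraph would have to optimize the match() step well enough to use an index, because match() is the core step that sparql-gremlin uses in its translation. I'm pretty sure it doesn't do that. For the former, JanusGraph could rely on TinkerPop to convert match() into something easier to work with; for your example, ideally into the initial test query you wrote, g.V().has('iri', ...), which JanusGraph can handle. I think explain() will tell you what is happening there, as it did for me when I tested a variation of your example with TinkerGraph: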
gremlin> s.sparql("SELECT ?x WHERE { ?x v:name 'marko' }").explain()
==>Traversal Explanation
===========================================================================================================================================================================================
Original Traversal [InjectStep([SELECT ?x WHERE { ?x v:name 'marko' }])]
ConnectiveStrategy [D] [InjectStep([SELECT ?x WHERE { ?x v:name 'marko' }])]
SparqlStrategy [D] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
MatchPredicateStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
FilterRankingStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
EarlyLimitStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
InlineFilterStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
IncidentToAdjacentStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
AdjacentToIncidentStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
RepeatUnrollStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
CountStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
PathRetractionStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
LazyBarrierStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
TinkerGraphCountStrategy [P] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
TinkerGraphStepStrategy [P] [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
ProfileStrategy [F] [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
StandardVerificationStrategy [V] [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
Final Traversal [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
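Not so good. The final traversal still starts with TinkerGraphStep(vertex,[]): no filter was folded into the start step, so every vertex is scanned before match() is applied.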
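To make the output above easier to read, here is a toy model of the folding that a provider strategy such as TinkerGraphStepStrategy performs. It is only a sketch: steps are plain Python tuples rather than TinkerPop classes, and the names fold_has_into_start and start_step_filters are made up for illustration. It assumes that only has() filters sitting directly after V() can be folded into the start step.

```python
# Toy model: steps are plain tuples, not TinkerPop classes.

def fold_has_into_start(steps):
    """Move has() filters that directly follow V() into V() itself."""
    kind, filters = steps[0]
    if kind != "V":
        raise ValueError("traversal must start with V()")
    filters = list(filters)
    rest = list(steps[1:])
    # Only filters adjacent to the start step can be folded.
    while rest and rest[0][0] == "has":
        _, key, value = rest.pop(0)
        filters.append((key, value))
    return [("V", filters)] + rest

def start_step_filters(steps):
    """Filters the start step could hand to an index; an empty list means a full scan."""
    return fold_has_into_start(steps)[0][1]

# g.V().has('name', 'marko')
direct = [("V", []), ("has", "name", "marko")]
# g.V().match(__.as('x').values('name').is('marko')).select('x')
from_sparql = [("V", []), ("match", [("values", "name"), ("is", "marko")]), ("select", "x")]

print(start_step_filters(direct))       # [('name', 'marko')]
print(start_step_filters(from_sparql))  # []
```

In this model the match() step sits between V() and the filter, so nothing reaches the start step, which mirrors the empty [] in the final traversal above.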
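So, the options for a solution are:
- JanusGraph needs to optimize match() better to handle this kind of query, or
- TinkerPop's standard traversal strategies should get better at converting this kind of query into a more general form, or
- sparql-gremlin should compile to Gremlin that is a better fit for the existing traversal strategies.
On that last point, note what happens if sparql-gremlin were to generate this match() query instead: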
gremlin> g.V().match(__.as('a').has('person','name','marko')).select('a').values('name').explain()
==>Traversal Explanation
==========================================================================================================================================================================================================
Original Traversal [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],value)]
ConnectiveStrategy [D] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],value)]
MatchPredicateStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],value)]
FilterRankingStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],value)]
EarlyLimitStrategy [O] [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],value)]
InlineFilterStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
IncidentToAdjacentStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
AdjacentToIncidentStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
RepeatUnrollStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
CountStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
PathRetractionStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
LazyBarrierStrategy [O] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
TinkerGraphCountStrategy [P] [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
TinkerGraphStepStrategy [P] [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
ProfileStrategy [F] [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
StandardVerificationStrategy [V] [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
Final Traversal [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
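Much better. Here InlineFilterStrategy unwraps the match() into a plain HasStep, and TinkerGraphStepStrategy then folds that HasStep into the start step. So I tend to think this is a general problem that TinkerPop needs to solve, involving some combination of the last two points. Of course, it would also be nice if JanusGraph could optimize match() further. None of this is a solution to your problem, of course, but it should at least explain what is happening and where the problems lie. I created TINKERPOP-2325 for further discussion and tracking.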
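The kind of rewrite discussed above, whether done in a TinkerPop strategy or in the sparql-gremlin compiler, can be sketched as follows. This is a toy illustration and not code from either project: patterns are lists of tuples, the helpers rewrite_value_filter and to_gremlin are hypothetical, and the rewrite is assumed to be valid only when the values()/is() pair ends the pattern (so it acts as a pure filter) and the property is single-valued.

```python
# Toy model: a match() pattern is a list of (step_name, *args) tuples.

def rewrite_value_filter(pattern):
    """Turn a trailing values(key), is(value) pair into has(key, value)."""
    if len(pattern) >= 2 and pattern[-2][0] == "values" and pattern[-1][0] == "is":
        key, value = pattern[-2][1], pattern[-1][1]
        return list(pattern[:-2]) + [("has", key, value)]
    # Anything else is left untouched.
    return list(pattern)

def to_gremlin(label, pattern):
    """Render a pattern as Gremlin text for reading; the string is not executed."""
    parts = ["__.as('%s')" % label]
    for step in pattern:
        args = ", ".join("'%s'" % a for a in step[1:])
        parts.append("%s(%s)" % (step[0], args))
    return ".".join(parts)

# Pattern shape seen in the first explain() output.
pattern = [("values", "name"), ("is", "marko")]

print(to_gremlin("x", pattern))
print(to_gremlin("x", rewrite_value_filter(pattern)))
```

The second printed pattern has the has() shape that, in the second explain() output, the existing strategies were able to inline and fold into the start step.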