JanusGraph sometimes returns empty results when an indexed property is used in the has() clause

Question

We use JanusGraph (with Cassandra) as our backend database and are running into problems when using indexed properties in a has() clause.

The following query returns an empty response:

g.V().has('fooName', 'fooValue').has('barName', 'barValue')

However, the query below returns the correct response:

g.V().has('barName', 'barValue').values('fooName')
==> fooValue

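Both properties are indexed (composite). The dataset contains about 20k vertices whose 'fooName' property has the value 'fooValue', which is also confirmed by the following working query: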

g.V().has('fooName', 'fooValue').count()
==> 20000

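This happens intermittently and not for all vertices. Of the 20k vertices, about 6k show the problem described above. The method used to add the property values is the same for all vertices.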

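Is it the case that, if we add a composite index on vertices where the domain of values is small and there are only a few distinct vertex sets, with one set making up a very large share of the whole, the index query will fail by falsely claiming that no existing vertex matches the predicate?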

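The traversal reports that the vertex in question does not exist, rather than triggering a full scan. If this is the case, at what threshold does the composite index start to fail, and where can that be tuned?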

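We are aware that index caching and transactions have changed recently. For the failing index queries, we also observed that "g.V().has('fooName', 'fooValue').count()" is capped at the query limit of 4000 (as shown in the .profile() report) until the transaction is committed. After the commit, count() jumps back to the expected value (20,000). Is this related?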
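
For reference, the check described above can be reproduced roughly as follows. This is only a sketch: it assumes a Gremlin Console session in which `graph` is an already opened JanusGraph instance and `g` is its traversal source (both names are assumptions, not taken from our actual setup).

```groovy
// Gremlin Console sketch; assumes `graph` is an open JanusGraph
// and `g = graph.traversal()` has already been run.

// Profile the count and look for a limit annotation on the JanusGraph step.
g.V().has('fooName', 'fooValue').count().profile()

// Commit the current transaction, then repeat the count in a fresh one.
graph.tx().commit()
g.V().has('fooName', 'fooValue').count()
```

Comparing the profile output before and after the commit should show whether the limit is applied only inside the uncommitted transaction.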

Answer 1

@Anya Take a look at https://github.com/JanusGraph/janusgraph/issues/735 and the comment quoted below:

This looks the same as [his] post on Gitter on Nov 15th. The query optimizer inserted a limit(4000), perhaps here too. Turning off smart-limit (query.smart-limit=false) solved the issue.

https://docs.janusgraph.org/basics/configuration-reference/
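
As a rough sketch of how that option might be applied from the Gremlin Console: the snippet below opens the graph with query.smart-limit turned off and reruns the failing query. The storage backend name ('cql') and hostname ('127.0.0.1') are placeholder assumptions and need to match your own Cassandra setup. The same setting can also be written as query.smart-limit=false in the graph's properties file.

```groovy
// Gremlin Console sketch; the storage settings are placeholders, not from the question.
graph = JanusGraphFactory.build().
    set('storage.backend', 'cql').          // assumption: CQL backend for Cassandra
    set('storage.hostname', '127.0.0.1').   // assumption: local Cassandra node
    set('query.smart-limit', false).        // disable the optimizer's automatic limit
    open()
g = graph.traversal()

// Rerun the query that previously came back empty.
g.V().has('fooName', 'fooValue').has('barName', 'barValue')
```

If the query now returns the expected vertices, the automatically inserted limit is the likely cause.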
