Why does looking up a Gremlin vertex after adding one cost so much?

Question

I found that the following Gremlin query is charged 60K RU in Cosmos DB:

g.addV(label, 'plannedmeal')
    .property('partitionKey', '84ca17dd-c284-4f47-a839-a75bc27f9097')
    .as('meal')
    .V('19760224-7ac1-4316-b9a8-1f7a979274b8') // <--- problem
    .as('food')
    .select('meal')
    .addE('contains')
    .to('food')
    .select('meal')

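By process of elimination, I found that .V('19760224-7ac1-4316-b9a8-1f7a979274b8') is the expensive part. I can easily work around it by splitting the query in two, for example: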

g.addV(label, 'plannedmeal')
    .property('partitionKey', '84ca17dd-c284-4f47-a839-a75bc27f9097')

g.V('ID_OF_NEW_ITEM')
    .as('meal')
    .V('19760224-7ac1-4316-b9a8-1f7a979274b8')
    .as('food')
    .select('meal')
    .addE('contains')
    .to('food')
    .select('meal')

For reference, this costs about 50 RU in total. My question is: why is there a 59,950 RU difference between the two approaches?

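Edit: After looking at the execution profile of the query, the GetVertices operation in the problematic step appears to scan every vertex in my graph. That is the problem, but it is still unclear why requesting a V by id is so expensive.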
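For anyone who wants to reproduce that observation, here is a minimal sketch of how the profile can be requested. It assumes the Cosmos DB executionProfile() step is appended as the last step of the traversal, and it uses the read-only second half of the split query so that profiling does not create another vertex. 'ID_OF_NEW_ITEM' is the same placeholder as above, and the // lines are annotations only.

```groovy
// Read-only traversal that still contains a mid-traversal V() step.
// executionProfile() asks Cosmos DB to return per-operator metrics
// instead of the traversal results.
g.V('ID_OF_NEW_ITEM')
    .as('meal')
    .V('19760224-7ac1-4316-b9a8-1f7a979274b8')
    .as('food')
    .executionProfile()
```

In the returned JSON, the operator to look at is the GetVertices entry that corresponds to the second V() step. A result count or store-operation count far larger than one vertex is consistent with the full scan described above.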

Answer 1

This is caused by a known limitation of Cosmos DB:

Index utilization for Gremlin queries with mid-traversal .V() steps: Currently, only the first .V() call of a traversal will make use of the index to resolve any filters or predicates attached to it. Subsequent calls will not consult the index, which might increase the latency and cost of the query.

I adjusted the query to use one of the workarounds suggested in the documentation, which brought it down to 23 RU.

g.addV(label, 'plannedmeal')
    .property('partitionKey', '84ca17dd-c284-4f47-a839-a75bc27f9097')
    .as('meal')
    .map(
       __.V('19760224-7ac1-4316-b9a8-1f7a979274b8')
    )
    .as('food')
    .select('meal')
    .addE('contains')
    .to('food')
    .select('meal')
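As far as I recall, the same documentation section also mentions wrapping the mid-traversal lookup in union() instead of map(). The following is an untested sketch of that variant under that assumption. I have not measured its RU charge, so compare it against the map() version with your own data before relying on it.

```groovy
// Same traversal as above, but the second V() is wrapped in union()
// so that it runs as a subtraversal rather than a mid-traversal step.
g.addV(label, 'plannedmeal')
    .property('partitionKey', '84ca17dd-c284-4f47-a839-a75bc27f9097')
    .as('meal')
    .union(
       __.V('19760224-7ac1-4316-b9a8-1f7a979274b8')
    )
    .as('food')
    .select('meal')
    .addE('contains')
    .to('food')
    .select('meal')
```

One difference to keep in mind: map() emits only the first result of its subtraversal for each incoming traverser, while union() emits every result. For a lookup by a single id the two should behave the same, but with a filter that can match several vertices, union() would create one edge per match.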


gremlin

tinkerpop

azure-cosmosdb-gremlinapi