Gremlin: repeat until breakpoint, and batch the vertices together to produce a value

Question

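I'm learning graph databases by building a simple MLM (multi-level marketing) network: basically, a user can sponsor other users, and every user has at most one sponsor. I want to run a query that does the following: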

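Go from a selected user to other users until some predicate is met, then add up the points of all users on the selected path to get one value (this value should be deduplicated so that points are not counted more than once when the path branches out to several users).
Repeat this step 3 times, each time starting from the last user reached in the previous step.
Output the sums as a list.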

I have been trying the following query:

    g.V(userID)
     .repeat(
       repeat(out('sponsors'))
         .until(somePredicate)
         .out('hasPoints')
         .as('level') // How do I know the current loop iteration so I can store level1/level2/level3 in as step dynamically?
         // This is where I'm stuck, since I have no idea how to capture and sum all the points in this subtree.
         .in('hasPoints')
     )
     .times(3)
     // Also need to output the point sums as a list/map here, e.g. ["level1": 100, "level2": 100],
     // "level1" being the first iteration of repeat and so on.

Any pointers?

Edit:

Here is a Gremlin script for the sample data:

g.addV('user').property('id', 1).as('1').
  addV('user').property('id', 2).as('2').
  addV('user').property('id', 3).as('3').
  addV('user').property('id', 4).as('4').
  addV('user').property('id', 5).as('5').
  addV('user').property('id', 6).as('6').
  addV('user').property('id', 7).as('7').
  addV('point').property('value', 5).as('p1').
  addV('point').property('value', 5).as('p2').
  addV('point').property('value', 5).as('p3').
  addV('point').property('value', 5).as('p4').
  addV('point').property('value', 5).as('p5').
  addV('point').property('value', 5).as('p6').
  addV('point').property('value', 5).as('p7').
  addE('sponsors').from('1').to('2').
  addE('sponsors').from('1').to('3').
  addE('sponsors').from('1').to('4').
  addE('sponsors').from('2').to('5').
  addE('sponsors').from('3').to('6').
  addE('sponsors').from('4').to('7').
  addE('hasPoints').from('1').to('p1').
  addE('hasPoints').from('2').to('p2').
  addE('hasPoints').from('3').to('p3').
  addE('hasPoints').from('4').to('p4').
  addE('hasPoints').from('5').to('p5').
  addE('hasPoints').from('6').to('p6').
  addE('hasPoints').from('7').to('p7').
  iterate()

Here is a query I wrote to group the levels based on some predicate:

g.V()
    .has('id', 1)
    .repeat('x',
        identity()
            .repeat(
                out('sponsors')
                    .choose(loops('x'))
                    .option(0, identity().as('a1'))
                    .option(1, identity().as('a2'))
                    .option(2, identity().as('a3'))
            )
            .until(or(out('hasPoints').has('value', gte(5))))
            .sideEffect(
                choose(loops('x'))
                    .option(0, select(all, 'a1'))
                    .option(1, select(all, 'a2'))
                    .option(2, select(all, 'a3'))
                    .unfold()
                    .choose(loops('x'))
                    .option(0, store('b1'))
                    .option(1, store('b2'))
                    .option(2, store('b3'))
            )
    )
    .times(3)
    .cap('b1', 'b2', 'b3')

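Although I can set the variables manually and select the right ones, I haven't figured out how to do this dynamically. In some cases I may need until() instead of times(3), so the iteration count is no longer known in advance.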

Answer 1

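I modified your data slightly to include a "point" value less than 5, to demonstrate that the filter works correctly, and changed the "id" property to T.id so that the results are easier to read while I test: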

g.addV('user').property(id, 1).as('1').
  addV('user').property(id, 2).as('2').
  addV('user').property(id, 3).as('3').
  addV('user').property(id, 4).as('4').
  addV('user').property(id, 5).as('5').
  addV('user').property(id, 6).as('6').
  addV('user').property(id, 7).as('7').
  addV('point').property('value', 5).as('p1').
  addV('point').property('value', 5).as('p2').
  addV('point').property('value', 5).as('p3').
  addV('point').property('value', 5).as('p4').
  addV('point').property('value', 5).as('p5').
  addV('point').property('value', 4).as('p6').
  addV('point').property('value', 5).as('p7').
  addE('sponsors').from('1').to('2').
  addE('sponsors').from('1').to('3').
  addE('sponsors').from('1').to('4').
  addE('sponsors').from('2').to('5').
  addE('sponsors').from('3').to('6').
  addE('sponsors').from('4').to('7').
  addE('hasPoints').from('1').to('p1').
  addE('hasPoints').from('2').to('p2').
  addE('hasPoints').from('3').to('p3').
  addE('hasPoints').from('4').to('p4').
  addE('hasPoints').from('5').to('p5').
  addE('hasPoints').from('6').to('p6').
  addE('hasPoints').from('7').to('p7').
  iterate()

If you only need to group dynamically by the level of the repeat() iteration, you can group() on loops():

gremlin> g.V(1).
......1>   repeat(out('sponsors').
......2>          group('m').
......3>            by(loops()).
......4>            by(out('hasPoints').has('value',gte(5)).
......5>               values('value').sum())).
......6>   cap('m')
==>[0:15,1:10]
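
To see where [0:15,1:10] comes from, the grouping can be modelled outside Gremlin. The sketch below is plain Python and is not part of the original answer. The names SPONSORS, POINTS, and level_sums are hypothetical; they only mirror the modified sample data above, and min_value=5 stands in for has('value',gte(5)).

```python
from collections import defaultdict

# Sample graph from the answer: sponsor -> sponsored users.
SPONSORS = {1: [2, 3, 4], 2: [5], 3: [6], 4: [7]}
# Point value attached to each user via 'hasPoints' (user 6 has 4).
POINTS = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 6: 4, 7: 5}


def level_sums(start, min_value=5):
    """Sum qualifying points per iteration, keyed like loops() (0-based)."""
    sums = defaultdict(int)
    frontier = [start]
    level = 0
    while True:
        # One out('sponsors') hop for every user in the current frontier.
        frontier = [child for user in frontier for child in SPONSORS.get(user, [])]
        if not frontier:
            break
        for user in frontier:
            # Only point values that pass the threshold are added.
            if POINTS[user] >= min_value:
                sums[level] += POINTS[user]
        level += 1
    return dict(sums)


print(level_sums(1))  # {0: 15, 1: 10}
```

Level 1 comes to 10 rather than 15 because user 6's value of 4 does not pass the threshold.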

You mentioned that you want to sum these values, which you can do quite easily:

gremlin> g.V(1).
......1>   repeat(out('sponsors').
......2>          group('m').
......3>            by(loops()).
......4>            by(out('hasPoints').has('value',gte(5)).
......5>               values('value').sum())).
......6>   cap('m').
......7>   unfold().
......8>   select(values).
......9>   sum()
==>25

Of course, if you only need the total, you can avoid group() altogether:

gremlin> g.V(1).
......1>   repeat(out('sponsors').
......2>          store('m').
......3>            by(coalesce(out('hasPoints').has('value',gte(5)).values('value'), 
......4>                        constant(0)))).
......5>   cap('m').
......6>   sum(local)
==>25
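
The coalesce(..., constant(0)) fallback supplies a 0 for every visited user whose point value fails the filter, so each user still contributes one entry to "m". As a rough illustration in plain Python (again not Gremlin and not from the original answer): SPONSORS, POINTS, and stored_values are hypothetical names mirroring the sample data, and the list order is simply this sketch's breadth-first order, not a claim about the order in which Gremlin stores the values.

```python
# Sample graph from the answer: sponsor -> sponsored users.
SPONSORS = {1: [2, 3, 4], 2: [5], 3: [6], 4: [7]}
# Point value attached to each user via 'hasPoints' (user 6 has 4).
POINTS = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 6: 4, 7: 5}


def stored_values(start, min_value=5):
    """Collect one value per sponsored user, using 0 when the filter fails."""
    values = []
    queue = list(SPONSORS.get(start, []))
    while queue:
        user = queue.pop(0)
        value = POINTS[user]
        # Mirrors coalesce(<filtered value>, constant(0)).
        values.append(value if value >= min_value else 0)
        queue.extend(SPONSORS.get(user, []))
    return values


values = stored_values(1)
print(values, sum(values))  # [5, 5, 5, 5, 0, 5] 25
```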

Finally, if we no longer care about the levels, we can probably do better by eliminating the "m" side effect entirely and saving that overhead:

gremlin> g.V(1).
......1>   repeat(out('sponsors')).
......2>     emit().
......3>   out('hasPoints').has('value',gte(5)).
......4>   values('value'). 
......5>   sum()
==>25


graph-databases

gremlin

janusgraph