Find every path in any direction with specified labels and hops

I have the following graph. Its vertices and edges were added as follows:

def graph=ConfiguredGraphFactory.open('Baptiste');def g = graph.traversal();
graph.addVertex(label, 'Group', 'text', 'BNP Paribas');
graph.addVertex(label, 'Group', 'text', 'BNP PARIBAS');
graph.addVertex(label, 'Company', 'text', 'JP Morgan Chase');
graph.addVertex(label, 'Location', 'text', 'France');
graph.addVertex(label, 'Location', 'text', 'United States');
graph.addVertex(label, 'Location', 'text', 'Europe');
def v1 = g.V().has('text', 'JP Morgan Chase').next();def v2 = g.V().has('text', 'BNP Paribas').next();v1.addEdge('partOf',v2);
def v1 = g.V().has('text', 'JP Morgan Chase').next();def v2 = g.V().has('text', 'United States').next();v1.addEdge('doesBusinessIn',v2);
def v1 = g.V().has('text', 'BNP Paribas').next();def v2 = g.V().has('text', 'United States').next();v1.addEdge('doesBusinessIn',v2);
def v1 = g.V().has('text', 'BNP Paribas').next();def v2 = g.V().has('text', 'France').next();v1.addEdge('partOf',v2);
def v1 = g.V().has('text', 'BNP PARIBAS').next();def v2 = g.V().has('text', 'Europe').next();v1.addEdge('partOf',v2);

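I need a query that returns every possible path, given specific vertex labels, edge labels, and a maximum number of hops. Say I need paths with at most 2 hops, using every label in this example. I tried this query: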

def graph=ConfiguredGraphFactory.open('TestGraph');
def g = graph.traversal();
g.V().has(label, within('Location', 'Company', 'Group'))
.repeat(bothE().has(label, within('doesBusinessIn', 'partOf')).bothV().has(label, within('Location', 'Company', 'Group')).simplePath())
.emit().times(2).path();

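This query returns 20 paths (it is supposed to return 10), so it returns each path in both possible directions. Is there a way to specify that I need only one direction? I tried adding dedup() to the query, but it returned 7 paths instead of 10, so that does not work.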

Also, whenever I try to find paths with 4 hops, it does not return "cyclic" paths such as France -> BNP Paribas -> United States -> JP Morgan Chase -> BNP Paribas. Any idea what to add to my query to allow returning that kind of path?

EDIT: Thanks for your solution, @DanielKuppitz. It seems to be exactly what I am looking for.

I use JanusGraph, which is built on top of Apache TinkerPop. I tried the first query:

g.V().hasLabel('Location', 'Company', 'Group').
  repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
    emit().times(2).
  path().
  dedup().
    by(unfold().order().by(id).fold())

It threw the following error:

Error: org.janusgraph.graphdb.relations.RelationIdentifier cannot be cast to java.lang.Comparable

So I moved the dedup step into the repeat loop, like this:

g.V().hasLabel('Location', 'Company', 'Group').
  repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath().dedup().by(unfold().order().by(id).fold())).
    emit().times(2).
  path()

And it returned only 6 paths:

[
  [
    "JP Morgan Chase",
    "doesBusinessIn",
    "United States"
  ],
  [
    "JP Morgan Chase",
    "partOf",
    "BNP Paribas"
  ],
  [
    "JP Morgan Chase",
    "partOf",
    "BNP Paribas",
    "partOf",
    "France"
  ],
  [
    "Europe",
    "partOf",
    "BNP PARIBAS"
  ],
  [
    "BNP PARIBAS",
    "partOf",
    "Europe"
  ],
  [
    "United States",
    "doesBusinessIn",
    "JP Morgan Chase"
  ]
]

I'm not sure what is happening here... Any ideas?

Is there a way to specify that I need only 1 direction?

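You do need a bidirectional traversal, so you will have to filter out duplicated paths at the end ("duplicated" in this case means 2 paths that contain the same elements). To do that, you can dedup() paths by a deterministic order of their elements; the easiest way is to order the elements by their id.
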
g.V().hasLabel('Location', 'Company', 'Group').
  repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
    emit().times(2).
  path().
  dedup().
    by(unfold().order().by(id).fold())
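
To make the dedup key concrete, here is a minimal Python sketch (not Gremlin) of the same idea. It assumes a path is a plain list of element ids; the ids and the helper name dedup_paths are made up for illustration. A path and its reverse contain the same elements, so sorting the elements gives both the same key:

```python
def dedup_paths(paths):
    """Keep the first path seen for each sorted-element key."""
    seen = set()
    unique = []
    for path in paths:
        # Order-independent key, playing the role of unfold().order().by(id).fold()
        key = tuple(sorted(path))
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique

# Hypothetical ids: vertices 1..3, edges 10 and 11.
paths = [
    [1, 10, 2],          # v1 -e10- v2
    [2, 10, 1],          # the same path walked in the other direction
    [1, 10, 2, 11, 3],
    [3, 11, 2, 10, 1],   # reverse of the previous path
]
result = dedup_paths(paths)
print(result)
```

Note that the sort only works if every id can be compared with every other id, which is the assumption this approach rests on.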

Any idea what to add in my query to allow returning those kinds of paths (cyclic)?

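Your query explicitly prevents cyclic paths through the simplePath() step, so it is not quite clear in which cases you want to allow them. I assume you are fine with a cyclic path if the cycle is created only by the first and last element of the path. In that case, the query would look more like this: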

g.V().hasLabel('Location', 'Company', 'Group').as('a').
  repeat(bothE('doesBusinessIn', 'partOf').otherV()).
    emit().
    until(loops().is(4).or().cyclicPath()).
  filter(simplePath().or().where(eq('a'))).
  path().
  dedup().
    by(unfold().order().by(id).fold())
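
The filter step in this query can be read as a predicate on a finished path. Below is a rough Python sketch of that predicate, under two simplifying assumptions: a path is reduced to its list of vertex names (edges are ignored here, whereas simplePath() in Gremlin looks at every path element), and the traversal has already stopped at the first repeated vertex. The function name keep_path is hypothetical:

```python
def keep_path(vertices):
    """Accept simple paths, plus paths whose last vertex is the start vertex."""
    is_simple = len(set(vertices)) == len(vertices)
    closes_on_start = vertices[-1] == vertices[0]
    return is_simple or closes_on_start

simple = ["France", "BNP Paribas", "United States"]
closed = ["France", "BNP Paribas", "France"]
inner_cycle = ["France", "BNP Paribas", "United States", "BNP Paribas"]

checks = [keep_path(simple), keep_path(closed), keep_path(inner_cycle)]
print(checks)
```

The third path is rejected because its repeated vertex is not the start vertex, which matches the assumption about which cycles are acceptable.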

Here is the output of the 2 queries (ignore the extra map() step; it is only there to make the output more readable).

gremlin> g.V().hasLabel('Location', 'Company', 'Group').
......1>   repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
......2>     emit().times(2).
......3>   path().
......4>   dedup().
......5>     by(unfold().order().by(id).fold()).
......6>   map(unfold().coalesce(values('text'), label()).fold())
==>[BNP Paribas,doesBusinessIn,United States]
==>[BNP Paribas,partOf,France]
==>[BNP Paribas,partOf,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[BNP PARIBAS,partOf,Europe]
==>[JP Morgan Chase,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,France]
==>[France,partOf,BNP Paribas,doesBusinessIn,United States]

gremlin> g.V().hasLabel('Location', 'Company', 'Group').as('a').
......1>   repeat(bothE('doesBusinessIn', 'partOf').otherV()).
......2>     emit().
......3>     until(loops().is(4).or().cyclicPath()).
......4>   filter(simplePath().or().where(eq('a'))).
......5>   path().
......6>   dedup().
......7>     by(unfold().order().by(id).fold()).
......8>   map(unfold().coalesce(values('text'), label()).fold())
==>[BNP Paribas,doesBusinessIn,United States]
==>[BNP Paribas,partOf,France]
==>[BNP Paribas,partOf,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,BNP Paribas]
==>[BNP Paribas,partOf,France,partOf,BNP Paribas]
==>[BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[BNP Paribas,partOf,JP Morgan Chase,partOf,BNP Paribas]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase,partOf,BNP Paribas]
==>[BNP PARIBAS,partOf,Europe]
==>[BNP PARIBAS,partOf,Europe,partOf,BNP PARIBAS]
==>[JP Morgan Chase,doesBusinessIn,United States]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,France]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,JP Morgan Chase]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,BNP Paribas,partOf,France]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,BNP Paribas,partOf,JP Morgan Chase]
==>[France,partOf,BNP Paribas,doesBusinessIn,United States]
==>[France,partOf,BNP Paribas,partOf,France]
==>[France,partOf,BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[United States,doesBusinessIn,JP Morgan Chase,doesBusinessIn,United States]
==>[United States,doesBusinessIn,BNP Paribas,doesBusinessIn,United States]
==>[United States,doesBusinessIn,JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[Europe,partOf,BNP PARIBAS,partOf,Europe]

UPDATE (based on latest comments)

Since JanusGraph has non-comparable edge identifiers, you will need a unique, comparable property on all edges. This can be as simple as a random UUID.
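
As a loose analogy in Python (not JanusGraph code), sorting fails for values whose type defines no ordering, and works once you sort by a comparable property instead. The EdgeId class below is a made-up stand-in, not the real RelationIdentifier:

```python
class EdgeId:
    """Stand-in for an identifier type that defines no ordering."""
    def __init__(self, raw):
        self.raw = raw

ids = [EdgeId("b"), EdgeId("a")]

try:
    sorted(ids)          # fails: EdgeId does not implement "<"
    sortable = True
except TypeError:
    sortable = False

# Sorting by a comparable property (here a string) works.
by_property = [e.raw for e in sorted(ids, key=lambda e: e.raw)]
print(sortable, by_property)
```

The TypeError here plays the same role as the "cannot be cast to java.lang.Comparable" error above.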

This is how I would update your sample graph:

g.addV('Group').property('text', 'BNP Paribas').as('a').
  addV('Group').property('text', 'BNP PARIBAS').as('b').
  addV('Company').property('text', 'JP Morgan Chase').as('c').
  addV('Location').property('text', 'France').as('d').
  addV('Location').property('text', 'United States').as('e').
  addV('Location').property('text', 'Europe').as('f').
  addE('partOf').from('c').to('a').
    property('uuid', UUID.randomUUID().toString()).
  addE('doesBusinessIn').from('c').to('e').
    property('uuid', UUID.randomUUID().toString()).
  addE('doesBusinessIn').from('a').to('e').
    property('uuid', UUID.randomUUID().toString()).
  addE('partOf').from('a').to('d').
    property('uuid', UUID.randomUUID().toString()).
  addE('partOf').from('b').to('f').
    property('uuid', UUID.randomUUID().toString()).
  iterate()

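Now that we have a property that uniquely identifies each edge, we also need a unique property (of the same data type) on all vertices. Luckily, the existing text property seems to be good enough (otherwise it would be the same as with the edges: just add a random UUID). The updated queries now look like this: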

g.V().hasLabel('Location', 'Company', 'Group').
  repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
    emit().times(2).
  path().
  dedup().
    by(unfold().values('text','uuid').order().fold())

g.V().hasLabel('Location', 'Company', 'Group').as('a').
  repeat(bothE('doesBusinessIn', 'partOf').otherV()).
    emit().
    until(loops().is(4).or().cyclicPath()).
  filter(simplePath().or().where(eq('a'))).
  path().
  dedup().
    by(unfold().values('text','uuid').order().fold())

The results are, of course, the same as above.