How can I randomly walk between nodes by weights in Neo4j with Python?

I created nodes in Neo4j with the following code:

from py2neo import Graph, Node, Relationship

g = Graph(password='neo4j')
tx = g.begin()

node1 = Node('Node', name='Node-1')
node2 = Node('Node', name='Node-2')
node3 = Node('Node', name='Node-3')
node4 = Node('Node', name='Node-4')
node5 = Node('Node', name='Node-5')
node6 = Node('Node', name='Node-6')
node7 = Node('Node', name='Node-7')

tx.create(node1)
tx.create(node2)
tx.create(node3)
tx.create(node4)
tx.create(node5)
tx.create(node6)
tx.create(node7)

rel12 = Relationship(node1, '0.2', node2, weight=0.2)
rel13 = Relationship(node1, '0.2', node3, weight=0.2)
rel14 = Relationship(node1, '0.6', node4, weight=0.6)
rel45 = Relationship(node4, '0.5', node5, weight=0.5)
rel46 = Relationship(node4, '0.3', node6, weight=0.3)
rel47 = Relationship(node4, '0.2', node7, weight=0.2)

tx.create(rel12)
tx.create(rel13)
tx.create(rel14)
tx.create(rel45)
tx.create(rel46)
tx.create(rel47)

tx.commit()

Here is the graph in the Neo4j interface:

I want to select a node by its name and then walk randomly to another node. The random selection should be weighted, like this:

import random

random.choices(['Node-2', 'Node-3', 'Node-4'], weights=(0.2, 0.2, 0.6))

I can select the node with the code below, but I don't know how to walk randomly to another node.

from py2neo import Graph
from py2neo.matching import NodeMatcher

g = Graph(password='neo4j')
nodes = NodeMatcher(g)
node1 = nodes.match('Node', name='Node-1').first()

If Node-1 is the starting point, the possible paths are:

Node-1 -> Node-2
Node-1 -> Node-3
Node-1 -> Node-4 -> Node-5
Node-1 -> Node-4 -> Node-6
Node-1 -> Node-4 -> Node-7

Any ideas? Thanks in advance.

Py2neo supports running Cypher queries, and here is a nice hello-world tutorial on how to do that.
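As a minimal illustration of running Cypher from py2neo, the sketch below does the weighted walk on the client side: it asks Neo4j for the outgoing neighbours of the current node and picks one with random.choices, as in the question. The helper names (NEIGHBOR_QUERY, weighted_step, weighted_walk) and the max_steps guard are my own assumptions, not part of py2neo. The query matches any relationship type, so it should work with the data exactly as the question created it.

```python
import random

# Cypher that lists the outgoing neighbours of one node with their weights.
# It matches any relationship type and reads the `weight` property.
NEIGHBOR_QUERY = (
    "MATCH (a:Node {name: $name})-[r]->(b:Node) "
    "RETURN b.name AS name, r.weight AS weight"
)


def weighted_step(graph, name, rng=random):
    """Return the name of one randomly chosen neighbour, or None at a dead end.

    `graph` is expected to be a py2neo Graph; any object whose
    run(query, params).data() returns a list of dicts works as well.
    """
    rows = graph.run(NEIGHBOR_QUERY, {"name": name}).data()
    if not rows:
        return None
    names = [row["name"] for row in rows]
    weights = [row["weight"] for row in rows]
    # random.choices normalises the weights, so they need not sum to 1.0.
    return rng.choices(names, weights=weights)[0]


def weighted_walk(graph, start, rng=random, max_steps=100):
    """Walk from `start` until a node has no outgoing relationships."""
    path = [start]
    for _ in range(max_steps):  # hypothetical guard against cycles
        nxt = weighted_step(graph, path[-1], rng)
        if nxt is None:
            break
        path.append(nxt)
    return path


# Usage, assuming a running Neo4j instance with the question's data:
#   from py2neo import Graph
#   g = Graph(password='neo4j')
#   print(weighted_walk(g, 'Node-1'))
```

This costs one round trip to the database per step, which is fine for short walks but not for long ones.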

Below is an annotated Cypher query that I hope will be useful to you.

But first, a few notes:

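  • You should not give relationships that serve the same purpose a nearly unbounded number of types (such as "0.2", "0.5", and so on). That is very unhelpful when you want to search for specific relationships by type (one of the most common things you will want to do), and it leads to an enormous number of relationship types. So my answer assumes that the relationships of interest all actually have the type TO.
  • My query uses a temporary Temp node to store the query's interim state as it iterates through the relationships of the random path. The Temp node is deleted at the end of the query.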

The query is as follows:

// Get the (assumed-unique) starting Node `n` 
MATCH (n:Node)
WHERE n.name = 'Node-1'

// Create (if necessary) the unique `Temp` node, and initialize
// it with the native ID of the starting node and an empty `pathRels` list
MERGE (temp:Temp)
SET temp = {id: ID(n), pathRels: []}
WITH temp

// apoc.periodic.commit() repeatedly executes the query passed to it
// until it returns 0 or NULL.
// The query passed here iteratively extends the list of relationships
// in `temp.pathRels`. In each iteration, if the current `temp.id`
// node has any outgoing `TO` relationships, the query:
// - appends to `temp.pathRels` a randomly-selected relationship, taking
//   into account the relationship weights (which MUST sum to 1.0),
// - sets `temp.id` to the ID of the end node of that selected relationship,
// - and returns 1.
// But if the current `temp.id` node has no outgoing `TO` relationships, then
// the query returns 0.
CALL apoc.periodic.commit(
  "
    MATCH (a:Node)
    WHERE ID(a) = $temp.id
    WITH a, [(a)-[rel:TO]->() | rel] AS rels
    LIMIT 1 // apoc.periodic.commit requires a LIMIT clause. `LIMIT 1` should be harmless here.
    CALL apoc.do.when(
      SIZE(rels) > 0,
      '
       WITH temp, a, REDUCE(s={x: rand()}, r IN rels | CASE
         WHEN s.x IS NULL THEN s
         WHEN s.x < r.weight THEN {x: NULL, pathRel: r}
         ELSE {x: s.x - r.weight} END
       ).pathRel AS pathRel
       SET temp.id = ID(ENDNODE(pathRel)), temp.pathRels = temp.pathRels + pathRel
       RETURN 1 AS result
      ',
      '
       RETURN 0 AS result
      ',
      {temp: $temp, a: a, rels: rels}
    ) YIELD value
    RETURN value.result
  ",
  {temp: temp}
) YIELD batchErrors

// Use the `temp.pathRels` list to generate the `weightedRandomPath`
// (or you could just return `pathRels` as-is).
// Then delete the `Temp` node, since it is no longer needed.
// Finally, return `weightedRandomPath`, and also the `batchErrors` returned by
// apoc.periodic.commit() (in case it had any errors). 
WITH temp, apoc.path.create(STARTNODE(temp.pathRels[0]), temp.pathRels) AS weightedRandomPath, batchErrors
DELETE temp
RETURN weightedRandomPath, batchErrors
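
The REDUCE expression in the query does a simple weighted pick: it draws a random number x in [0, 1), then goes through the relationships in order, subtracting each weight from x until x is smaller than the current weight. The Python sketch below mirrors that logic on an in-memory copy of the question's graph, so the selection rule can be checked without a database. The names EDGES, pick_by_weight and random_walk are hypothetical and exist only for this illustration.

```python
import random

# In-memory copy of the question's graph: node -> [(neighbour, weight), ...]
EDGES = {
    "Node-1": [("Node-2", 0.2), ("Node-3", 0.2), ("Node-4", 0.6)],
    "Node-4": [("Node-5", 0.5), ("Node-6", 0.3), ("Node-7", 0.2)],
}


def pick_by_weight(candidates, x):
    """Pick one item given a draw x in [0, 1), like the REDUCE in the query.

    Returns None if x is never smaller than the remaining weight, which can
    happen when the weights sum to less than 1.0.
    """
    for item, weight in candidates:
        if x < weight:
            return item
        x -= weight
    return None


def random_walk(edges, start, rng=random):
    """Follow weighted random steps until a node has no outgoing edges.

    Like the Cypher query, this assumes the graph has no cycles.
    """
    path = [start]
    while edges.get(path[-1]):
        nxt = pick_by_weight(edges[path[-1]], rng.random())
        if nxt is None:
            break
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(random_walk(EDGES, "Node-1"))
```

With the weights of Node-1, a draw of 0.1 selects Node-2, 0.3 selects Node-3, and anything from 0.4 upward selects Node-4, which is why the weights have to sum to 1.0 for the pick to always succeed.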