How to fetch a path by a given breadcrumb string in Cypher (Neo4j)?

Initial situation

CREATE
    (root:Root {name:'Root'}),
    (dirA:Directory {name:'dir A'}),
    (dirB:Directory {name:'dir B'}),
    (dirC:Directory {name:'dir C'}),
    (dirD:Directory {name:'dir D'}),
    (dirE:Directory {name:'dir E'}),
    (dirF:Directory {name:'dir F'}),
    (dirG:Directory {name:'dir G'}),
    (root)-[:CONTAINS]->(dirA),
    (root)-[:CONTAINS]->(dirB),
    (dirA)-[:CONTAINS]->(dirC),
    (dirA)-[:CONTAINS]->(dirD),
    (dirD)-[:CONTAINS]->(dirE),
    (dirD)-[:CONTAINS]->(dirF),
    (dirD)-[:CONTAINS]->(dirG);

Given input parameter

Example:

// Split the breadcrumb into its directory names, one row per name
WITH 'dir A/dir D/dir G' AS inputString
WITH split(inputString, '/') AS directories
UNWIND directories AS directory
RETURN directory;

╒═══════════╕
│"directory"│
╞═══════════╡
│"dir A"    │
├───────────┤
│"dir D"    │
├───────────┤
│"dir G"    │
└───────────┘

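Challenge to solve

For a given breadcrumb string ("dir A/dir D/dir G"), I need the path it represents in Cypher, as part of a more complex query. I can't simply search the tree for the last directory entry of the breadcrumb ("dir G"), because directory names are not unique. How can this be done in Cypher?

Expected result: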

╒═══════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│"path"                                                                                                         │
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│[{"name":"Root"},{},{"name":"dir A"},{"name":"dir A"},{},{"name":"dir D"},{"name":"dir D"},{},{"name":"dir G"}]│
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

For a case like this, I'd suggest giving each :Directory node its full path as a property, which makes matching a directory by its path much easier:

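// Walk from the root to every directory below it
MATCH path = (:Root)-[:CONTAINS*]->(d:Directory)
// Drop the root node, then collect the directory names along the path
WITH d, [node IN tail(nodes(path)) | node.name] AS directories
// apoc.text.join requires the APOC library
WITH d, apoc.text.join(directories, '/') AS pathString
SET d.path = pathString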

(If a directory is moved within the tree, you can use a similar query to update the path of that directory and its subdirectories.)
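
A minimal sketch of such an update, under a few assumptions: the CONTAINS relationships have already been rewired, every directory has exactly one parent, and the moved directory can still be found through its stale path property. The parameter $oldPath is a hypothetical name introduced here for illustration.

```cypher
// $oldPath (hypothetical parameter): the path the directory had before the move
MATCH (moved:Directory {path: $oldPath})
// The moved directory itself (zero hops) plus everything below it
MATCH (moved)-[:CONTAINS*0..]->(d:Directory)
// Recompute each affected path from the root, as in the query above
MATCH path = (:Root)-[:CONTAINS*]->(d)
WITH d, [node IN tail(nodes(path)) | node.name] AS directories
SET d.path = apoc.text.join(directories, '/')
```

Limiting the rewrite to the moved subtree avoids touching directories whose paths did not change.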

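With this property set, it is easy to match the end node of the path, even if you don't supply the part of the path above the section of interest (you didn't mention whether the path you provide always extends from the root or is just the tail end of a path):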

WITH 'dir A/dir D/dir G' as inputString
MATCH (end:Directory)
WHERE end.path ENDS WITH inputString
RETURN end

So, provided that :Directory(path) is indexed, you have quick access to the end node. Now to find the rest.
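
Before moving on, a sketch of the index mentioned above. It assumes Neo4j 4.4 or later, where text indexes exist and are, as far as I know, the index type intended for ENDS WITH and CONTAINS predicates on string properties. The index name directory_path_text is made up for illustration, and older releases use a different syntax, so check the documentation for your version.

```cypher
// Hypothetical index name; assumes Neo4j 4.4+ text index syntax
CREATE TEXT INDEX directory_path_text IF NOT EXISTS
FOR (d:Directory) ON (d.path);
```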

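We can use a variable-length path expression to find the full paths to those nodes, using the all() predicate to ensure that every node in the path has a name from the split input; this predicate is checked during expansion. This gets us paths to the nodes we want (with a single additional traversal wasted on the parent node above), but it does not guarantee order, so we have to filter afterward.

This should work for your example graph: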

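WITH 'dir A/dir D/dir G' AS inputString
WITH inputString, split(inputString, '/') AS dirNames
// Find candidate end nodes through the stored path property
MATCH (end:Directory)
WHERE end.path ENDS WITH inputString
// Expand upward only through nodes whose name appears in the breadcrumb
MATCH path = (start)-[:CONTAINS*]->(end)
WHERE all(node IN nodes(path) WHERE node.name IN dirNames)
// Carry dirNames along so the filter below can still reference it
WITH path, dirNames
// Keep only the path whose names match the breadcrumb in order
WHERE length(path) + 1 = size(dirNames) AND [node IN nodes(path) | node.name] = dirNames
RETURN path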
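
If the breadcrumb always starts directly below the root (an assumption, since the question leaves this open), the lookup can probably be simplified, and the returned path would then also include the Root node as in the expected result. The two statements below are sketches under that assumption. The first relies on the stored path property and on every directory having exactly one parent. The second needs no stored property but expands every path from the root, so it is likely only reasonable for small trees.

```cypher
// Variant 1: exact match on the stored path, then walk down from the root
WITH 'dir A/dir D/dir G' AS inputString
MATCH (end:Directory {path: inputString})
MATCH path = (:Root)-[:CONTAINS*]->(end)
RETURN path;

// Variant 2: no stored property; compare the names along each root path
WITH split('dir A/dir D/dir G', '/') AS dirNames
MATCH path = (:Root)-[:CONTAINS*]->(:Directory)
// One relationship per directory, and names must match in order
WHERE length(path) = size(dirNames)
  AND [node IN tail(nodes(path)) | node.name] = dirNames
RETURN path;
```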