Query performance when adding a new node in Neo4j

I'm trying to work out why my Cypher query is taking so long.

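Basically, I have a small family tree (two families), and I'm trying to add a new node to each family that carries some metadata, so that the families are easier to isolate from each other when they are queried. (Thanks to @Tim Kuehn.)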

Once I run the queries that populate my two families, I have this, which builds quickly with no problems:

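Next, I want to create the new nodes mentioned above. The first node is created quickly; it applies to the smaller family (I call them family B):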

// 'add a :Family node for each relational group, like so:'

CREATE (famB:Family) 
WITH famB
MATCH (a:Person {name:"Gramps Johnson"})-[:RELATED_TO*]->(b:Person)  
MERGE (famB:Family)<-[:FAMILY]-(a) 
MERGE (famB:Family)<-[:FAMILY]-(b) 

...which gives me this. So far, so good!

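Moving on, though, the node for the slightly larger family never gets created, for some reason. The code is the same, but the query just runs and runs...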

// 'add a :Family node for each relational group, like so:'

CREATE (famA:Family) 
WITH famA
MATCH (a:Person {name:"Gramps Doe"})-[:RELATED_TO*]->(b:Person)  
MERGE (famA:Family)<-[:FAMILY]-(a) 
MERGE (famA:Family)<-[:FAMILY]-(b)

Why would this happen?

My first thought was to put an index on the name property:

// put an index on the name property of the nodes:
// CREATE INDEX ON :Person(name)  

But that didn't make any difference.

So I tried looking at EXPLAIN, but it didn't really tell me anything. (It also runs forever when executed in the terminal itself.)

Thanks for your help.

Here is the code I used to create the graph:

// FAMILY A1: create grandparents, their son.

CREATE (grampsdoe:Person {name: 'Gramps Doe', id:'1', Gender:'Male', Diagnosis: 'Alzheimers', `Is Alive?`: 'No', Handedness: 'Left', `Risk Score`: 'PURPLE'})
CREATE (gramsdoe:Person {name: 'Grams Doe', id:'2', Gender:'Female', Diagnosis: 'Alzheimers', `Is Alive?`: 'No', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})
CREATE (daddoe:Person {name: 'Dad Doe', id:'3', Gender:'Male', Diagnosis: 'MCI', `Is Alive?`: 'No', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})

CREATE
(grampsdoe)-[:RELATED_TO {relationship: 'Husband'}]->(gramsdoe),
(gramsdoe)-[:RELATED_TO {relationship: 'Wife'}]->(grampsdoe),
(grampsdoe)-[:RELATED_TO {relationship: 'Father'}]->(daddoe),
(gramsdoe)-[:RELATED_TO {relationship: 'Mother'}]->(daddoe),
(daddoe)-[:RELATED_TO {relationship: 'Son'}]->(grampsdoe),
(daddoe)-[:RELATED_TO {relationship: 'Son'}]->(gramsdoe)


// FAMILY A2: create grandparents, their daughter

CREATE (grampssmith:Person {name: 'Gramps Smith', id:'4', Gender:'Male', Diagnosis: 'Normal', `Is Alive?`: 'No', Handedness: 'Left', `Risk Score`: 'PURPLE'})
CREATE (gramssmith:Person {name: 'Grams Smith', id:'5', Gender:'Female', Diagnosis: 'Alzheimers', `Is Alive?`: 'No', Handedness: 'Ambidextrous', `Risk Score`: 'PURPLE'})
CREATE (momsmith:Person {name: 'Mom Doe', id:'6', Gender:'Female', Diagnosis: 'Alzheimers', `Is Alive?`: 'No', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})

CREATE
(grampssmith)-[:RELATED_TO {relationship: 'Husband'}]->(gramssmith),
(gramssmith)-[:RELATED_TO {relationship: 'Wife'}]->(grampssmith),
(grampssmith)-[:RELATED_TO {relationship: 'Father'}]->(momsmith),
(gramssmith)-[:RELATED_TO {relationship: 'Mother'}]->(momsmith),
(momsmith)-[:RELATED_TO {relationship: 'Daughter'}]->(grampssmith),
(momsmith)-[:RELATED_TO {relationship: 'Daughter'}]->(gramssmith)


// FAMILY A3: 'Dad Doe' and 'Mom Smith' get married and have 2 kids who are twins
CREATE (lilbro:Person {name: 'Lil Bro', id:'7', Gender:'Male', Diagnosis: 'Normal', `Is Alive?`: 'Yes', Handedness: 'Right', `Risk Score`: 'PURPLE'})
CREATE (bigsis:Person {name: 'Big Sis', id:'8', Gender:'Female', Diagnosis: 'Normal', `Is Alive?`: 'Yes', Handedness: 'Right', `Risk Score`: 'PURPLE'})

CREATE (daddoe)-[:RELATED_TO {relationship: 'Husband'}]->(momsmith)
CREATE (momsmith)-[:RELATED_TO {relationship: 'Wife'}]->(daddoe) 

CREATE (lilbro)-[:RELATED_TO {relationship: 'Brother'}]->(bigsis)

CREATE
(lilbro)-[:RELATED_TO {relationship: 'Grandson'}]->(grampsdoe),
(grampsdoe)-[:RELATED_TO {relationship: 'Grandfather'}]->(lilbro),
(lilbro)-[:RELATED_TO {relationship: 'Grandson'}]->(grampssmith),
(grampssmith)-[:RELATED_TO {relationship: 'Grandfather'}]->(lilbro),

(lilbro)-[:RELATED_TO {relationship: 'Grandson'}]->(grampssmith),
(grampssmith)-[:RELATED_TO {relationship: 'Grandmother'}]->(lilbro),
(lilbro)-[:RELATED_TO {relationship: 'Grandson'}]->(gramssmith),
(gramssmith)-[:RELATED_TO {relationship: 'Grandmother'}]->(lilbro),


(lilbro)-[:RELATED_TO {relationship: 'Son'}]->(daddoe),
(daddoe)-[:RELATED_TO {relationship: 'Father'}]->(lilbro),
(lilbro)-[:RELATED_TO {relationship: 'Son'}]->(momsmith),
(momsmith)-[:RELATED_TO {relationship: 'Mother'}]->(lilbro),

(bigsis)-[:RELATED_TO {relationship: 'Sister'}]->(lilbro),

(bigsis)-[:RELATED_TO {relationship: 'Granddaughter'}]->(grampsdoe),
(grampsdoe)-[:RELATED_TO {relationship: 'Grandfather'}]->(bigsis),
(bigsis)-[:RELATED_TO {relationship: 'Granddaughter'}]->(grampssmith),
(grampssmith)-[:RELATED_TO {relationship: 'Grandfather'}]->(bigsis),

(bigsis)-[:RELATED_TO {relationship: 'Granddaughter'}]->(gramsdoe),
(gramsdoe)-[:RELATED_TO {relationship: 'Grandmother'}]->(bigsis),
(bigsis)-[:RELATED_TO {relationship: 'Granddaughter'}]->(gramssmith),
(gramssmith)-[:RELATED_TO {relationship: 'Grandfather'}]->(bigsis),


(bigsis)-[:RELATED_TO {relationship: 'Daughter'}]->(daddoe),
(daddoe)-[:RELATED_TO {relationship: 'Father'}]->(bigsis),
(bigsis)-[:RELATED_TO {relationship: 'Daughter'}]->(momsmith),
(momsmith)-[:RELATED_TO {relationship: 'Mother'}]->(bigsis)



// FAMILY B1: create grandparents, their son.

CREATE (grampsjohnson:Person {name: 'Gramps Johnson', id:'9', Gender:'Male', Diagnosis: 'Normal', `Is Alive?`: 'No', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})
CREATE (gramsjohnson:Person {name: 'Grams Johnson', id:'10', Gender:'Female', Diagnosis: 'Normal', `Is Alive?`: 'No', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})
CREATE (johnjohnson:Person {name: 'John Johnson', id:'11', Gender:'Male', Diagnosis: 'MCI', `Is Alive?`: 'Yes', Handedness: 'Right', `Risk Score`: 'GIRAFFE'})

CREATE
(grampsjohnson)-[:RELATED_TO {relationship: 'Husband'}]->(gramsjohnson),
(gramsjohnson)-[:RELATED_TO {relationship: 'Wife'}]->(grampsjohnson),
(grampsjohnson)-[:RELATED_TO {relationship: 'Father'}]->(johnjohnson),
(gramsjohnson)-[:RELATED_TO {relationship: 'Mother'}]->(johnjohnson),
(johnjohnson)-[:RELATED_TO {relationship: 'Son'}]->(grampsjohnson),
(johnjohnson)-[:RELATED_TO {relationship: 'Son'}]->(gramsjohnson)

Why would this happen?

This happens because the second family is no longer a simple loop; it is "everyone connected twice to everyone". That means this part of the "make a family node" code:

MATCH (a:Person {name:"Gramps Doe"})-[:RELATED_TO*]->(b:Person)  

is traversing a huge number of paths through the graph, and the system bogs down as a result.
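One way to see the blow-up is to count the matched paths at each length under a small upper bound. This is only a diagnostic sketch: the bound of 4 and the aliases `hops`, `paths`, and `distinctPeople` are my own choices for illustration, not part of the original queries.

```cypher
// Diagnostic sketch: count how many paths exist at each length.
// The upper bound (4) is an arbitrary small value chosen to keep this cheap.
MATCH p = (a:Person {name: 'Gramps Doe'})-[:RELATED_TO*1..4]->(b:Person)
RETURN length(p) AS hops,
       count(p) AS paths,
       count(DISTINCT b) AS distinctPeople
ORDER BY hops
```

If `paths` grows much faster than `distinctPeople` as `hops` increases, most of the work is going into reaching the same people again along different routes.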

Since there are 8 nodes in the target group, I limited the path length to between 1 and 8 hops ([:RELATED_TO*1..8]):

CREATE (famA:Family) 
WITH famA
MATCH (a:Person {name:"Gramps Doe"})-[:RELATED_TO*1..8]->(b:Person)  
MERGE (famA:Family)<-[:FAMILY]-(a) 
MERGE (famA:Family)<-[:FAMILY]-(b)

and it ran to completion.
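The bounded match can still return the same person many times, once per path. A possible variation, offered as a sketch rather than a tested fix, is to reduce the matches to distinct people before writing anything. The alias `members` and the use of FOREACH are my own assumptions; the pattern also refers to the already-bound `famA` without repeating the `:Family` label, since some Neo4j versions may reject re-declaring a label on a bound variable.

```cypher
// Sketch: collect each reachable person once, then link them all to one :Family node.
MATCH (a:Person {name: 'Gramps Doe'})-[:RELATED_TO*1..8]->(b:Person)
WITH a, collect(DISTINCT b) AS members
// Only one row remains here, so exactly one :Family node is created.
CREATE (famA:Family)
MERGE (famA)<-[:FAMILY]-(a)
// MERGE skips people who are already linked (for example, a itself).
FOREACH (m IN members | MERGE (famA)<-[:FAMILY]-(m))
```

This way the write clauses run once per person instead of once per matched path.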

To get the whole family when a disease shows up a certain number of times:

// count the family members with a disease
MATCH (f:Family)<-[:FAMILY]-(person:Person) 
WHERE person.Diagnosis = "Alzheimers" 
WITH f, count(person) AS Count 
WHERE Count > 2 

// Then report the family members as a single collection
MATCH (a:Person)-[r1:FAMILY]-(f)
RETURN collect(DISTINCT a)
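If more than one family passes the threshold, the query above merges everyone into a single collection. A hedged variation that keeps the families apart is sketched below; the aliases `affected` and `members` are hypothetical names introduced for this example.

```cypher
// Sketch: report qualifying families separately instead of as one merged list.
MATCH (f:Family)<-[:FAMILY]-(person:Person)
WHERE person.Diagnosis = 'Alzheimers'
WITH f, count(person) AS affected
WHERE affected > 2
// Re-match all members of each qualifying family, grouped by family node.
MATCH (f)<-[:FAMILY]-(member:Person)
RETURN f, affected, collect(DISTINCT member.name) AS members
```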