Triangle Counting/Clustering Neo4j
I want to test triangle clustering in my Neo4j graph. Here is a sample:
CREATE(a:Person { name: "a" })-[:FRIENDS]->(b:Person {name : "b"}),
(a)-[:WORKS_AT]->(p:Business {name : "Mcdonalds"}),
(b)-[:WORKS_AT]->(p),
(c:Person { name: "c"})-[:FRIENDS]->(a),
(c)-[:FRIENDS]->(b),
(d:Person { name: "d"})-[:FRIENDS]->(a)
return *
MATCH (c:Person {name: "c"}),(p:Business {name : "Mcdonalds"}), (d:Person { name: "d"}),(b:Person {name : "b"})
CREATE (c)-[:WORKS_AT]->(p),
(e:Person { name: "e"})-[:FRIENDS]->(c),
(d)-[:FRIENDS]->(c),
(d)-[:FRIENDS]->(e),
(f:Person { name: "f"})-[:FRIENDS]->(b),
(g:Person { name: "g"})-[:FRIENDS]->(b),
(i:Person { name: "i"})-[:FRIENDS]->(b),
(h:Person { name: "h"})-[:FRIENDS]->(b),
(j:Person { name: "j"})-[:FRIENDS]->(b),
(k:Person { name: "k"})-[:FRIENDS]->(b)
return *
MATCH (g:Person {name: "g"}),(f:Person {name: "f"}),(c:Person {name: "c"}), (e:Person {name: "e"})
CREATE (g)-[:FRIENDS]->(c),
(f)-[:FRIENDS]->(c),
(g)-[:FRIENDS]->(e)
return *
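In my sample graph, I want to select nodes a, b, and c based on their :WORKS_AT relationship to Mcdonalds, then look at the nodes they have a :FRIENDS relationship with and run triangle counting on those. I got a partial answer: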
CALL algo.triangleCount(
'MATCH (p:Person)-[]-(:Person)-[:WORKS_AT]-(:Business {name : "Mcdonalds"}) RETURN id(p) as id',
'MATCH (p1:Person)-[:FRIENDS]->(p2:Person) RETURN id(p1) as source, id(p2) as target',
{concurrency:4, write:true, writeProperty:'triangle',graph:'cypher', clusteringCoefficientProperty:'coefficient'})
YIELD loadMillis, computeMillis, writeMillis, nodeCount, triangleCount, averageClusteringCoefficient
But I want something closer to what the streaming example in the documentation lists, with a breakdown by nodeId (node.name in this example), triangles, and coefficient.
I'm getting closer with:
CALL algo.triangleCount.stream(
'MATCH (p:Person)-[]-(:Person)-[:WORKS_AT]-(:Business {name : "Mcdonalds"}) RETURN id(p) as id',
'MATCH (p1:Person)-[:FRIENDS]->(p2:Person) RETURN id(p1) as source, id(p2) as target',
{concurrency:4, write:true, writeProperty:'triangle',graph:'cypher', clusteringCoefficientProperty:'coefficient'})
YIELD nodeId, triangles, coefficient
MATCH (p:Person) WHERE id(p) = nodeId
RETURN p.id as name, triangles, coefficient ORDER BY coefficient DESC
CALL algo.triangleCount.stream('match (p:Person)-[*1..2]-(b:Business) return p', '[]', {concurrency:4})
YIELD nodeId, triangles, coefficient
MATCH (p:Person) WHERE id(p) = nodeId
RETURN p.name AS name, triangles, coefficient
ORDER BY triangles
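This is the answer I came up with. The key thing I was missing was the difference between triangleCount and triangleCount.stream. The stream variant returns one row per node (nodeId, triangles, coefficient), whereas plain triangleCount only returns summary statistics such as timings, node count, and total triangle count.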
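To sanity-check the per-node numbers that the stream call returns, here is a minimal plain-Python sketch that counts triangles and local clustering coefficients over the :FRIENDS edges of the sample graph. It is not how algo.triangleCount is implemented. It assumes relationships are treated as undirected, uses the usual local clustering coefficient 2T / (d(d-1)), and covers the whole :FRIENDS graph rather than the nodes selected by the Cypher projection above. The names `FRIENDS` and `triangle_stats` are made up for this illustration.

```python
from itertools import combinations

# FRIENDS edges copied from the sample CREATE statements, treated as undirected.
FRIENDS = [
    ("a", "b"), ("c", "a"), ("c", "b"), ("d", "a"),
    ("e", "c"), ("d", "c"), ("d", "e"),
    ("f", "b"), ("g", "b"), ("i", "b"), ("h", "b"), ("j", "b"), ("k", "b"),
    ("g", "c"), ("f", "c"), ("g", "e"),
]


def triangle_stats(edges):
    """Return {node: (triangles, local clustering coefficient)}."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    stats = {}
    for node, adj in neighbours.items():
        # A triangle through `node` is a pair of its neighbours that are
        # themselves connected to each other.
        triangles = sum(
            1 for x, y in combinations(sorted(adj), 2) if y in neighbours[x]
        )
        degree = len(adj)
        # Assumed definition: 2T / (d * (d - 1)), and 0 when degree < 2.
        coefficient = 2 * triangles / (degree * (degree - 1)) if degree > 1 else 0.0
        stats[node] = (triangles, coefficient)
    return stats


if __name__ == "__main__":
    for name, (triangles, coefficient) in sorted(triangle_stats(FRIENDS).items()):
        print(name, triangles, round(coefficient, 4))
```

On the sample data this gives c the most triangles (6) and f the highest coefficient (1.0), since both of f's friends are friends with each other. The nodes h, i, j, and k are only connected to b, so they take part in no triangle.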