Union and Count Collections in Neo4j
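My simple database contains 'terms' and 'codes' nodes that are linked to each other.
There are two types of relationships.
The relationship between 'terms' and 'codes' is called :CODE and is undirected (or reads the same in both directions).
The relationship between 'terms' is called :NT (which stands for "narrower term") and is directed.
I want to traverse all 'terms' from top to bottom, collect all unique codes, and count them.
Here is my query: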
MATCH (a)-[:NT*]->(b), (a)-[:CODE]-(c), (b)-[:CODE]-(d)
WHERE a.btqty = 0
RETURN a.termid AS termid, a.maxlen AS maxlen, COUNT(DISTINCT c.code) + COUNT(DISTINCT d.code) AS total, COLLECT(DISTINCT c.code) + COLLECT(DISTINCT d.code) AS codes
ORDER BY termid;
This is what I get:
termid maxlen total codes
22 2 3 ["S70","S43","S70"]
25 4 9 ["S20","S21","S54","S61","S63","S63","S21","S61","S54"]
26 2 9 ["S99","S98","S29","S13","S13","S20","S29","S14","S15"]
68 5 13 ["S38","S11","S12","S11","S12","S38","S37","S21","S36","S22","S98","S63","S58"]
123 2 3 ["S38","S12","S12"]
154 2 2 ["S58","S58"]
155 4 3 ["S63","S62","S63"]
159 2 2 ["S36","S36"]
...
I need to remove the duplicates from the collections and count them correctly, like this:
termid maxlen total codes
22 2 2 ["S43","S70"]
25 4 5 ["S20","S21","S54","S61","S63"]
26 2 7 ["S99","S98","S29","S13","S20","S14","S15"]
68 5 10 ["S38","S11","S12","S37","S21","S36","S22","S98","S63","S58"]
123 2 2 ["S38","S12"]
154 2 1 ["S58"]
155 4 2 ["S63","S62"]
159 2 1 ["S36"]
...
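I think this calls for the REDUCE function, but I don't know how to use it.
Thanks for your help!
You're right, this can be solved using REDUCE. Inside the reduce you need to check whether the current element already exists in the accumulator and extend the accumulator only if it does not: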
MATCH (a)-[:NT*]->(b), (a)-[:CODE]-(c), (b)-[:CODE]-(d)
WHERE a.btqty = 0
WITH a.termid AS termid, a.maxlen AS maxlen,
     // Fold the concatenated list into uniqueCodes, skipping values already seen
     REDUCE(uniqueCodes = [],
            x IN COLLECT(DISTINCT c.code) + COLLECT(DISTINCT d.code) |
            CASE WHEN x IN uniqueCodes THEN uniqueCodes ELSE uniqueCodes + x END
     ) AS codes
// size() returns the length of the list; count(codes) would count rows instead
RETURN termid, maxlen, size(codes) AS total, codes
ORDER BY termid
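As an alternative sketch that avoids REDUCE, you could unwind the concatenated list and collect it again with DISTINCT. This assumes the same data model and MATCH pattern as in the question. The names allCodes and code are only illustrative, and I have not run this against your data:

```cypher
MATCH (a)-[:NT*]->(b), (a)-[:CODE]-(c), (b)-[:CODE]-(d)
WHERE a.btqty = 0
// Build one list per top-level term; it may still contain duplicates
WITH a, collect(DISTINCT c.code) + collect(DISTINCT d.code) AS allCodes
// Turn the list back into rows so DISTINCT can be applied across both sources
UNWIND allCodes AS code
WITH a, collect(DISTINCT code) AS codes
// size() gives the number of unique codes in the list
RETURN a.termid AS termid, a.maxlen AS maxlen, size(codes) AS total, codes
ORDER BY termid
```

Both variants should give the same unique codes per term. The order of the elements inside codes is not something I would rely on in either case.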