Neo4j - don't know how to improve Cypher query
I have this query returning very quickly, in 0.5 seconds, with all 303 expected records. Note: "Woka" here means "Book".
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author),
(l:Language)-[t:USED]->(w:Woka)-[u:INCLUDED]->(b:Bisac)
WHERE (a.author_name = 'Camus, Albert')
RETURN w.woka_id as woka_id, p.publisher_name as publisher_name, w.woka_title as woka_title, a.author_name as author_name, l.language_name as language_name, b.bisac_code as bisac_code, b.bisac_value as bisac_value
ORDER BY woka_id;
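I would like to add more information, such as the description. I created Description nodes and relationships between Language and Description, and between Description and the book (Woka).
The query below returns non-null descriptions, but only for 60 records instead of 303, because not all books have a description. Execution time is still fine at 0.3 seconds.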
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author),
(l:Language)-[t:USED]->(w:Woka), (b:Bisac)<-[u:INCLUDED]-(w:Woka),
(d:Description)-[v:HAS_DESCRIPTION]-(w)
WHERE (a.author_name = 'Camus, Albert')
RETURN w.woka_id as woka_id, p.publisher_name as publisher_name, w.woka_title as woka_title, a.author_name as author_name, l.language_name as language_name, b.bisac_code as bisac_code, b.bisac_value as bisac_value, d.description as description
ORDER BY woka_id;
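I know that records are missing from this result set, though: the difference between 50 and 303 comes down to which books have a description. I built another query using OPTIONAL, but that query (shown below) never returns; it runs forever.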
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author),
(l:Language)-[t:USED]->(w:Woka)-[u:INCLUDED]->(b:Bisac)
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w:Woka)-[:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Camus, Albert')
RETURN w.woka_id as woka_id, p.publisher_name as publisher_name, w.woka_title as woka_title, a.author_name as author_name, l.language_name as language_name, b.bisac_code as bisac_code, b.bisac_value as bisac_value, d.description as description
ORDER BY woka_id;
How can I improve the query so that I get the original result set of 303 records, with the description where one exists and null where it does not?
Can you try this?
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author), (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WHERE (a.author_name = 'Camus, Albert')
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w.woka_id as woka_id, p.publisher_name as publisher_name, w.woka_title as woka_title, a.author_name as author_name, l.language_name as language_name, b.bisac_code as bisac_code, b.bisac_value as bisac_value, d.description as description
ORDER BY woka_id;
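I think we have discussed this topic before.
You have to bring down your intermediate cardinalities.
Use directions in your relationships.
Don't repeat patterns that you have already solved, e.g.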
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w:Woka)-[:AUTHORED]-(a:Author)
should be
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
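If you match long paths, you create many potential matches in between. The next match(es) are executed for each of those rows, and if each of them produces several rows per input row, you end up with the product rows1 * rows2 * rows3.
So you have to use DISTINCT or an aggregation in between to keep the cardinality as low as possible.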
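Just to demonstrate it on your first example, once with DISTINCT and once with collect. It is probably not necessary here, since the example is small enough, but it shows the idea.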
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Camus, Albert')
WITH DISTINCT w,a,p
MATCH (l:Language)-[t:USED]->(w)
WITH w,a,p, collect(l) as languages
MATCH (w)-[u:INCLUDED]->(b:Bisac)
RETURN w.woka_id as woka_id, w.woka_title as woka_title,
p.publisher_name as publisher_name,
a.author_name as author_name,
[l in languages | l.language_name] as language_names,
b.bisac_code as bisac_code, b.bisac_value as bisac_value
ORDER BY woka_id;
You were right to use OPTIONAL MATCH, but again you have to take into account that the potential extra rows multiply.
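Putting the two points together, a rough sketch of the full query could look like the following. It only uses the labels, relationship types and properties from the question. The direction of HAS_DESCRIPTION is not stated there, so it is left undirected here; add the direction once you know it. Because languages and descriptions are collected into lists, this returns one row per book, publisher and Bisac code, so the row count will not necessarily be 303.

```cypher
// Sketch only, not tested against the real data.
MATCH (p:Publisher)-[:PUBLISHED]->(w:Woka)<-[:AUTHORED]-(a:Author)
WHERE a.author_name = 'Camus, Albert'
WITH DISTINCT w, a, p
MATCH (l:Language)-[:USED]->(w)
WITH w, a, p, collect(l.language_name) AS language_names
// OPTIONAL MATCH keeps books without a description; d is null for them.
OPTIONAL MATCH (w)-[:HAS_DESCRIPTION]-(d:Description)
// collect() skips nulls, so books without a description get an empty list.
WITH w, a, p, language_names, collect(d.description) AS descriptions
MATCH (w)-[:INCLUDED]->(b:Bisac)
RETURN w.woka_id AS woka_id, w.woka_title AS woka_title,
       p.publisher_name AS publisher_name,
       a.author_name AS author_name,
       language_names,
       b.bisac_code AS bisac_code, b.bisac_value AS bisac_value,
       descriptions
ORDER BY woka_id;
```

If you prefer one row per description with a null for books that have none, return d.description directly after the OPTIONAL MATCH instead of collecting it.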
An alternative to OPTIONAL MATCH is to use a path expression and destructuring, e.g. for the description:
RETURN w.woka_id as woka_id, w.woka_title as woka_title,
[p in ()<-[:HAS_DESCRIPTION]-(w) | head(nodes(p)).description] as descriptions
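On newer Neo4j versions (3.1 and later, as far as I know) the same idea can be written as a pattern comprehension. The sketch below is a trimmed-down query that only returns the book and its descriptions. It assumes HAS_DESCRIPTION points from the book to the description, as in the fragment above; the question does not confirm the direction.

```cypher
// Sketch only: assumes (w)-[:HAS_DESCRIPTION]->(:Description).
MATCH (a:Author)-[:AUTHORED]->(w:Woka)
WHERE a.author_name = 'Camus, Albert'
RETURN w.woka_id AS woka_id, w.woka_title AS woka_title,
       // Evaluated per book; yields an empty list when there is no description.
       [(w)-[:HAS_DESCRIPTION]->(d:Description) | d.description] AS descriptions
ORDER BY woka_id;
```

Since the comprehension is evaluated per row and returns a list, it does not add rows the way an extra MATCH does.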
In addition to @pablosaraiva's reply, make sure you have an index on :Author for the property author_name:
create index on :Author(author_name)
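The statement above is the syntax of the older Neo4j releases. On Neo4j 4.x and later the equivalent would look roughly like this; the index name author_name_idx is just a placeholder I made up.

```cypher
// Newer index syntax; author_name_idx is a hypothetical name.
CREATE INDEX author_name_idx FOR (a:Author) ON (a.author_name);
```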
If this and pablo's reply don't help, please post the query plan for your query. Use explain <myquery> for that (assuming you are on >= 2.2).
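For example, applied to a shortened version of the query from the question (a sketch, with the RETURN clause trimmed for brevity):

```cypher
// EXPLAIN shows the planned operators without running the query,
// so it is safe to use on the query that never returns.
EXPLAIN
MATCH (p:Publisher)-[:PUBLISHED]->(w:Woka)<-[:AUTHORED]-(a:Author)
WHERE a.author_name = 'Camus, Albert'
OPTIONAL MATCH (d:Description)-[:HAS_DESCRIPTION]-(w)
RETURN w.woka_id AS woka_id, d.description AS description
ORDER BY woka_id;
```

Replacing EXPLAIN with PROFILE runs the query and reports the actual rows and db hits per operator, which makes the cardinality blow-up visible, but only for queries that finish.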