How to return all movies with the same crew in neo4j?

As I'm new to neo4j, I'm currently experimenting with the neo4j movie database example.

I would like to know the best way to compare subgraphs and relationships, for example, how to get all movies that have the same crew.

Based on other questions here on Stack Overflow, I got it working for returning all movies in which a given set of actors appear together:

WITH ['Tom Hanks', 'Meg Ryan'] as names
MATCH (p:Person)
WHERE p.name in names
WITH collect(p) as persons
WITH head(persons) as head, tail(persons) as persons
MATCH (head)-[:ACTED_IN]->(m:Movie)
WHERE ALL(p in persons WHERE (p)-[:ACTED_IN]->(m))
RETURN m.title

However, how can I retrieve movies that have the same actors without specifying the actors' names?

This query should work:

// match the first movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// match the second movie and all its actors
match (m2:Movie)<-[:ACTED_IN]-(a2:Person)
// avoid matching a movie with itself, or the same pair twice, by requiring id(m1) > id(m2)
where id(m1) > id(m2)
// order actors of m2 by name
with m1, m2, actors1, a2 order by a2.name
// store ordered actors of m2 into actors2 variable
// pass to the next context only when the ordered lists (actors1 and actors2) are equal
with m1, m2, actors1, collect(a2) as actors2 where actors1 = actors2
// return movies that have the same actors
return m1, m2 

Using the movie database (:play movie graph), this query produces the following output:

╒══════════════════════════════════════════════════════════════════════╤══════════════════════════════════════════════════════════════════════╕
│"m1"                                                                  │"m2"                                                                  │
╞══════════════════════════════════════════════════════════════════════╪══════════════════════════════════════════════════════════════════════╡
│{"title":"The Matrix Revolutions","tagline":"Everything that has a beg│{"title":"The Matrix Reloaded","tagline":"Free your mind","released":2│
│inning has an end","released":2003}                                   │003}                                                                  │
└──────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘

Some alternative approaches that may be more efficient (use PROFILE to check):

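Match from movies to actors only once, then collect the results and unwind them as many times as you need to generate a cross product, then filter and compare. This saves you from hitting the database multiple times, since all the data you need comes from the first match. I'm going to borrow Bruno's query and tweak it slightly.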

// match the first movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// collect this data into a single collection
with collect({m:m1, actors:actors1}) as data
// generate cross product of the data
unwind data as d1
unwind data as d2
with d1, d2
// prevent comparison against the same movie, or the same pairs in different orders
where id(d1.m) < id(d2.m) and d1.actors = d2.actors
// return movies that have the same actors
return d1.m, d2.m

Alternatively, you can group movies by their actors and return only the movies in each such group:

// match each movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// group movies with their sets of actors
with collect(m1) as movies, actors1
// only interested in where multiple movies have the same actor sets
where size(movies) > 1
// return the collection of movies with the same actors
return movies

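The second query is probably the better choice here, since you get all movies with the same cast in a single row, rather than one pair per row.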
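
As a rough sketch of how the grouping approach could be profiled and made easier to read, the variant below prefixes the query with PROFILE and returns titles and actor names instead of whole nodes. The aliases cast, titles and castNames are made up for illustration, and sorting by id(a) rather than by name is an assumption meant to keep the ordering unambiguous if two people happen to share a name.

```cypher
// PROFILE runs the query and shows the executed plan with db hits per operator
PROFILE
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
// sort actors by internal id so that equal casts yield identical lists
WITH m, a ORDER BY id(a)
WITH m, collect(a) AS cast
// cast is the grouping key here; collect the titles of the movies that share it
WITH cast, collect(m.title) AS titles
// keep only casts shared by more than one movie
WHERE size(titles) > 1
RETURN titles, [p IN cast | p.name] AS castNames
```

Running the other queries with the same PROFILE prefix should make it possible to compare their db hits on your own data.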