Neo4j: why is the allShortestPaths function so slow?
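I am using Neo4j version 'neo4j-community-2.3.0-RC1'.
My database contains only 1054 nodes.
When I run a path query with the 'allShortestPaths' function, why is it so slow?
It takes more than 1 second. The unit test results are as follows: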
√ search optimalPath Path (192ms)
√ search optimal Path by Lat Lng (1131ms)
Should I optimize the queries? Below are the queries for 'optimalPath' and 'optimal Path by Lat Lng'.
The optimalPath query:
MATCH path=allShortestPaths((start:潍坊_STATION )-[rels*..50]->(end:潍坊_STATION {name:"火车站"}))
RETURN NODES(path) AS stations,relationships(path) AS path,length(path) AS stop_count,
length(FILTER(index IN RANGE(1, length(rels)-1) WHERE (rels[index]).bus <> (rels[index - 1]).bus)) AS transfer_count,
length(FILTER( rel IN rels WHERE type(rel)="WALK" )) AS walk_count
order by transfer_count,walk_count,stop_count
The optimal Path by Lat Lng query:
MATCH path=allShortestPaths((start:潍坊_STATION {name:"公交总公司"})-[rels*..50]->(end:潍坊_STATION {name:"火车站"}))
WHERE
round(
6378.137 *1000*2*
asin(sqrt(
sin((radians(start.lat)-radians(36.714))/2)^2+cos(radians(start.lat))*cos(radians(36.714))*
sin((radians(start.lng)-radians(119.1268))/2)^2
))
)/1000 < 0.5 // this formula is used to calculate the distance between two GEO coordinate (latitude\longitude)
RETURN NODES(path) AS stations,relationships(path) AS path,length(path) AS stop_count,
length(FILTER(index IN RANGE(1, length(rels)-1) WHERE (rels[index]).bus <> (rels[index - 1]).bus)) AS transfer_count,
length(FILTER( rel IN rels WHERE type(rel)="WALK" )) AS walk_count
order by transfer_count,walk_count,stop_count
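You can download the database here: https://www.dropbox.com/s/zamkyh2aaw3voe6/data.rar?dl=0
I would be grateful if anyone can help. Thanks.
In general, without knowing more, I would pull out the predicates and expressions that can be computed before the match, so they are evaluated before the paths are fully expanded.
And since your geo filter depends on nothing other than your parameters and the start node, you can do the following: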
MATCH (start:潍坊_STATION {name:"公交总公司"})
WHERE
// this formula is used to calculate the distance between two GEO coordinate (latitude\longitude)
round(6378.137 *1000*2*
asin(sqrt(sin((radians(start.lat)-radians({lat}))/2)^2
+cos(radians(start.lat))*cos(radians({lat}))*
sin((radians(start.lng)-radians({lng}))/2)^2)))/1000
< 0.5
MATCH (end:潍坊_STATION {name:"火车站"})
MATCH path=allShortestPaths((start)-[rels*..50]->(end))
RETURN NODES(path) AS stations,
relationships(path) AS path,
length(path) AS stop_count,
length(FILTER(index IN RANGE(1, length(rels)-1)
WHERE (rels[index]).bus <> (rels[index - 1]).bus)) AS transfer_count,
length(FILTER( rel IN rels WHERE type(rel)="WALK" )) AS walk_count
ORDER BY transfer_count,walk_count,stop_count;
See this test (the other query is equally fast, though):
neo4j-sh (?)$ MATCH (start:潍坊_STATION {name:"公交总公司"})
>
> WHERE
> // this formula is used to calculate the distance between two GEO coordinate (latitude\longitude)
> round(6378.137 *1000*2*
> asin(sqrt(sin((radians(start.lat)-radians({lat}))/2)^2
> +cos(radians(start.lat))*cos(radians({lat}))*
> sin((radians(start.lng)-radians({lng}))/2)^2)))/1000
> < 0.5
>
> MATCH (end:潍坊_STATION {name:"火车站"})
> MATCH path=allShortestPaths((start)-[rels*..50]->(end))
> WITH NODES(path) AS stations,
> relationships(path) AS path,
> length(path) AS stop_count,
> length(FILTER(index IN RANGE(1, length(rels)-1)
> WHERE (rels[index]).bus <> (rels[index - 1]).bus)) AS transfer_count,
> length(FILTER( rel IN rels WHERE type(rel)="WALK" )) AS walk_count
>
> ORDER BY transfer_count,walk_count,stop_count
> RETURN count(*);
+----------+
| count(*) |
+----------+
| 320 |
+----------+
1 row
10 ms
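As a sanity check on the distance predicate, here is a minimal Python sketch of the same haversine expression used in the WHERE clause. The function names `haversine_km` and `is_near` are made up for illustration and are not part of Neo4j or of the original queries. The sketch only shows that the filter is a pure function of the start node's coordinates and the two parameters, which is why it can be evaluated before any path is expanded.

```python
import math

# Same equatorial radius (in km) as the constant in the Cypher query.
EARTH_RADIUS_KM = 6378.137


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km, mirroring the Cypher expression term by term."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi1 - phi2
    d_lambda = math.radians(lng1) - math.radians(lng2)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # The query rounds the distance to whole meters, then converts to km.
    meters = round(EARTH_RADIUS_KM * 1000 * 2 * math.asin(math.sqrt(a)))
    return meters / 1000


def is_near(lat, lng, ref_lat, ref_lng, max_km=0.5):
    """Hypothetical helper: the `< 0.5` filter from the WHERE clause."""
    return haversine_km(lat, lng, ref_lat, ref_lng) < max_km


if __name__ == "__main__":
    # Reference point taken from the literal values in the question's query.
    print(is_near(36.714, 119.1268, 36.714, 119.1268))
```

Because nothing in `is_near` refers to the path or the end node, the equivalent Cypher predicate can be applied to `start` alone, as the rewritten query does.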