Neo4j - Counting rows with where condition

I'm trying to count the number of rows that Neo4j will return, but the count (or the query itself) is very slow.

Version 1 (70 seconds):

MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WHERE order.timestamp >= 1632434400 AND size((order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(:Product)) <= 20
WITH order
MATCH (order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(product:Product)
RETURN COUNT(product);

Version 2 (68 seconds):

MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WITH order, size((order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(:Product)) AS amount
WHERE order.timestamp >= 1632434400 AND amount <= 20
RETURN SUM(amount)

I'm using Neo4j 4.4 Community with about 800,000 orders and about 17,000,000 order lines.

Is there a more efficient way to count the rows?

These are the indexes:

CREATE INDEX idx_order_torder_id FOR (n:Order) ON (n.order_id);
CREATE INDEX idx_order_timestamp FOR (n:Order) ON (n.timestamp);
CREATE INDEX idx_person_person_id FOR (n:Person) ON (n.person_id);
CREATE INDEX idx_product_product_id FOR (n:Product) ON (n.product_id);

The row count is 4269011.

Explain plan: [image not included]
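For reference, a plan like the one mentioned above can be produced by prefixing a query with EXPLAIN (shows the plan without running the query) or PROFILE (runs the query and reports rows and db hits per operator, so it takes about as long as the query itself). A minimal sketch using version 1 unchanged:

```cypher
// PROFILE executes the query and annotates each operator with rows and db hits.
// Swap PROFILE for EXPLAIN to see the plan without executing anything.
PROFILE
MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WHERE order.timestamp >= 1632434400 AND size((order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(:Product)) <= 20
WITH order
MATCH (order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(product:Product)
RETURN COUNT(product);
```

The operators with the most db hits are usually where the time goes.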

Please try the query below; hopefully it returns the result faster.

MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WHERE order.timestamp >= 1632434400 
WITH order.order_id AS orderid
MATCH (o:Order { order_id: orderid })<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(product:Product)
// keep o as the grouping key so the count is per order, not one global total
WITH o, COUNT(product) AS productCount
WHERE productCount <= 20
RETURN productCount;

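Since every order line has exactly one product, I can skip traversing the relationship from order line to product and count the order lines instead: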

MATCH (order:Order) 
WHERE order.timestamp >= 1632434400 
WITH order 
MATCH (order)<-[:HAS_ORDER]-(orderLine:OrderLine) 
WITH order, COUNT(orderLine) AS productCount
WHERE productCount <= 20 
RETURN SUM(productCount);

This query took 0m17.342s.

But I managed to shave off a couple more seconds with the following query:

MATCH (order:Order) 
WHERE order.timestamp >= 1632434400
WITH order, size((order)<-[:HAS_ORDER]-(:OrderLine)) AS amount 
WHERE amount <= 20 
RETURN SUM(amount);

This query took 0m15.675s.
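The two faster queries above are only equivalent to the original count if every order line really has exactly one product. A quick sanity check, sketched in Neo4j 4.4 syntax and assuming the labels and relationship types used in the queries above (the alias violatingOrderLines is just an illustrative name):

```cypher
// Count order lines that do NOT have exactly one HAS_PRODUCT relationship
// to a Product node. If the assumption holds, this count is zero.
MATCH (orderLine:OrderLine)
WHERE size((orderLine)-[:HAS_PRODUCT]->(:Product)) <> 1
RETURN COUNT(orderLine) AS violatingOrderLines;
```

If this returns anything other than zero, counting order lines will not match the product count from versions 1 and 2.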