Neo4j - Counting rows with where condition
I am trying to count the number of rows that Neo4j will return, but the count (or the query itself) is very slow.
Version 1 (70 seconds):
MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WHERE order.timestamp >= 1632434400 AND size((order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(:Product)) <= 20
WITH order
MATCH (order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(product:Product)
RETURN COUNT(product);
Version 2 (68 seconds):
MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
// order is carried through WITH so the WHERE below can still read order.timestamp
WITH order, size((order)<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(:Product)) AS amount
WHERE order.timestamp >= 1632434400 AND amount <= 20
RETURN SUM(amount)
This is on Neo4j 4.4 Community, with about 800,000 orders and about 17,000,000 order lines.
Is there a more efficient way to count the rows?
These are the indexes:
CREATE INDEX idx_order_torder_id FOR (n:Order) ON (n.order_id);
CREATE INDEX idx_order_timestamp FOR (n:Order) ON (n.timestamp);
CREATE INDEX idx_person_person_id FOR (n:Person) ON (n.person_id);
CREATE INDEX idx_product_product_id FOR (n:Product) ON (n.product_id);
The row count is 4269011.
Please try the query below; hopefully it returns the result faster.
MATCH (person:Person)-[:HAS_ORDER]->(order:Order)
WHERE order.timestamp >= 1632434400
WITH order.order_id AS orderid
MATCH (o:Order { order_id: orderid })<-[:HAS_ORDER]-(:OrderLine)-[:HAS_PRODUCT]->(product:Product)
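// Group by order so the count (and the <= 20 filter) applies per order, not to the global total
WITH o, COUNT(product) AS productCount
WHERE productCount <= 20
RETURN SUM(productCount);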
Because every order line has exactly one product, I can skip traversing the relationship from order line to product:
MATCH (order:Order)
WHERE order.timestamp >= 1632434400
WITH order
MATCH (order)<-[:HAS_ORDER]-(orderLine:OrderLine)
// Keep order as the grouping key so the count is per order
WITH order, COUNT(orderLine) AS productCount
WHERE productCount <= 20
RETURN SUM(productCount);
This query took 0m17.342s.
But I managed to shave off a couple more seconds with the following query:
MATCH (order:Order)
WHERE order.timestamp >= 1632434400
WITH order, size((order)<-[:HAS_ORDER]-(:OrderLine)) AS amount
WHERE amount <= 20
RETURN SUM(amount);
This query took 0m15.675s.
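To see where the remaining time is spent, one option is to prefix the query with PROFILE, which executes it and reports the rows and database hits for each operator in the plan. A minimal sketch, reusing the last query unchanged apart from the added keyword:

```cypher
// PROFILE runs the query and returns the executed plan
// with rows and db hits per operator.
PROFILE
MATCH (order:Order)
WHERE order.timestamp >= 1632434400
WITH order, size((order)<-[:HAS_ORDER]-(:OrderLine)) AS amount
WHERE amount <= 20
RETURN SUM(amount);
```

Comparing the db hits of the step that finds the orders with those of the expansion over HAS_ORDER should show which part dominates. Whether the planner actually uses idx_order_timestamp for the range predicate is an assumption here; the profile output can confirm it. EXPLAIN can be used instead to view the plan without running the query.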