Neo4j batch delete

I use the following utility class to clean up a Neo4j database:

public class Neo4jUtils {

    final static Logger logger = LoggerFactory.getLogger(Neo4jUtils.class);

    private static final int BATCH_SIZE = 1000;

    public static void cleanDb(Neo4jTemplate template) {
        logger.info("Cleaning database");
        long deletedNodesCount = 0;
        do {
            GraphDatabaseService graphDatabaseService = template.getGraphDatabaseService();
            Transaction tx = graphDatabaseService.beginTx();
            try {
                Result<Map<String, Object>> result = template.query("MATCH (n) WITH n LIMIT " + BATCH_SIZE + " OPTIONAL MATCH (n)-[r]-() DELETE n, r RETURN count(n) as count", null);
                deletedNodesCount = (long) result.single().get("count");
                tx.success();
                logger.info("Deleted " + deletedNodesCount + " nodes...");
            } catch (Throwable th) {
                logger.error("Error while deleting database", th);
                throw th;
            } finally {
                tx.close();
            }
        } while (deletedNodesCount > 0);
    }

}

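As you can see, I limit the batch size to 1000. Nevertheless, during deletion the first batch reports about 300,000 deleted entities, and every subsequent batch reports about 2,000.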

Can you tell me why I get these large numbers with BATCH_SIZE = 1000? How can I fix this method so that each batch is really limited to 1000 nodes?

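It is probably counting the same nodes more than once, because they have multiple relationships. Your query does delete 1000 nodes per batch, but OPTIONAL MATCH produces one row per (n, r) pair, so count(n) returns the number of those combinations rather than the number of nodes.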
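To make the double counting concrete, here is a minimal plain-Java sketch that does not touch Neo4j at all. It models the rows that OPTIONAL MATCH (n)-[r]-() would yield for a tiny made-up graph and compares counting rows with counting distinct nodes. The class RowCountDemo, its methods, and the sample graph are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowCountDemo {

    /** Hypothetical toy graph: node id -> ids of the relationships attached to it. */
    static Map<Long, List<Long>> sampleGraph() {
        Map<Long, List<Long>> graph = new LinkedHashMap<>();
        graph.put(1L, Arrays.asList(10L, 11L, 12L)); // node with three relationships
        graph.put(2L, Arrays.asList(10L));           // node with one relationship
        graph.put(3L, new ArrayList<Long>());        // isolated node
        return graph;
    }

    /**
     * Builds the "n" column of the result: one entry per (n, r) row.
     * A node without relationships still yields one row (r would be null).
     */
    static List<Long> nodeColumn(Map<Long, List<Long>> graph) {
        List<Long> rows = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> entry : graph.entrySet()) {
            int rowsForNode = Math.max(1, entry.getValue().size());
            for (int i = 0; i < rowsForNode; i++) {
                rows.add(entry.getKey());
            }
        }
        return rows;
    }

    /** Mirrors count(n): every row counts, so well-connected nodes count repeatedly. */
    static long countRows(Map<Long, List<Long>> graph) {
        return nodeColumn(graph).size();
    }

    /** Mirrors count(DISTINCT n): each node counts once. */
    static long countDistinctNodes(Map<Long, List<Long>> graph) {
        return new HashSet<>(nodeColumn(graph)).size();
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> graph = sampleGraph();
        System.out.println("count(n) = " + countRows(graph));                   // count(n) = 5
        System.out.println("count(DISTINCT n) = " + countDistinctNodes(graph)); // count(DISTINCT n) = 3
    }
}
```

With only three nodes the row count is already 5, so a first batch of 1000 densely connected nodes reporting a count in the hundreds of thousands is plausible.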

You can either:

Change your query to count distinct nodes:

MATCH (n) WITH n LIMIT 1000 OPTIONAL MATCH (n)-[r]-() DELETE n, r RETURN count(DISTINCT n) as count

Or print the number of nodes remaining after each deletion and check that it is 1000 lower than before:

MATCH (n) RETURN count(n) as count
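Both suggestions can be combined in the batch loop. The sketch below is only a rough outline under stated assumptions: deleteBatch stands in for running the count(DISTINCT n) query and countNodes for running the MATCH (n) RETURN count(n) query. To keep the example runnable without a database, both are backed by a hypothetical in-memory FakeStore rather than a real Neo4jTemplate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntToLongFunction;
import java.util.function.LongSupplier;

public class BatchDeleteLoop {

    /**
     * Repeats deleteBatch until it reports zero deleted nodes and checks each
     * reported count against the drop in the total node count.
     * Returns the remaining node count observed after each non-empty batch.
     */
    static List<Long> deleteAll(int batchSize, IntToLongFunction deleteBatch, LongSupplier countNodes) {
        List<Long> remainingAfterBatch = new ArrayList<>();
        long before = countNodes.getAsLong();
        long deleted;
        do {
            deleted = deleteBatch.applyAsLong(batchSize); // distinct nodes deleted in this batch
            long after = countNodes.getAsLong();
            if (deleted > batchSize || before - after != deleted) {
                throw new IllegalStateException(
                        "Reported " + deleted + " deletions, but node count went from " + before + " to " + after);
            }
            if (deleted > 0) {
                remainingAfterBatch.add(after);
            }
            before = after;
        } while (deleted > 0);
        return remainingAfterBatch;
    }

    /** Hypothetical in-memory stand-in for the database: only tracks how many nodes are left. */
    static final class FakeStore {
        private long remaining;

        FakeStore(long nodes) {
            this.remaining = nodes;
        }

        long deleteUpTo(int limit) {
            long n = Math.min(limit, remaining);
            remaining -= n;
            return n;
        }

        long remaining() {
            return remaining;
        }
    }

    /** Runs the loop against a FakeStore holding nodeCount nodes. */
    static List<Long> simulate(long nodeCount, int batchSize) {
        FakeStore store = new FakeStore(nodeCount);
        return deleteAll(batchSize, store::deleteUpTo, store::remaining);
    }

    public static void main(String[] args) {
        System.out.println(simulate(2500, 1000)); // [1500, 500, 0]
    }
}
```

The consistency check assumes nothing else writes to the database while the cleanup runs. If other writers are active, the difference in node counts will not match the reported deletions and the check should be dropped.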