H2 在第一次访问数据库时间歇性地遇到 StackOverflowError

H2 intermittently experiencing a StackOverflowError when first hitting the database

我们的 H2 数据库 (v1.4.199) 偶尔会在我们的 Java Spring 启动应用程序启动并使用非常简单的 SELECT 针对单个 table 的查询,这是应用程序的 first 查询。应用程序最终会正确启动,但只有在多次重启尝试后,每次都会抛出相同的 WhosebugError。鉴于它最终确实有效,这感觉不像是真正的无限递归问题。

这是来自 h2.trace.db 日志的堆栈跟踪:

2021-08-02 13:46:15 database: flush
org.h2.message.DbException: General error: "java.lang.IllegalStateException: java.lang.WhosebugError [1.4.199/3]" [50000-199]
    at org.h2.message.DbException.get(DbException.java:194)
    at org.h2.message.DbException.convert(DbException.java:347)
    at org.h2.mvstore.db.MVTableEngine.uncaughtException(MVTableEngine.java:90)
    at org.h2.mvstore.MVStore.handleException(MVStore.java:2787)
    at org.h2.mvstore.MVStore.panic(MVStore.java:441)
    at org.h2.mvstore.MVStore.store(MVStore.java:1194)
    at org.h2.mvstore.MVStore.tryCommit(MVStore.java:1135)
    at org.h2.mvstore.MVStore.writeInBackground(MVStore.java:2763)
    at org.h2.mvstore.MVStore$BackgroundWriterThread.run(MVStore.java:3123)
Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: java.lang.WhosebugError [1.4.199/3]" [50000-199]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:502)
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:427)
    ... 9 more
Caused by: java.lang.IllegalStateException: java.lang.WhosebugError [1.4.199/3]
    at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:883)
    ... 4 more
Caused by: java.lang.WhosebugError
    at java.util.concurrent.ConcurrentHashMap.transfer(ConcurrentHashMap.java:2415)
    at java.util.concurrent.ConcurrentHashMap.addCount(ConcurrentHashMap.java:2288)
    at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1070)
    at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1619)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    at org.h2.mvstore.MVStore$ChunkIdsCollector.registerChunk(MVStore.java:1620)
    [invoking registerChunk() continues for another 1000 lines before being cut off...]

WhosebugError 被包装为 IllegalStateException,然后在 我们的 应用程序日志中捕获为这个堆栈跟踪:

org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: Reading from nio:/home/myapp/database/h2.mv.db failed; file length -1 read length 768 at 55238818159 [1.4.199/1]" [50000-199]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:502) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:427) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.message.DbException.get(DbException.java:194) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.message.DbException.convert(DbException.java:347) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Database.openDatabase(Database.java:345) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Database.<init>(Database.java:313) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Engine.openSession(Engine.java:69) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Engine.openSession(Engine.java:201) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Engine.createSessionAndValidate(Engine.java:178) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Engine.createSession(Engine.java:161) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.Engine.createSession(Engine.java:31) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:336) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:169) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:148) ~[h2-1.4.199.jar:1.4.199]
    at org.h2.Driver.connect(Driver.java:69) ~[h2-1.4.199.jar:1.4.199]
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:136) ~[HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:369) ~[HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:198) ~[HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:467) [HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:541) [HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115) [HikariCP-3.2.0.jar:na]
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112) [HikariCP-3.2.0.jar:na]
    at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319) [commons-dbutils-1.7.jar:1.7]
    at org.apache.commons.dbutils.QueryRunner.execute(QueryRunner.java:851) [commons-dbutils-1.7.jar:1.7]
    at com.mycompany.myapp.database.objectstores.h2.AppCache.<init>(AppCache.java:46) [main/:na]
    at com.mycompany.myapp.config.ObjectStore.database(ObjectStore.java:41) [main/:na]
    at com.mycompany.myapp.config.ObjectStore$$EnhancerBySpringCGLIB$$f5cc3c27.CGLIB$database[=11=](<generated>) [main/:na]
    at com.mycompany.myapp.config.ObjectStore$$EnhancerBySpringCGLIB$$f5cc3c27$$FastClassBySpringCGLIB$399cde.invoke(<generated>) [main/:na]
    at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:244) [spring-core-5.1.6.RELEASE.jar:5.1.6.RELEASE]

我们应用程序中的 file length -1 read length 768 at 55238818159 错误格外令人担忧。

我们 运行 我们的 H2 数据库 (v1.4.199) 处于文件模式,在我们的 Java Spring 启动应用程序中具有以下配置:

    String url = "jdbc:h2:file:~/database/h2;"
            + "DB_CLOSE_ON_EXIT=FALSE;"
            + "AUTO_RECONNECT=TRUE;"
            + "DB_CLOSE_DELAY=-1;"
            + "MODE=MYSQL;";    
    
    return DataSourceBuilder.create()
        .username("sa")
        .url(url)
        .driverClassName(org.h2.Driver.class.getName()).build();

我们的数据库有 10 个 table,每个 table 就像一个 key/value 存储,其中一些 VARCHAR 列充当复合唯一键,然后 CLOB,通常包含大量 JSON 数据。它看起来像这样:

CREATE TABLE IF NOT EXISTS table_name (
  keypart1 VARCHAR,
  keypart2 VARCHAR,
  keypart3 VARCHAR,
  json_value CLOB,
  primary key (keypart1, keypart2, keypart3)
)

大多数 table 有几百或几千条较小 CLOB 值的记录,而我们最大的 table 有 20k 条 CLOB 值范围从 2mb 到到 120mb(平均为 4mb)。 h2.mv.db 文件大约 50GB。

在出现此问题之前,该应用程序已正常运行 2 年。看起来堆栈跟踪中的 registerChunk(MVStore.java:1619) 是 H2 遍历 parent/child 数据块的地方,所以我想知道它是在释放 space、修复某种早期的损坏,还是我们的大量数据CLOB 数据以某种方式将其丢弃。

有什么想法吗?提前致谢。

找到解决办法。首先,IllegalStateException 是一个转移注意力的问题,所以我将注意力集中在 WhosebugError 上。看起来 H2 v1.4.199 在 MvStore 中有一些糟糕的递归逻辑,当数据库超过特定大小或复杂性阈值时开始失败。通过使用 JVM -Xss10M 参数将堆栈大小从 256K(默认)增加到 10MB,H2 能够递归到足够深以成功完成启动逻辑。

我也看到了这个 WhosebugError 导致其他类型的 IllegaStateException 如下:

Caused by: java.lang.RuntimeException: Unable to connect to H2
    at com.myapp.migrate.H2Db.<clinit>(H2Db.java:36)
    ... 3 more
Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: This store is closed [1.4.199/4]" [50000-199]

还有我原来遇到的错误:

org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: Reading from nio:/home/myapp/database/h2.mv.db failed; file length -1 read length 768 at 55238818159 [1.4.199/1]" [50000-199]

WhosebugError 似乎是不同故障的根本原因,但 rethrown/rewrapped 是其他原因。