EJB JPA CMT - Flush failure on large dataset

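I have a JBoss EAP 6.3, JPA 2.0, EJB 3.1, CMT/JTA web application. The database is MS SQL Server 2008 R2, accessed through the MS JDBC driver, with Hibernate 4.2.14 underneath.

I have a method that looks roughly like this, which copies a million price entities: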

public void doStuff(Date newDate)
{
    List<Prices> prices = dao.getPrices(); //<< 1000000+ prices
    for (Prices price : prices)
    {
        Prices copy = price.clone();
        copy.setDate(newDate);
        entityManager.persist(copy);
        if (newDate.before(someDate))
        {
            price.setDate(someDate);
            entityManager.merge(price);
        }
    }
}

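I set the JBoss EJB coordinator timeout to one hour and let it run. After the first run ran out of memory, I increased the heap size to -Xmx3G.

The code started at 1:24am and finished at 1:36am; then, at 2:24am, it failed with a transaction error and rolled back. The stack trace says it happened during a flush.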

 at org.hibernate.ejb.AbstractEntityManagerImpl$CallbackExceptionMapperImpl.mapManagedFlushFailure(AbstractEntityManagerImpl.java:1510) [hibernate-entitymanager-4.2.14.SP1-redhat-1.jar:4.2.14.SP1-redhat-1]

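I can see that if I split the million into chunks of 10,000 and flush after each chunk, it does not get anywhere near a million within the hour. So flushing is clearly an expensive operation. But I guess it starts an implicit flush during the JTA post-interceptor commit.

Should I increase the timeout and try again? This is a DEV database used by several other people, and my code seems to lock the prices table so that it cannot be queried from SQL Server Management Studio (SSMS), so I don't want to leave it running indefinitely. But is it just a matter of needing more time?

Start of the stack trace: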
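For reference, the chunk-and-flush idea can be sketched independently of JPA as below. This is only an illustration: `ChunkedWriter`, `writeInChunks`, and the two callbacks are hypothetical names, not part of the application above. In the EJB, the `persist` callback would wrap `entityManager.persist(...)` and the `flushAndClear` callback would call `entityManager.flush()` followed by `entityManager.clear()`. The `clear()` call is an assumption added here. Without it the persistence context keeps growing, and each flush has to dirty-check every entity it still manages.

```java
import java.util.function.Consumer;

// Hypothetical helper: generic chunking logic with the JPA calls left to the caller.
final class ChunkedWriter {

    /**
     * Passes every item to persist, runs flushAndClear after each full chunk
     * and once more for a trailing partial chunk.
     * Returns the number of times flushAndClear was run.
     */
    static <T> int writeInChunks(Iterable<T> items, int chunkSize,
                                 Consumer<T> persist, Runnable flushAndClear) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int pending = 0;  // items persisted since the last flush
        int flushes = 0;
        for (T item : items) {
            persist.accept(item);
            pending++;
            if (pending == chunkSize) {
                flushAndClear.run();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {  // trailing partial chunk
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}
```

Note that `entityManager.clear()` detaches everything, including the original Prices loaded by the DAO, so any change to those would have to be merged back explicitly. All chunks also still run inside the single CMT transaction, so the coordinator timeout and the table locks apply exactly as before.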

02:24:45,157 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state  RUN
02:24:45,169 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012095: Abort of action id 0:ffff0a14021f:3d218bb8:56009132:22 invoked while multiple threads active within it.
02:24:45,169 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012108: CheckedAction::check - atomic action 0:ffff0a14021f:3d218bb8:56009132:22 aborting with 1 threads active!
02:24:45,667 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state  CANCEL
02:24:46,209 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state  CANCEL_INTERRUPTED
02:24:46,210 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012120: TransactionReaper::check worker Thread[Transaction Reaper Worker 0,5,main] not responding to interrupt when cancelling TX 0:ffff0a14021f:3d218bb8:56009132:22 -- worker marked as zombie and TX scheduled for mark-as-rollback
02:24:46,210 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012110: TransactionReaper::check successfuly marked TX 0:ffff0a14021f:3d218bb8:56009132:22 as rollback only
02:25:07,968 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:8080-1) SQL Error: 0, SQLState: null
02:25:07,968 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:8080-1) Transaction cannot proceed STATUS_ROLLEDBACK
02:25:08,085 WARN  [com.arjuna.ats.arjuna] (http-/0.0.0.0:8080-1) ARJUNA012125: TwoPhaseCoordinator.beforeCompletion - failed for SynchronizationImple< 0:ffff0a14021f:3d218bb8:56009132:24, org.hibernate.engine.transaction.synchronization.internal.RegisteredSynchronization@2d633a18 >: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not prepare statement

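OK, I rewrote it in SQL, using two entityManager.createNativeQuery calls instead of programmatic JPA, and it completed in 30 seconds or so.

So the lesson is: don't bother with JPA for large datasets. Work out the solution in SQL, then grab a direct JDBC connection to run it.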
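The two statements themselves are not shown above. A rough sketch of what such a set-based rewrite could look like over plain JDBC follows. The table name `prices` and the columns `item_id`, `amount`, and `price_date` are assumptions made for illustration, as are the class and method names. The real schema will differ.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.Date;

// Hypothetical sketch: table and column names are assumed, not taken from the post.
final class PriceCopySketch {

    // Moves the existing rows to someDate (only run when newDate is before someDate).
    static final String UPDATE_ORIGINALS =
        "UPDATE prices SET price_date = ?";

    // Copies every existing row, stamping the copy with newDate.
    static final String INSERT_COPIES =
        "INSERT INTO prices (item_id, amount, price_date) "
      + "SELECT item_id, amount, ? FROM prices";

    /** Returns the number of rows inserted as copies. */
    static int copyPrices(Connection conn, Date newDate, Date someDate) throws SQLException {
        // Update first, so that the copies inserted below are not touched by it.
        if (newDate.before(someDate)) {
            try (PreparedStatement update = conn.prepareStatement(UPDATE_ORIGINALS)) {
                update.setTimestamp(1, new Timestamp(someDate.getTime()));
                update.executeUpdate();
            }
        }
        try (PreparedStatement insert = conn.prepareStatement(INSERT_COPIES)) {
            insert.setTimestamp(1, new Timestamp(newDate.getTime()));
            return insert.executeUpdate();
        }
    }
}
```

Both statements run entirely inside the database, so no entity is loaded into the heap and there is no persistence context to flush. The same SQL strings could equally be passed to entityManager.createNativeQuery(...) with setParameter(1, ...) and executeUpdate(), which keeps the work inside the container-managed transaction.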