Tomcat java 应用程序线程饥饿 - 所有线程都在等待 c3p0 BasicResourcePool.awaitAvailable
Thread starvation on Tomcat java application - all threads in WAITING for c3p0 BasicResourcePool.awaitAvailable
我们的 java Web 应用程序当前遇到随机(并且以前从未遇到过)阻塞问题,这是由创建的线程突然激增以及随后耗尽 Tomcat 的最大线程数引起的可以创建(默认 = 200)。
最后一个峰值在 10 分钟内创建了 120 个新线程,使计数从 80 增加到 200,并阻止了应用程序。通常的线程数是 70-80。
进行线程转储时,所有这些线程都处于 WAITING
状态,似乎所有线程都在等待从 c3p0 到 DBMS 的新连接。
这是一个示例堆栈:
"http-nio-8443-exec-77 Waiting Thread ID: 10016","1"
"java.lang.Object.wait(long)","2"
"com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(long) BasicResourcePool.java:1414","2"
"com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(long) BasicResourcePool.java:606","2"
"com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(long) BasicResourcePool.java:526","2"
"com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse() C3P0PooledConnectionPool.java:755","2"
"com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection() C3P0PooledConnectionPool.java:682","2"
"com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection() AbstractPoolBackedDataSource.java:140","2"
"org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection() AbstractRoutingDataSource.java:164","2"
"org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection() DatasourceConnectionProviderImpl.java:139","2"
"org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection() AbstractSessionImpl.java:380","2"
"org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection() LogicalConnectionImpl.java:228","2"
"org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection() LogicalConnectionImpl.java:171","2"
"org.hibernate.internal.SessionImpl.connection() SessionImpl.java:450","2"
"org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(Object, TransactionDefinition) HibernateTransactionManager.java:450","2"
"org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(TransactionDefinition) AbstractPlatformTransactionManager.java:373","2"
"org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(PlatformTransactionManager, TransactionAttribute, String) TransactionAspectSupport.java:463","2"
"org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(Method, Class, TransactionAspectSupport$InvocationCallback) TransactionAspectSupport.java:276","2"
"org.springframework.transaction.interceptor.TransactionInterceptor.invoke(MethodInvocation) TransactionInterceptor.java:96","2"
"org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179","2"
"org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(MethodInvocation) ExposeInvocationInterceptor.java:92","2"
"org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179","2"
"org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(Object, Method, Object[], MethodProxy) CglibAopProxy.java:653","2"
这是我们的 c3p0 配置 DataSource
:
<property name="maxIdleTime" value="600" />
<property name="maxPoolSize" value="300" />
<property name="maxStatements" value="200" />
<property name="maxStatementsPerConnection" value="10" />
<property name="minPoolSize" value="15" />
<property name="idleConnectionTestPeriod" value="100" />
<property name="acquireIncrement" value="3" />
<property name="numHelperThreads" value="20" />
我目前正在尝试添加一个超时参数来帮助我调试这些情况
<property name="checkoutTimeout" value="30" />
我读到我应该使用 debugUnreturnedConnectionStackTraces
和 unreturnedConnectionTimeout
来调试更好的这类问题,但我目前对它们不是很了解并且不敢在生产中使用它们。
是否有任何其他类型的配置可以帮助我调试问题?其他一些超时配置?
堆栈:
- 数据库管理系统:Sql 服务器 2014
- c3p0: 0.9.2.1
- 休眠:4.3.5
- Spring: 4.1.6
编辑(检查超时测试)
我试图在有问题的数据库上输入参数 checkoutTimeout:
<property name="checkoutTimeout" value="30" />
这 属性 应该强制所有正在等待从 c3p0 签出连接的线程在 30 秒后放弃,以防池耗尽,并抛出 Sql异常。
将此 属性 投入生产后,我们开始记录此类错误:
Caused by: org.hibernate.exception.GenericJDBCException: Could not open connection
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:235)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:171)
at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:450)
at org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:450)
... 119 more
Caused by: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:118)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:77)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:687)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:164)
at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:139)
at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:380)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:228)
... 122 more
Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@14e83e8f -- timeout at awaitAvailable()
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1416)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:606)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:526)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:755)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:682)
... 127 more
所以我想这证明了 out pool 已经耗尽(maxPoolSize 是 300)。对吗?
编辑 2:问题已解决,请查看我的答案
您可以要求您的 Sql 服务器 DBA 提供一些统计数据,包括在特定时间段内绑定到您的应用程序的会话数、执行所花费的时间 SQL 等。
我们能够通过 JMX 的帮助和 c3p0 数据源状态的监控解决问题。
使用 JVisualVM(或者如果您没有 Oracle JDK,您也可以使用 JConsole)我们查看了数据源的状态,显示如下:
我们的池大小是 300,这意味着 c3p0 在任何给定时间最多可以有 300 个连接可用,并且已经有 95 个处于忙碌状态,而且这个数字还在上升。
经过反复试验,我们知道这个问题只发生在特定客户身上,我们开始测试该客户使用的新功能(问题最近才出现),我们发现了一个特定的用例与我们一起使用重现问题,不断泄漏连接。
在代码库中,问题是这个用例打开了一个会话,但从未注意关闭它。
事情是这样的:
this.sessionFactory.withOptions().noInterceptor().openSession();
我们添加了会话的关闭,问题已解决:
session.flush();
session.close();
我们的 java Web 应用程序当前遇到随机(并且以前从未遇到过)阻塞问题,这是由创建的线程突然激增以及随后耗尽 Tomcat 的最大线程数引起的可以创建(默认 = 200)。
最后一个峰值在 10 分钟内创建了 120 个新线程,使计数从 80 增加到 200,并阻止了应用程序。通常的线程数是 70-80。
进行线程转储时,所有这些线程都处于 WAITING
状态,似乎所有线程都在等待从 c3p0 到 DBMS 的新连接。
这是一个示例堆栈:
"http-nio-8443-exec-77 Waiting Thread ID: 10016","1"
"java.lang.Object.wait(long)","2"
"com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(long) BasicResourcePool.java:1414","2"
"com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(long) BasicResourcePool.java:606","2"
"com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(long) BasicResourcePool.java:526","2"
"com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse() C3P0PooledConnectionPool.java:755","2"
"com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection() C3P0PooledConnectionPool.java:682","2"
"com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection() AbstractPoolBackedDataSource.java:140","2"
"org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection() AbstractRoutingDataSource.java:164","2"
"org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection() DatasourceConnectionProviderImpl.java:139","2"
"org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection() AbstractSessionImpl.java:380","2"
"org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection() LogicalConnectionImpl.java:228","2"
"org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection() LogicalConnectionImpl.java:171","2"
"org.hibernate.internal.SessionImpl.connection() SessionImpl.java:450","2"
"org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(Object, TransactionDefinition) HibernateTransactionManager.java:450","2"
"org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(TransactionDefinition) AbstractPlatformTransactionManager.java:373","2"
"org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(PlatformTransactionManager, TransactionAttribute, String) TransactionAspectSupport.java:463","2"
"org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(Method, Class, TransactionAspectSupport$InvocationCallback) TransactionAspectSupport.java:276","2"
"org.springframework.transaction.interceptor.TransactionInterceptor.invoke(MethodInvocation) TransactionInterceptor.java:96","2"
"org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179","2"
"org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(MethodInvocation) ExposeInvocationInterceptor.java:92","2"
"org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179","2"
"org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(Object, Method, Object[], MethodProxy) CglibAopProxy.java:653","2"
这是我们的 c3p0 配置 DataSource
:
<property name="maxIdleTime" value="600" />
<property name="maxPoolSize" value="300" />
<property name="maxStatements" value="200" />
<property name="maxStatementsPerConnection" value="10" />
<property name="minPoolSize" value="15" />
<property name="idleConnectionTestPeriod" value="100" />
<property name="acquireIncrement" value="3" />
<property name="numHelperThreads" value="20" />
我目前正在尝试添加一个超时参数来帮助我调试这些情况
<property name="checkoutTimeout" value="30" />
我读到我应该使用 debugUnreturnedConnectionStackTraces
和 unreturnedConnectionTimeout
来调试更好的这类问题,但我目前对它们不是很了解并且不敢在生产中使用它们。
是否有任何其他类型的配置可以帮助我调试问题?其他一些超时配置?
堆栈:
- 数据库管理系统:Sql 服务器 2014
- c3p0: 0.9.2.1
- 休眠:4.3.5
- Spring: 4.1.6
编辑(检查超时测试)
我试图在有问题的数据库上输入参数 checkoutTimeout:
<property name="checkoutTimeout" value="30" />
这 属性 应该强制所有正在等待从 c3p0 签出连接的线程在 30 秒后放弃,以防池耗尽,并抛出 Sql异常。
将此 属性 投入生产后,我们开始记录此类错误:
Caused by: org.hibernate.exception.GenericJDBCException: Could not open connection
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:235)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:171)
at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:450)
at org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:450)
... 119 more
Caused by: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:118)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:77)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:687)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:164)
at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:139)
at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:380)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:228)
... 122 more
Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@14e83e8f -- timeout at awaitAvailable()
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1416)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:606)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:526)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:755)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:682)
... 127 more
所以我想这证明了 out pool 已经耗尽(maxPoolSize 是 300)。对吗?
编辑 2:问题已解决,请查看我的答案
您可以要求您的 Sql 服务器 DBA 提供一些统计数据,包括在特定时间段内绑定到您的应用程序的会话数、执行所花费的时间 SQL 等。
我们能够通过 JMX 的帮助和 c3p0 数据源状态的监控解决问题。
使用 JVisualVM(或者如果您没有 Oracle JDK,您也可以使用 JConsole)我们查看了数据源的状态,显示如下:
我们的池大小是 300,这意味着 c3p0 在任何给定时间最多可以有 300 个连接可用,并且已经有 95 个处于忙碌状态,而且这个数字还在上升。
经过反复试验,我们知道这个问题只发生在特定客户身上,我们开始测试该客户使用的新功能(问题最近才出现),我们发现了一个特定的用例与我们一起使用重现问题,不断泄漏连接。
在代码库中,问题是这个用例打开了一个会话,但从未注意关闭它。
事情是这样的:
this.sessionFactory.withOptions().noInterceptor().openSession();
我们添加了会话的关闭,问题已解决:
session.flush();
session.close();