Apache Ignite cache freezes after a period of time
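I have a cluster with 6 server nodes and 1 client node. My client node runs a large number of update and create jobs, and there is also an expiry policy to remove expired entries.
However, the cluster hangs at least once a day. Even ignitevisor's cache command freezes when invoked.
So I looked at a thread dump and noticed something odd: there are many entries like this: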
"pub-#39%null%" #51 prio=5 os_prio=0 tid=0x00007f9788623800 nid=0x1d02 waiting on condition [0x00007f9769ddc000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c004aaa8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
So many threads are waiting on a condition that, for some reason, is never signaled.
My cache configuration is as follows:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<!--
Alter configuration below as needed.
-->
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- Configure internal thread pool. -->
<property name="publicThreadPoolSize" value="64"/>
<!-- Configure system thread pool. -->
<property name="systemThreadPoolSize" value="32"/>
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
<property name="addresses">
<list>
...
</list>
</property>
</bean>
</property>
</bean>
</property>
<property name="cacheConfiguration">
<list>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="asd1"/>
<property name="eagerTtl" value="true"/>
<property name="expiryPolicyFactory">
<bean class="javax.cache.configuration.FactoryBuilder.SingletonFactory">
<constructor-arg name="instance">
<bean class="javax.cache.expiry.TouchedExpiryPolicy">
<constructor-arg name="expiryDuration">
<bean class="javax.cache.expiry.Duration">
<constructor-arg name="timeUnit">
<value type="java.util.concurrent.TimeUnit">MILLISECONDS</value>
</constructor-arg>
<constructor-arg name="durationAmount" value="10800000"/>
</bean>
</constructor-arg>
</bean>
</constructor-arg>
</bean>
</property>
</bean>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="asd2"/>
<property name="eagerTtl" value="true"/>
<property name="expiryPolicyFactory">
<bean class="javax.cache.configuration.FactoryBuilder.SingletonFactory">
<constructor-arg name="instance">
<bean class="javax.cache.expiry.TouchedExpiryPolicy">
<constructor-arg name="expiryDuration">
<bean class="javax.cache.expiry.Duration">
<constructor-arg name="timeUnit">
<value type="java.util.concurrent.TimeUnit">MILLISECONDS</value>
</constructor-arg>
<constructor-arg name="durationAmount" value="86400000"/>
</bean>
</constructor-arg>
</bean>
</constructor-arg>
</bean>
</property>
</bean>
</list>
</property>
<property name="includeEventTypes" value="70"/>
</bean>
</beans>
I would really appreciate any help. Thanks.
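This was discussed on the Apache Ignite forum: http://apache-ignite-users.70518.x6.nabble.com/Apache-Ignite-cluster-freeze-after-a-period-of-time-td7726.html
The deadlock is most likely caused by keys being reordered in putAll operations: when concurrent putAll calls touch the same keys in different orders, each call can end up waiting for an entry the other has locked.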
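One common way to avoid this kind of ordering problem is to pass putAll a map with a deterministic key order, for example a TreeMap. The sketch below is only an illustration under assumptions, not the fix confirmed in the forum thread: the class name OrderedPutAll and the helpers sortedBatch and putAllOrdered are hypothetical, the keys are assumed to be Comparable, and a plain Consumer stands in for the real cache so the example runs without Ignite on the classpath. With an actual IgniteCache<K, V> named cache, the sink would be cache::putAll.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

class OrderedPutAll {
    /** Copies the batch into a TreeMap so its keys iterate in natural (sorted) order. */
    public static <K extends Comparable<K>, V> TreeMap<K, V> sortedBatch(Map<K, V> batch) {
        return new TreeMap<>(batch);
    }

    /**
     * Sorts the batch and hands it to the sink.
     * Assumption: in a real application the sink would be cache::putAll
     * on an IgniteCache<K, V>; here it is any consumer of the sorted map.
     */
    public static <K extends Comparable<K>, V> void putAllOrdered(
            Map<K, V> batch, Consumer<Map<K, V>> sink) {
        sink.accept(sortedBatch(batch));
    }

    public static void main(String[] args) {
        // HashMap iteration order is unspecified, so two writers could
        // otherwise present the same keys in different orders.
        Map<String, Integer> updates = new HashMap<>();
        updates.put("user:3", 3);
        updates.put("user:1", 1);
        updates.put("user:2", 2);

        // Stand-in sink: print the key order that putAll would see.
        putAllOrdered(updates, sorted -> System.out.println(sorted.keySet()));
        // prints [user:1, user:2, user:3]
    }
}
```

If every writer builds its batches this way, all putAll calls present overlapping keys in the same order, which removes the circular wait described above. Whether this resolves the freeze in this particular cluster would still need to be checked against fresh thread dumps.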