Concurrent mode failure even when memory is getting reclaimed
These are the VM options of my application:
-Xms7500m
-Xmx7500m
-Xmn4g
-XX:MaxPermSize=192m
-XX:TargetSurvivorRatio=80
-XX:+AggressiveOpts
-XX:+UseFastAccessorMethods
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly
-XX:ConcGCThreads=6
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCCause
-verbose:gc
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-XX:+HeapDumpOnOutOfMemoryError
There were 3 concurrent mode failures in 5 days. The application is still running fine, and memory is available.
2020-04-16T18:57:47.575-0400: 100509.755: [CMS-concurrent-abortable-preclean-start]
2020-04-16T18:57:51.701-0400: 100513.881: [GC (Allocation Failure)2020-04-16T18:57:51.702-0400: 100513.881: [ParNew (promotion failed)
Desired survivor size 343565920 bytes, new threshold 6 (max 6)
- age 1: 91240280 bytes, 91240280 total
- age 2: 52703592 bytes, 143943872 total
- age 3: 26770336 bytes, 170714208 total
- age 4: 29495504 bytes, 200209712 total
- age 5: 9595480 bytes, 209805192 total
- age 6: 24205808 bytes, 234011000 total
: 3710667K->3614781K(3774912K), 1.1252750 secs]2020-04-16T18:57:52.827-0400: 100515.007: [CMS CMS: abort preclean due to time 2020-04-16T18:57:52.894-0400: 100515.074: [CMS-concurrent-abortable-preclean: 4.174/5.319 secs] [Times: user=14.39 sys=0.29, real=5.32 secs]
(concurrent mode failure): 3227087K->1442206K(3485696K), 8.0583310 secs] 6926256K->1442206K(7260608K), [CMS Perm : 105501K->105321K(176020K)], 9.1844190 secs] [Times: user=9.41 sys=0.01, real=9.18 secs]
--
2020-04-19T18:05:09.581-0400: 356551.761: [CMS-concurrent-abortable-preclean-start]
2020-04-19T18:05:11.759-0400: 356553.939: [GC (Allocation Failure)2020-04-19T18:05:11.760-0400: 356553.939: [ParNew (promotion failed)
Desired survivor size 343565920 bytes, new threshold 6 (max 6)
- age 1: 95822816 bytes, 95822816 total
- age 2: 24589528 bytes, 120412344 total
- age 3: 28175272 bytes, 148587616 total
- age 4: 24536120 bytes, 173123736 total
- age 5: 23041104 bytes, 196164840 total
- age 6: 12194152 bytes, 208358992 total
: 3670487K->3606232K(3774912K), 0.9360540 secs]2020-04-19T18:05:12.696-0400: 356554.875: [CMS2020-04-19T18:05:12.758-0400: 356554.938: [CMS-concurrent-abortable-preclean: 2.224/3.177 secs] [Times: user=10.62 sys=0.17, real=3.18 secs]
(concurrent mode failure): 3233090K->1492098K(3485696K), 7.9204130 secs] 6896158K->1492098K(7260608K), [CMS Perm : 105666K->105467K(176212K)], 8.8569600 secs] [Times: user=9.08 sys=0.01, real=8.86 secs]
--
2020-04-22T19:07:04.975-0400: 619467.155: Total time for which application threads were stopped: 0.0047280 seconds
2020-04-22T19:07:07.174-0400: 619469.354: [GC (Allocation Failure)2020-04-22T19:07:07.174-0400: 619469.354: [ParNew (promotion failed)
Desired survivor size 343565920 bytes, new threshold 6 (max 6)
- age 1: 98089096 bytes, 98089096 total
- age 2: 31239384 bytes, 129328480 total
- age 3: 29372368 bytes, 158700848 total
- age 4: 27791800 bytes, 186492648 total
- age 5: 19365904 bytes, 205858552 total
- age 6: 35928016 bytes, 241786568 total
: 3643909K->3678567K(3774912K), 0.9460110 secs]2020-04-22T19:07:08.121-0400: 619470.300: [CMS2020-04-22T19:07:08.234-0400: 619470.413: [CMS-concurrent-abortable-preclean: 2.612/3.582 secs] [Times: user=12.78 sys=0.18, real=3.58 secs]
(concurrent mode failure): 3230258K->1503933K(3485696K), 8.8236640 secs] 6862317K->1503933K(7260608K), [CMS Perm : 105907K->105647K(176596K)], 9.7702040 secs] [Times: user=10.06 sys=0.00, real=9.77 secs]
The application has had no Full GC in the past week; the most recent one in the log is:
2020-04-15T15:03:29.193-0400: 51.372: [Full GC (Permanent Generation Full)2020-04-15T15:03:29.193-0400: 51.373: [CMS: 771051K->1044286K(3485696K), 4.2744340 secs] 1915012K->1044286K(7260608K), [CMS Perm : 101698K->101581K(102128K)], 4.2749400 secs] [Times: user=4.18 sys=0.11, real=4.28 secs]
Why do concurrent mode failures occur when
- there is no frequent Full GC
- memory is being reclaimed after GC
3227087K->1442206K(3485696K), 8.0583310 secs] 6926256K->1442206K(7260608K)
3233090K->1492098K(3485696K), 7.9204130 secs] 6896158K->1492098K(7260608K)
3230258K->1503933K(3485696K), 8.8236640 secs] 6862317K->1503933K(7260608K)
Node configuration: 8-core CPU, 12 GB RAM, RHEL VM, JDK 1.7.0_45
From chapter 8, Concurrent Mark Sweep (CMS) Collector, of the Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide:
Concurrent Mode Failure
The CMS collector uses one or more garbage collector threads that run simultaneously with the application threads with the goal of completing the collection of the tenured generation before it becomes full. As described previously, in normal operation, the CMS collector does most of its tracing and sweeping work with the application threads still running, so only brief pauses are seen by the application threads. However, if the CMS collector is unable to finish reclaiming the unreachable objects before the tenured generation fills up, or if an allocation cannot be satisfied with the available free space blocks in the tenured generation, then the application is paused and the collection is completed with all the application threads stopped. The inability to complete a collection concurrently is referred to as concurrent mode failure and indicates the need to adjust the CMS collector parameters. If a concurrent collection is interrupted by an explicit garbage collection (System.gc()) or for a garbage collection needed to provide information for diagnostic tools, then a concurrent mode interruption is reported.
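This means that, at that moment, your application was generating garbage faster than the collector threads could reclaim it, and/or it was using so much CPU that the collector threads did not get enough time to finish the job.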
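To see how full the tenured generation was when each failure hit, the three log excerpts from the question can be turned into percentages. The following is only a rough sketch: the class name CmsOccupancy and its helper methods are made up for illustration, and it assumes the first "before->after(capacity)" triple on each line is the CMS (old generation) figure, as it is in the excerpts above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any JDK tool: reads the occupancy
// figures that -XX:+PrintGCDetails prints and converts them to percentages.
public class CmsOccupancy {
    // Matches a "beforeK->afterK(capacityK)" triple.
    private static final Pattern TRIPLE =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {before, after, capacity} in KB for the first triple in the line. */
    static long[] firstTriple(String line) {
        Matcher m = TRIPLE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no occupancy triple in: " + line);
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Occupancy as a percentage of capacity, both given in KB. */
    static double percent(long usedKb, long capacityKb) {
        return 100.0 * usedKb / capacityKb;
    }

    public static void main(String[] args) {
        // Old-generation excerpts copied from the question.
        String[] excerpts = {
            "3227087K->1442206K(3485696K), 8.0583310 secs] 6926256K->1442206K(7260608K)",
            "3233090K->1492098K(3485696K), 7.9204130 secs] 6896158K->1492098K(7260608K)",
            "3230258K->1503933K(3485696K), 8.8236640 secs] 6862317K->1503933K(7260608K)"
        };
        for (String line : excerpts) {
            long[] t = firstTriple(line);
            System.out.printf("old gen before: %.1f%%, after: %.1f%%%n",
                    percent(t[0], t[2]), percent(t[1], t[2]));
        }
    }
}
```

In all three cases the old generation was between 92% and 93% full when the young collection failed to promote, and a little over 40% full afterwards. So the memory does get reclaimed, but only after the stop-the-world fallback. The posted options enable -XX:+UseCMSInitiatingOccupancyOnly without setting -XX:CMSInitiatingOccupancyFraction, and the same guide gives the default initiating occupancy as approximately 92% (subject to change between releases), so the concurrent cycle may simply be starting too late to finish in time. Setting the fraction explicitly to a lower value is a commonly tried adjustment; which value suits this workload is something the logs alone do not settle.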