使用 G1GC 时很少有长时间的 GC 暂停

infrequent long GC pauses with G1GC

我在使用 G1GC 算法时偶尔会遇到长时间的 GC 暂停 - 一个月超过 30 秒。一旦发生,我将重新启动我的服务,并且在接下来的 1 个月内不会再发生这种延迟。

我正在附加 GC 日志。

2016-06-15T02:30:17.407-0400: 1702455.276: Total time for which application threads were stopped: 0.2900480 seconds
2016-06-15T02:30:25.509-0400: 1702463.378: Total time for which application threads were stopped: 0.0018900 seconds
2016-06-15T02:30:29.817-0400: 1702467.686: [GC pause (young)
Desired survivor size 7340032 bytes, new threshold 15 (max 15)
- age   1:    1903912 bytes,    1903912 total
- age   2:     641440 bytes,    2545352 total
- age   3:     455856 bytes,    3001208 total
- age   4:     563544 bytes,    3564752 total
- age   5:    1873368 bytes,    5438120 total
- age   6:     326024 bytes,    5764144 total
- age   7:     299144 bytes,    6063288 total
- age   8:     304632 bytes,    6367920 total
- age   9:     273160 bytes,    6641080 total
- age  10:     309152 bytes,    6950232 total
- age  11:     156824 bytes,    7107056 total
- age  12:     135064 bytes,    7242120 total
, 0.1199490 secs]
   [Parallel Time: 118.2 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 1702467686.7, Avg: 1702467686.8, Max: 1702467686.9, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 18.4, Avg: 20.2, Max: 24.6, Diff: 6.2, Sum: 80.7]
      [Update RS (ms): Min: 75.5, Avg: 79.5, Max: 80.8, Diff: 5.3, Sum: 317.8]
         [Processed Buffers: Min: 52, Avg: 61.0, Max: 70, Diff: 18, Sum: 244]
      [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.3]
      [Object Copy (ms): Min: 17.7, Avg: 18.1, Max: 18.6, Diff: 0.8, Sum: 72.3]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.4]
      [GC Worker Total (ms): Min: 117.8, Avg: 117.9, Max: 118.0, Diff: 0.2, Sum: 471.6]
      [GC Worker End (ms): Min: 1702467804.7, Avg: 1702467804.7, Max: 1702467804.7, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 1.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.3 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.4 ms]
   [Eden: 100.0M(100.0M)->0.0B(100.0M) Survivors: 9216.0K->9216.0K Heap: 2164.6M(2198.0M)->2065.5M(2198.0M)]
 [Times: user=0.48 sys=0.00, real=37.62 secs]
2016-06-15T02:31:07.438-0400: 1702505.307: Total time for which application threads were stopped: 37.6223210 seconds

2016-06-15T02:31:07.443-0400: 1702505.312: Total time for which application threads were stopped: 0.0034690 seconds
2016-06-15T02:31:07.446-0400: 1702505.315: Total time for which application threads were stopped: 0.0018140 seconds
2016-06-15T02:31:07.451-0400: 1702505.320: Total time for which application threads were stopped: 0.0020130 seconds
2016-06-15T02:31:07.453-0400: 1702505.322: Total time for which application threads were stopped: 0.0015850 seconds

我正在使用以下 G1GC 标志:

-server -d64 -verbose:gc -Xms2g -Xmx4g -XX:+UseG1GC 
-XX:MaxGCPauseMillis=1500 -XX:G1HeapRegionSize=2 
-XX:+PrintFlagsFinal -XX:ParallelGCThreads=4 -XX:ConcGCThreads=2 
-XX:PermSize=64m -XX:MaxPermSize=256m -XX:+DisableExplicitGC 
-XX:+PrintGCDateStamps -XX:+DoEscapeAnalysis -XX:+PrintTenuringDistribution 
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails 
-XX:+PrintGCApplicationStoppedTime -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=$PP_HOME -XX:+UseCompressedOops -XX:+AggressiveOpts"

机器配置:

Red Hat Enterprise Linux Server release 5.11 
Dual core CPU
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

如何从以上语句中找出long GC的根本原因?

更新到 Java 8 的最新版本 - 你应该 将 G1 与 Java 7 一起使用,因为它当时还没有准备好生产.您遇到一个现已修补的错误的可能性很大。是的,有 种方法可以将 OpenJDK 8 安装到 RHEL 5 机器上。是的,您的应用程序将大大提高速度。

另外,这个:

-XX:MaxGCPauseMillis=1500

还有这个:

-Xms2g -Xmx4g

建议您 运行 不在 G1 设计的那种大堆、低延迟环境中。对于这么旧的 JRE,您最好使用默认值 (ParallelGC) 并且 可能 调整一些选项以减少暂停时间(里程可能会有所不同)。或者CMS,那个时候比较成熟,调优可以很稳定

看在上帝的份上,如果您的应用足够重要以至于每个月 30 秒的 GC 暂停是一个问题,那么您可能可以负担得起 RHEL 7 和双核以上的系统。