G1 GC: a single, very long young GC occurred with ParallelGCThreads=1
I set ParallelGCThreads=1 and use the G1 GC; all other JVM settings are defaults. I am running PageRank on Spark-1.5.1 on two EC2 nodes, each with a 100 GB heap.
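My heap usage graph is below (red area: young generation, black area: old generation). All young GCs are small, then suddenly there is a single young GC that collects 60 GB, after which the young GCs become small again. My GC log shows no mixed GCs, no full GCs, 1 concurrent marking cycle, and dozens of young GCs. Why did such a huge young GC happen?
Part of my GC log is below. The huge young GC is the one with "Heap: 84.1G".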
2015-12-30T06:59:02.488+0000: 245.088: [GC pause (young) 245.089: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 1727, predicted base time: 24.64 ms, remaining time: 175.36 ms, target pause time: 200.00 ms]
245.089: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 206 regions, survivors: 3 regions, predicted young region time: 148.87 ms]
245.089: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 206 regions, survivors: 3 regions, old: 0 regions, predicted pause time: 173.51 ms, target pause time: 200.00 ms]
2015-12-30T06:59:02.531+0000: 245.131: [SoftReference, 0 refs, 0.0000520 secs]2015-12-30T06:59:02.531+0000: 245.131: [WeakReference, 21 refs, 0.0000160 secs]2015-12-30T06:59:02.531+0000: 245.131: [FinalReference, 9759 refs, 0.0084720 secs]2015-12-30T06:59:02.539+0000: 245.140: [PhantomReference, 0 refs, 14 refs, 0.0000190 secs]2015-12-30T06:59:02.539+0000: 245.140: [JNI Weak Reference, 0.0000130 secs] 245.142: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 12.51 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 0.0534140 secs]
[Parallel Time: 42.3 ms, GC Workers: 1]
[GC Worker Start (ms): 245088.6]
[Ext Root Scanning (ms): 14.4]
[Update RS (ms): 1.9]
[Processed Buffers: 34]
[Scan RS (ms): 0.4]
[Code Root Scanning (ms): 0.0]
[Object Copy (ms): 25.5]
[Termination (ms): 0.0]
[GC Worker Other (ms): 0.0]
[GC Worker Total (ms): 42.3]
[GC Worker End (ms): 245130.9]
[Code Root Fixup: 0.0 ms]
[Code Root Migration: 0.0 ms]
[Clear CT: 1.6 ms]
[Other: 9.5 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 8.6 ms]
[Ref Enq: 0.2 ms]
[Free CSet: 0.4 ms]
[Eden: 6592.0M(6592.0M)->0.0B(58.8G) Survivors: 96.0M->128.0M Heap: 30.6G(100.0G)->24.2G(100.0G)]
[Times: user=0.05 sys=0.00, real=0.06 secs]
2015-12-30T06:59:43.451+0000: 286.051: [GC pause (young) 286.054: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 392599, predicted base time: 367.03 ms, remaining time: 0.00 ms, target pause time: 200.00 ms]
286.054: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1884 regions, survivors: 4 regions, predicted young region time: 150.18 ms]
286.054: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1884 regions, survivors: 4 regions, old: 0 regions, predicted pause time: 517.21 ms, target pause time: 200.00 ms]
2015-12-30T06:59:47.767+0000: 290.368: [SoftReference, 0 refs, 0.0000570 secs]2015-12-30T06:59:47.768+0000: 290.368: [WeakReference, 350 refs, 0.0000640 secs]2015-12-30T06:59:47.768+0000: 290.368: [FinalReference, 99336 refs, 0.3781120 secs]2015-12-30T06:59:48.146+0000: 290.746: [PhantomReference, 0 refs, 1 refs, 0.0000290 secs]2015-12-30T06:59:48.146+0000: 290.746: [JNI Weak Reference, 0.0000140 secs] 290.767: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.74 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 4.7153740 secs]
[Parallel Time: 4313.9 ms, GC Workers: 1]
[GC Worker Start (ms): 286053.9]
[Ext Root Scanning (ms): 15.2]
[Update RS (ms): 86.3]
[Processed Buffers: 1557]
[Scan RS (ms): 4.1]
[Code Root Scanning (ms): 0.2]
[Object Copy (ms): 4208.1]
[Termination (ms): 0.0]
[GC Worker Other (ms): 0.0]
[GC Worker Total (ms): 4313.9]
[GC Worker End (ms): 290367.8]
[Code Root Fixup: 0.0 ms]
[Code Root Migration: 0.3 ms]
[Clear CT: 15.1 ms]
[Other: 386.0 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 378.4 ms]
[Ref Enq: 1.7 ms]
[Free CSet: 3.3 ms]
[Eden: 58.9G(58.8G)->0.0B(3456.0M) Survivors: 128.0M->1664.0M Heap: 84.1G(100.0G)->26.7G(100.0G)]
[Times: user=0.78 sys=3.94, real=4.71 secs]
attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.74 %, threshold: 10.00 %
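My guess is that this is what drives G1's decisions. You can relax it by setting -XX:GCTimeRatio=4, which would allow G1 to spend up to 20% of CPU cycles on GC, relative to application time, instead of 10%.
If that is too much, you should either:
- allow it to use more CPU cores. This makes it easier for G1 to meet its pause time goal, which in turn means it can defer collections for longer, making the throughput goal easier to meet. Yes, this does mean that using more cores can actually use fewer CPU cycles overall.
- relax the pause time goal, so that it does not have to collect as often.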
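
To make the 10% versus 20% figures concrete, here is a minimal sketch of the arithmetic, assuming the usual HotSpot relation that the GC share of time is 1 / (1 + GCTimeRatio). The class and method names are hypothetical, and the value 9 is only inferred from the "threshold: 10.00 %" entries in the log above, not read from the JVM.

```java
// Illustrative only: GcTimeRatio and gcOverheadThresholdPercent are
// hypothetical names, not part of any JVM or Spark API.
class GcTimeRatio {

    // Percentage of time the collector may spend in GC for a given
    // -XX:GCTimeRatio value, using the relation 1 / (1 + GCTimeRatio).
    static double gcOverheadThresholdPercent(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("GCTimeRatio must be >= 0");
        }
        return 100.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // Assumption: 9 is the effective value in the question's JVM,
        // because the log reports "threshold: 10.00 %".
        System.out.println("GCTimeRatio=9 -> "
                + gcOverheadThresholdPercent(9) + " %");
        // The value suggested in the answer.
        System.out.println("GCTimeRatio=4 -> "
                + gcOverheadThresholdPercent(4) + " %");
    }
}
```

The two bullets map onto existing HotSpot flags: -XX:ParallelGCThreads for the number of GC worker threads, and -XX:MaxGCPauseMillis for the pause time goal (the log's "target pause time: 200.00 ms" matches its default of 200). Which values suit this PageRank workload is something to measure rather than assume.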