如果 getStackTrace() 被一个线程调用并且 lambda 定义(通过 Unsafe.defineAnonymousClass)出现在另一个线程中,Java8 挂起

Java8 hangs up if getStackTrace() is called by one thread and lambda definition (via Unsafe.defineAnonymousClass) occurs in another thread

我的 Web 应用程序在 Apache Tomcat/8.0.21 中运行,JVM 1.8.0_45-b15 和 Windows Server 2012 在 16 核(32- 超线程)Dual-Xeon NUMA 机器,在某些非常不幸的情况下,当标题中描述的操作同时发生在两个不同的线程中时,可能会卡住。

执行第一个操作的线程 (getStackTrace()) 正在尝试执行一些诊断以检测系统的哪一部分正在减慢速度并在调用 Thread.dumpThreads 时卡住。 另一个线程正在执行一些操作,其中包括 under-the-hood 部分 JVM 上的 lambda 定义。

特别是,我有以下堆栈跟踪(通过 jstack -F <pid> 获得):

Attaching to process ID 6568, please wait... 
Debugger attached successfully. 
Server compiler detected. 
JVM version is 25.45-b02 

Deadlock Detection:

No deadlocks found. (... well, that's not the kind of deadlock you were searching for, dear JVM, but something bad is happening altogether :( )

Thread 155: (state = BLOCKED)
 - sun.misc.Unsafe.defineAnonymousClass(java.lang.Class, byte[], java.lang.Object[]) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.invoke.InvokerBytecodeGenerator.loadAndInitializeInvokerClass(byte[], java.lang.Object[]) @bci=8 (Compiled frame)
 - java.lang.invoke.InvokerBytecodeGenerator.loadMethod(byte[]) @bci=6 (Compiled frame)
 - java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(java.lang.invoke.LambdaForm, java.lang.invoke.MethodType) @bci=17 (Compiled frame)
 - java.lang.invoke.LambdaForm.compileToBytecode() @bci=65 (Compiled frame)
 - java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(java.lang.invoke.MethodType, int) @bci=638 (Interpreted frame)
 - java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.lang.invoke.MethodType, int) @bci=17 (Compiled frame)
 - java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.lang.invoke.MemberName) @bci=163 (Compiled frame)
 - java.lang.invoke.DirectMethodHandle.make(byte, java.lang.Class, java.lang.invoke.MemberName) @bci=94 (Compiled frame)
 - java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(byte, java.lang.Class, java.lang.invoke.MemberName, boolean, boolean, java.lang.Class) @bci=201 (Compiled frame)
 - java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(byte, java.lang.Class, java.lang.invoke.MemberName, java.lang.Class) @bci=8 (Compiled frame)
 - java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(byte, java.lang.Class, java.lang.invoke.MemberName) @bci=30 (Compiled frame)
 - java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(byte, java.lang.Class, java.lang.String, java.lang.Object) @bci=115 (Compiled frame)
 - java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(java.lang.Class, int, java.lang.Class, java.lang.String, java.lang.Object) @bci=38 (Compiled frame)
 - c.e.s.w.t.si.a.DDVP.lambda(com.vaadin.data.Container, com.vaadin.ui.HorizontalLayout, com.vaadin.ui.Label, com.vaadin.ui.Label, java.lang.String, java.lang.String, java.util.Map) @bci=48, line=104 (Interpreted frame)
 - c.e.s.w.t.si.a.DDVP$$Lambda7.updateUIWith(java.lang.Object) @bci=32 (Interpreted frame)
 - c.e.s.w.d.DU$VoidUIUpdaterFromUIUpdater.updateUI() @bci=8, line=321 (Compiled frame)
 - c.e.s.w.d.DU$CompletionSignallingVoidUIUpdater.updateUI() @bci=4, line=125 (Compiled frame)
 - c.e.s.w.d.CUQ.sweepWhileNotTimedOut() @bci=59, line=218 (Compiled frame)
 - c.e.s.w.d.CUQ$QueueExhauster.run() @bci=247, line=122 (Compiled frame)
 - c.e.s.w.d.CUQ$DequeuerStartFailed.run() @bci=40, line=60 (Compiled frame)
 - c.e.s.w.s.ew.CC.lambda(java.lang.Runnable) @bci=13, line=66 (Compiled frame)
 - c.e.s.w.s.ew.CC$$Lambda.run() @bci=8 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5 (Interpreted frame)
 - java.lang.Thread.run() @bci=11 (Compiled frame)

Thread 108: (state = BLOCKED) [The tricky one...]
 - java.lang.Thread.dumpThreads(java.lang.Thread[]) @bci=0 (Interpreted frame)
 - java.lang.Thread.getStackTrace() @bci=41 (Compiled frame)
 - c.e.s.w.SWA$$Lambda.getStackTrace() @bci=4 (Interpreted frame)
 - c.e.s.w.SWA.describeHoggingCode(c.e.s.w.s.RequestTimeTracker$StackTraceProvider, boolean) @bci=1, line=401 (Interpreted frame)
 - c.e.s.w.SWA.describeHoggingCode(c.e.s.w.s.RequestTimeTracker$StackTraceProvider, boolean, java.lang.Thread) @bci=6, line=396 (Interpreted frame)
 - c.e.s.w.SWA.lambda(java.lang.Thread, java.lang.String) @bci=8, line=890 (Interpreted frame)
 - c.e.s.w.SWA$$Lambda.run() @bci=12 (Interpreted frame)
 - c.e.s.w.s.ew.CC.lambda(java.lang.Runnable) @bci=13, line=66 (Compiled frame)
 - c.e.s.w.s.ew.CC$$Lambda.run() @bci=8 (Compiled frame)
 - c.e.s.w.s.IS.lambda(java.util.concurrent.atomic.AtomicBoolean, java.lang.Runnable) @bci=8, line=327 (Compiled frame)
 - c.e.s.w.s.IS$$Lambda.run() @bci=8 (Compiled frame)
 - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4 (Compiled frame)
 - java.util.concurrent.FutureTask.run() @bci=42 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access1(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) @bci=1 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=30 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5 (Interpreted frame)
 - java.lang.Thread.run() @bci=11 (Interpreted frame)

从我的角度来看,这个问题可能与 Unsafe.defineAnonymousClass 无法处理对 java.lang.Thread.dumpThreads 的持续调用有关(这反过来又需要实现 java.lang.Thread.getStackTrace,在 JVM 中)。关键是,由于 final 或包修饰符,我无法扩展此过程中涉及的任何核心 类(例如 LookupMethodHandleNatives等),以便在对 java.lang.Thread.dumpThreads 的调用仍在进行时引入一个锁来阻止棘手的不安全调用。另外,我怀疑引入这样的锁也可能会减慢速度,因为 lambda 无处不在。

有没有人遇到过类似的问题?能帮忙解决一下吗?

谢谢!


当然,stacktrace 中也有类似的线程,我认为它们与本案例无关,所以我省略了。

Thread 154: (state = BLOCKED) [Many of these....]
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=78 (Compiled frame)
 - org.eclipse.jetty.util.BlockingArrayQueue.poll(long, java.util.concurrent.TimeUnit) @bci=57, line=389 (Compiled frame)
 - org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll() @bci=12, line=516 (Compiled frame)
 - org.eclipse.jetty.util.thread.QueuedThreadPool.access0(org.eclipse.jetty.util.thread.QueuedThreadPool) @bci=1, line=47 (Compiled frame)
 - org.eclipse.jetty.util.thread.QueuedThreadPool.run() @bci=300, line=575 (Compiled frame)
 - java.lang.Thread.run() @bci=11 (Compiled frame)


Thread 153: (state = BLOCKED) [and of these...]
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.ForkJoinPool.awaitWork(java.util.concurrent.ForkJoinPool$WorkQueue, int) @bci=354 (Compiled frame)
 - java.util.concurrent.ForkJoinPool.runWorker(java.util.concurrent.ForkJoinPool$WorkQueue) @bci=44 (Interpreted frame)
 - java.util.concurrent.ForkJoinWorkerThread.run() @bci=24 (Interpreted frame)


Thread 141: (state = BLOCKED) [and of these...]
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42 (Compiled frame)
 - java.util.concurrent.LinkedBlockingQueue.take() @bci=29 (Compiled frame)
 - org.apache.tomcat.util.threads.TaskQueue.take() @bci=36, line=103 (Compiled frame)
 - org.apache.tomcat.util.threads.TaskQueue.take() @bci=1, line=31 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=149 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5 (Interpreted frame)
 - org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run() @bci=4, line=61 (Interpreted frame)
 - java.lang.Thread.run() @bci=11 (Interpreted frame)

我终于解决了这个讨厌的问题。如果将选项 -XX:-ClassUnloading 传递给 JVM 可执行文件(就像我尝试的 JVM 的情况一样),那么 JVM 似乎会像发布时那样卡住跟踪(至少涉及 sun.misc.Unsafe.defineAnonymousClass)治疗)。出于某种原因,这会在 Java 系统库中激发这种死锁式错误,它似乎是随机的(与所有死锁错误一样)并且随着 JVM 进程 "ages" 越来越有可能。同样,这对于死锁错误来说并不是什么新鲜事:您放入机器中的 "gears"(例如,类 应该被卸载但不是由于该选项)越多,就越有可能其中两个或更多曲柄。

因此,删除选项 -XX:-ClassUnloading 可使此问题完全消失。

底线是:永远不要在生产系统中使用 -XX:-ClassUnloading(或者让任何人在 JVM 进程启动脚本中放置这样的选项),即使在 Java8, which shouldn't be limited by PermGen 中也不行,但它仍然会受到严重影响,因为sun.misc.Unsafe.defineAnonymousClass 中的问题。