Extracting JVM stack traces from a heap dump (hprof) with a command-line tool

I want to automatically extract stack traces from JVM crash heap dump files on a headless server. Is there a non-GUI solution?

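The MAT (Eclipse Memory Analyzer) project ships with a script, ParseHeapDump.sh (ParseHeapDump.bat on Windows). If you simply run it on a heap dump like this: /path/to/MAT/ParseHeapDump.sh <HEAPDUMP_NAME>.hprof, you get a set of files, one of which is a text file, <HEAPDUMP_NAME>.threads, that contains the stack traces, for example: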

Thread 0x715883630

  locals:

Thread 0x71577b418
  at java.lang.Object.wait(J)V (Native Method)
  at java.lang.ref.ReferenceQueue.remove(J)Ljava/lang/ref/Reference; (ReferenceQueue.java:155)
  at jdk.internal.ref.CleanerImpl.run()V (CleanerImpl.java:148)
  at java.lang.Thread.run()V (Thread.java:834)
  at jdk.internal.misc.InnocuousThread.run()V (InnocuousThread.java:134)

  locals:
    objectId=0x715833c30, line=1
    objectId=0x716df1990, line=1
    objectId=0x715832230, line=2
    objectId=0x71577b418, line=2
    objectId=0x71577b418, line=2
    objectId=0x71577b418, line=3
    objectId=0x71577b418, line=4

Thread 0x715883050
  at jdk.internal.misc.Unsafe.park(ZJ)V (Native Method)
  at java.util.concurrent.locks.LockSupport.parkNanos(Ljava/lang/Object;J)V (LockSupport.java:234)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(J)J (AbstractQueuedSynchronizer.java:2123)
  at java.util.concurrent.DelayQueue.take()Ljava/util/concurrent/Delayed; (DelayQueue.java:229)
  at com.intellij.util.concurrency.AppDelayQueue.lambda$new$0()V (AppDelayQueue.java:26)
  at com.intellij.util.concurrency.AppDelayQueue$$Lambda.run()V (Unknown Source)
  at java.lang.Thread.run()V (Thread.java:834)

  locals:
    objectId=0x715883050, line=1
    objectId=0x715cd8a90, line=2
    objectId=0x7291c6618, line=2
    objectId=0x715f73520, line=3
    objectId=0x715cd8aa8, line=3
    objectId=0x715883050, line=3
    objectId=0x715f73520, line=4
    objectId=0x715cd8ac0, line=5
    objectId=0x715883050, line=6
...

Here you are only interested in the lines that start with Thread 0x... or at ....
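As a minimal sketch of that filtering step, the Python snippet below groups the "at ..." lines under the preceding "Thread 0x..." header and ignores the locals sections. It assumes the .threads file looks exactly like the sample above; the function names parse_threads and load_threads_file are hypothetical helpers, not part of MAT.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional

# Matches header lines such as "Thread 0x71577b418" (format taken from the sample above).
THREAD_RE = re.compile(r"^Thread (0x[0-9a-fA-F]+)\s*$")


def parse_threads(text: str) -> Dict[str, List[str]]:
    """Map each thread's hex ID to its stack frames, in file order."""
    stacks: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = header.group(1)
            stacks[current] = []
            continue
        stripped = line.strip()
        # Keep only stack frames; "locals:" and "objectId=..." lines are skipped.
        if current is not None and stripped.startswith("at "):
            stacks[current].append(stripped[len("at "):])
    return stacks


def load_threads_file(path: str) -> Dict[str, List[str]]:
    """Read a <HEAPDUMP_NAME>.threads file and parse it."""
    return parse_threads(Path(path).read_text(encoding="utf-8", errors="replace"))
```

Calling load_threads_file on the generated .threads file returns a dictionary keyed by the hex thread ID; threads without any frames, like the first one in the sample, map to an empty list.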

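Notes:

  1. The script does far more work than is needed just to extract the thread dump (it also parses the heap, which accounts for 99.99...% of the work), and it generates many files you do not need (all of them prefixed with <HEAPDUMP_NAME>).
  2. Thread names are missing, and the thread IDs do not appear to correspond to anything OS-specific (such as native thread IDs). In other words, you cannot find Thread 0x715883050 in ps/top/htop output; at least I am not aware of a way.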
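For the first note, a cautious way to deal with the extra files is to list them before deleting anything. The sketch below only reports the files that sit next to the heap dump and share its name prefix, keeping the dump itself and the .threads file. It assumes the by-products are written into the same directory as the dump; mat_byproducts is a hypothetical helper name, and a plain prefix match may also catch unrelated files, so the result should be reviewed before removal.

```python
from pathlib import Path
from typing import List, Tuple


def mat_byproducts(dump: Path, keep_suffixes: Tuple[str, ...] = (".threads",)) -> List[Path]:
    """List files beside `dump` that share its name prefix, minus the ones to keep."""
    prefix = dump.stem  # "<HEAPDUMP_NAME>" for "<HEAPDUMP_NAME>.hprof"
    extras: List[Path] = []
    for candidate in sorted(dump.parent.iterdir()):
        if not candidate.is_file() or candidate.name == dump.name:
            continue  # skip directories and the heap dump itself
        if not candidate.name.startswith(prefix):
            continue  # not produced for this dump (by the prefix assumption)
        if candidate.name[len(prefix):] in keep_suffixes:
            continue  # e.g. "<HEAPDUMP_NAME>.threads", which we want to keep
        extras.append(candidate)
    return extras
```

The function deliberately performs no deletion; once the returned list looks right, each entry can be removed with Path.unlink().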