What is the 'Serviceability' memory category of Native Memory Tracking?

Question

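I have a Java application running in a Docker container (JDK 13). I recently moved the application to JDK 17 (OpenJDK 17) and found that the memory usage of the Docker container grows gradually.

While investigating, I found that the 'Serviceability' category of NMT grows constantly (about 15 MB per hour). I checked the page https://docs.oracle.com/en/java/javase/17/troubleshoot/diagnostic-tools.html#GUID-5EF7BB07-C903-4EBD-A9C2-EC0E44048D37, but this category is not mentioned there.

Can anyone explain what this Serviceability category means and what could cause such a gradual increase? There are also some additional new memory categories compared to JDK 13. Maybe someone knows where I can read about them in detail.

Here is the output of the command jcmd 1 VM.native_memory summary: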

Native Memory Tracking:

(Omitting categories weighting less than 1KB)

Total: reserved=4431401KB, committed=1191617KB
-                 Java Heap (reserved=2097152KB, committed=479232KB)
                            (mmap: reserved=2097152KB, committed=479232KB) 
 
-                     Class (reserved=1052227KB, committed=22403KB)
                            (classes #29547)
                            (  instance classes #27790, array classes #1757)
                            (malloc=3651KB #79345) 
                            (mmap: reserved=1048576KB, committed=18752KB) 
                            (  Metadata:   )
                            (    reserved=139264KB, committed=130816KB)
                            (    used=130309KB)
                            (    waste=507KB =0.39%)
                            (  Class space:)
                            (    reserved=1048576KB, committed=18752KB)
                            (    used=18149KB)
                            (    waste=603KB =3.21%)
 
-                    Thread (reserved=387638KB, committed=40694KB)
                            (thread #378)
                            (stack: reserved=386548KB, committed=39604KB)
                            (malloc=650KB #2271) 
                            (arena=440KB #752)
 
-                      Code (reserved=253202KB, committed=76734KB)
                            (malloc=5518KB #23715) 
                            (mmap: reserved=247684KB, committed=71216KB) 
 
-                        GC (reserved=152419KB, committed=92391KB)
                            (malloc=40783KB #34817) 
                            (mmap: reserved=111636KB, committed=51608KB) 
 
-                  Compiler (reserved=1506KB, committed=1506KB)
                            (malloc=1342KB #2557) 
                            (arena=165KB #5)
 
-                  Internal (reserved=5579KB, committed=5579KB)
                            (malloc=5543KB #33822) 
                            (mmap: reserved=36KB, committed=36KB) 
 
-                     Other (reserved=231161KB, committed=231161KB)
                            (malloc=231161KB #347) 
 
-                    Symbol (reserved=30558KB, committed=30558KB)
                            (malloc=28887KB #769230) 
                            (arena=1670KB #1)
 
-    Native Memory Tracking (reserved=16412KB, committed=16412KB)
                            (malloc=575KB #8281) 
                            (tracking overhead=15837KB)
 
-        Shared class space (reserved=12288KB, committed=12136KB)
                            (mmap: reserved=12288KB, committed=12136KB) 
 
-               Arena Chunk (reserved=18743KB, committed=18743KB)
                            (malloc=18743KB) 
 
-                   Tracing (reserved=32KB, committed=32KB)
                            (arena=32KB #1)
 
-                   Logging (reserved=7KB, committed=7KB)
                            (malloc=7KB #289) 
 
-                 Arguments (reserved=1KB, committed=1KB)
                            (malloc=1KB #53) 
 
-                    Module (reserved=1045KB, committed=1045KB)
                            (malloc=1045KB #5026) 
 
-                 Safepoint (reserved=8KB, committed=8KB)
                            (mmap: reserved=8KB, committed=8KB) 
 
-           Synchronization (reserved=204KB, committed=204KB)
                            (malloc=204KB #2026) 
 
-            Serviceability (reserved=31187KB, committed=31187KB)
                            (malloc=31187KB #49714) 
 
-                 Metaspace (reserved=140032KB, committed=131584KB)
                            (malloc=768KB #622) 
                            (mmap: reserved=139264KB, committed=130816KB) 
 
-      String Deduplication (reserved=1KB, committed=1KB)
                            (malloc=1KB #8)

The detail for the part of memory that keeps growing is:

[0x00007f6ccb970cbe] OopStorage::try_add_block()+0x2e
[0x00007f6ccb97132d] OopStorage::allocate()+0x3d
[0x00007f6ccbb34ee8] StackFrameInfo::StackFrameInfo(javaVFrame*, bool)+0x68
[0x00007f6ccbb35a64] ThreadStackTrace::dump_stack_at_safepoint(int)+0xe4
                             (malloc=6755KB type=Serviceability #10944)

Update #1, 2022-01-17:

Thanks to @Aleksey Shipilev for the help! We were able to find the place that causes the issue: it is related to many ThreadMXBean#dumpAllThreads calls. Here is an MCVE, Test.java.

Run with:

java -Xmx512M -XX:NativeMemoryTracking=detail Test.java

and periodically check the Serviceability category in the output of

jcmd YOUR_PID VM.native_memory summary
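
As an alternative to running jcmd by hand, the same check can be sketched from inside the JVM. The following is a minimal sketch, not part of the original MCVE: the class name NmtServiceabilityProbe and its methods are hypothetical, and it assumes a HotSpot JVM that exposes the com.sun.management:type=DiagnosticCommand MBean with a vmNativeMemory operation and that was started with -XX:NativeMemoryTracking=summary or detail. It fetches the summary and extracts the committed size of the Serviceability line.

```java
import java.lang.management.ManagementFactory;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper, for illustration only.
final class NmtServiceabilityProbe {

    // Matches a summary line such as:
    // "-            Serviceability (reserved=31187KB, committed=31187KB)"
    private static final Pattern LINE =
            Pattern.compile("-\\s+Serviceability \\(reserved=(\\d+)KB, committed=(\\d+)KB\\)");

    // Returns the committed size in KB, or empty if the line is absent.
    static OptionalLong committedKb(String nmtSummary) {
        Matcher m = LINE.matcher(nmtSummary);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(2))) : OptionalLong.empty();
    }

    // Assumption: HotSpot-specific DiagnosticCommand MBean; the operation
    // name mirrors the jcmd command VM.native_memory.
    static String nmtSummary() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.sun.management:type=DiagnosticCommand");
        Object result = server.invoke(name, "vmNativeMemory",
                new Object[] {new String[] {"summary"}},
                new String[] {String[].class.getName()});
        return String.valueOf(result);
    }

    public static void main(String[] args) throws Exception {
        OptionalLong kb = committedKb(nmtSummary());
        System.out.println(kb.isPresent()
                ? "Serviceability committed: " + kb.getAsLong() + " KB"
                : "Serviceability line not found (is NMT enabled?)");
    }
}
```

Logging this value every few minutes gives a growth curve comparable to the 15 MB per hour observed above.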

Test.java:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Test {

    private static final int RUNNING = 40;
    private static final int WAITING = 460;

    private final Object monitor = new Object();
    private final ThreadMXBean threadMxBean = ManagementFactory.getThreadMXBean();
    private final ExecutorService executorService = Executors.newFixedThreadPool(RUNNING + WAITING);

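    void startRunningThread() {
        executorService.submit(() -> {
            // Busy-spin so the thread stays RUNNABLE until it is interrupted.
            while (!Thread.currentThread().isInterrupted()) {
                Thread.onSpinWait();
            }
        });
    }

    void startWaitingThread() {
        executorService.submit(() -> {
            // wait() must be called while holding the monitor; otherwise it
            // throws IllegalMonitorStateException.
            synchronized (monitor) {
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    void startThreads() {
        for (int i = 0; i < RUNNING; i++) {
            startRunningThread();
        }

        for (int i = 0; i < WAITING; i++) {
            startWaitingThread();
        }
    }

    void shutdown() {
        // shutdownNow() interrupts the worker threads; plain shutdown() only
        // stops accepting new tasks and never ends the submitted ones.
        executorService.shutdownNow();
        try {
            executorService.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }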

    
    public static void main(String[] args) throws InterruptedException {
        Test test = new Test();

        Runtime.getRuntime().addShutdownHook(new Thread(test::shutdown));

        test.startThreads();

        for (int i = 0; i < 12000; i++) {
            ThreadInfo[] threadInfos = test.threadMxBean.dumpAllThreads(false, false);
            System.out.println("ThreadInfos: " + threadInfos.length);

            Thread.sleep(100);
        }

        test.shutdown();
    }
}

Answer 1

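Unfortunately (?), the easiest way to know for sure what these categories map to is to look at the OpenJDK source code. The NMT tag you are looking for is mtServiceability. The source shows that "serviceability" basically covers the diagnostic interfaces in the JDK/JVM: JVMTI, heap dumps, etc.

The same thing is also clear from the stack trace sample you posted, which mentions ThreadStackTrace::dump_stack_at_safepoint: that is the code that dumps thread information, for example for jstack, heap dumps, etc. If you suspect a memory leak in that code, you could try to build an MCVE that demonstrates it and file a bug against OpenJDK, or show it to fellow OpenJDK developers. You probably know better what your application does to cause thread dumps, so focus there.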
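
If the application itself turns out to request thread dumps very often, one way to reduce the pressure is to rate-limit them. The following is a minimal sketch under that assumption: the class ThrottledThreadDumper and its interval parameter are hypothetical and not taken from the question. It caches the last result of ThreadMXBean.dumpAllThreads and reuses it until it is older than a configured interval.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only: reuses the last thread dump
// until it is older than the configured minimum interval.
final class ThrottledThreadDumper {
    private final ThreadMXBean threadMxBean = ManagementFactory.getThreadMXBean();
    private final long minIntervalNanos;
    private ThreadInfo[] lastDump;
    private long lastDumpNanos;

    ThrottledThreadDumper(long minIntervalMillis) {
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
    }

    // Returns a cached dump when the previous one is still fresh enough.
    // Callers share the returned array and should treat it as read-only.
    synchronized ThreadInfo[] dump() {
        long now = System.nanoTime();
        if (lastDump == null || now - lastDumpNanos >= minIntervalNanos) {
            lastDump = threadMxBean.dumpAllThreads(false, false);
            lastDumpNanos = now;
        }
        return lastDump;
    }

    public static void main(String[] args) {
        ThrottledThreadDumper dumper = new ThrottledThreadDumper(60_000);
        ThreadInfo[] first = dumper.dump();
        ThreadInfo[] second = dumper.dump();
        System.out.println("Threads in dump: " + first.length);
        System.out.println("Second call reused cached dump: " + (first == second));
    }
}
```

This only lowers how often the JVM walks thread stacks; it does not fix a leak inside the JVM itself.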

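That being said, I don't see any obvious memory leaks in StackFrameInfo, and I cannot reproduce any leak with stress tests either, so maybe what you are seeing is "just" thread dumping over larger and larger thread stacks. Or you are capturing it while a thread dump is in progress. Or... it is hard to say without an MCVE.

Update: After playing with the MCVE, I realized that it reproduces with 17.0.1, but not with the mainline development JDK, JDK 18 EA, or JDK 17.0.2 EA. I had tested with 17.0.2 EA before, so I was not seeing it, dang. Bisecting between 17.0.1 and 17.0.2 EA shows that it was fixed by the backport of JDK-8273902. 17.0.2 is released this week, so the bug should disappear after upgrading.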
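
To check at startup whether a deployment still runs on a build from before that fix, a version test along these lines could be used. This is a sketch only: the class Jdk17LeakCheck is hypothetical, and treating every 17.0.x update below 2 as possibly affected is an assumption, since the answer above only confirms the leak on 17.0.1.

```java
// Hypothetical helper, for illustration only.
final class Jdk17LeakCheck {

    // Assumption: any JDK 17.0.x with an update number below 2 is flagged,
    // although only 17.0.1 is confirmed to reproduce the leak.
    static boolean possiblyAffected(Runtime.Version version) {
        return version.feature() == 17
                && version.interim() == 0
                && version.update() < 2;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        System.out.println(current + (possiblyAffected(current)
                ? " may be affected; consider upgrading to 17.0.2 or later"
                : " is outside the range this check flags"));
    }
}
```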

Answer 2

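One possible reason for some of the memory fluctuation is that another process attaches to the JVM via dynamic attach, debugs the application, and transfers application-level information to the debugger. Serviceability is closely related to jdb (the Java debugger).

https://openjdk.java.net/groups/serviceability/

OpenJDK also has this documented: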

Serviceability in HotSpot

The HotSpot Virtual Machine contains several technologies that allow its operation to be observed by another Java process:

The Serviceability Agent (SA). The Serviceability Agent is a Sun private component in the HotSpot repository that was developed by HotSpot engineers to assist in debugging HotSpot. They then realized that SA could be used to craft serviceability tools for end users since it can expose Java objects as well as HotSpot data structures both in running processes and in core files.

jvmstat performance counters. HotSpot maintains several performance counters that are exposed to external processes via a Sun private shared memory mechanism. These counters are sometimes called perfdata.

The Java Virtual Machine Tool Interface (JVM TI). This is a standard C interface that is the reference implementation of JSR 163 - Java™ Platform Profiling Architecture. JVM TI is implemented by HotSpot and allows a native code 'agent' to inspect and modify the state of the JVM.

The Monitoring and Management interface. This is a Sun private API that allows aspects of HotSpot to be monitored and managed.

Dynamic Attach. This is a Sun private mechanism that allows an external process to start a thread in HotSpot that can then be used to launch an agent to run in that HotSpot, and to send information about the state of HotSpot back to the external process.

DTrace. DTrace is the award winning dynamic trace facility built into Solaris 10 and later versions. DTrace probes have been added to HotSpot that allow monitoring of many aspects of operation when HotSpot runs on Solaris. In addition, HotSpot contains a jhelper.d file that enables dtrace to show Java frames in stack traces.

pstack support. pstack is a Solaris utility that prints stack traces of all threads in a process. HotSpot includes support that allows pstack to show Java stack frames.


Tags: java, java-17, openjdk-17