JVM NMT 工具

2019年04月08日

学习来源

https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html

Native Memory Tracking

本地内存追踪(NMT) 是 Java HotSpot 虚拟机跟踪内部内存使用情况的工具。NMT使用文档:传送门

因为NMT不能跟踪非java代码分配的内存,你可能需要使用操作系统支持的工具来检测本机代码中的内存泄漏。

下面的章节描述的是如何监控虚拟机内部内存分配和诊断虚拟机内存泄漏

使用NMT发现内存泄漏(Use NMT to Detect a Memory Leak)

使用一下步骤来发现内存泄漏:

  1. 开启JVM统计或者跟踪详情,JVM参数:-XX:NativeMemoryTracking=summary 或者 -XX:NativeMemoryTracking=detail.
  2. 建立一个基准线,使用NMT的基准线功能获得一个基准线,用来在开发和维护机器期间做比较使用,命令:jcmd VM.native_memory baseline
  3. 监控内存变化:jcmd VM.native_memory detail.diff
  4. 如果应用程序泄漏的内存较少,则需要花费一定时间才能表现出来。

如何监控虚拟机内部内存(How to Monitor VM Interal Memory)

通过使用 jcmd 工具,可以设置本机内存跟踪来监视内存,并确保应用程序在开发或维护期间不会开始使用越来越多的内存。在这里可以查看NMT内存分类:点我

一下章节会描述如何获取统计或详情数据,并介绍如何理解输出样例

  • 解释输出样例:以下的输出样例,你会看到预留和已提交内存。需要注意的是只有已提交内存才是实际上已经使用的内存。例如,当使用 -Xms100m -Xmx1000m 运行时,JVM会为Java堆预留1000M内存。因为初始化的堆大小只有100M,所以在刚开始时只有100MB的内存是已提交的。在64位的机器上地址空间几乎是无限制的,所以JVM预留大量内存也不会有问题。如果越来越多的内存变为已提交的,可能会导致交换区或者本地OOM的情况。

Arena是一个内存分配块,当退出一个范围或者离开一个代码块时,内存就是从一系列的内存块中进行释放的。这些内存块可能会被其他子系统重用,用来保存临时内存,例如,线程前置分配(prethread allocations)。Arena内存分配策略确保不会发生内存泄漏。所以Arena会跟踪整个内存,而不只是对象。某些内存初始存量不能被追踪。

启用NMT会造成5%-10%的JVM新能损耗,并且NMT会增加两个机器指令到内存分配头中。NMT的内存使用也由NMT跟踪。

  • 获取统计数据:使用JVM启动参数 -XX:NativeMemoryTracking=summary 可以获得本地内存使用情况。

下面是一个由NMT描述的内存统计样例,一种获取样例数据的方法是运行jcmd <pid> VM.native_memory summary

    Total:  reserved=664192KB,  committed=253120KB                                           <--- total memory tracked by Native Memory Tracking

    -                 Java Heap (reserved=516096KB, committed=204800KB)                      <--- Java Heap
                                (mmap: reserved=516096KB, committed=204800KB)

    -                     Class (reserved=6568KB, committed=4140KB)                          <--- class metadata
                                (classes #665)                                               <--- number of loaded classes
                                (malloc=424KB, #1000)                                        <--- malloc'd memory, #number of malloc
                                (mmap: reserved=6144KB, committed=3716KB)

    -                    Thread (reserved=6868KB, committed=6868KB)
                                (thread #15)                                                 <--- number of threads
                                (stack: reserved=6780KB, committed=6780KB)                   <--- memory used by thread stacks
                                (malloc=27KB, #66)
                                (arena=61KB, #30)                                            <--- resource and handle areas

    -                      Code (reserved=102414KB, committed=6314KB)
                                (malloc=2574KB, #74316)
                                (mmap: reserved=99840KB, committed=3740KB)

    -                        GC (reserved=26154KB, committed=24938KB)
                                (malloc=486KB, #110)
                                (mmap: reserved=25668KB, committed=24452KB)

    -                  Compiler (reserved=106KB, committed=106KB)
                                (malloc=7KB, #90)
                                (arena=99KB, #3)

    -                  Internal (reserved=586KB, committed=554KB)
                                (malloc=554KB, #1677)
                                (mmap: reserved=32KB, committed=0KB)

    -                    Symbol (reserved=906KB, committed=906KB)
                                (malloc=514KB, #2736)
                                (arena=392KB, #1)

    -           Memory Tracking (reserved=3184KB, committed=3184KB)
                                (malloc=3184KB, #300)

    -        Pooled Free Chunks (reserved=1276KB, committed=1276KB)
                                (malloc=1276KB)

    -                   Unknown (reserved=33KB, committed=33KB)
                                (arena=33KB, #1)
  • 获取详细数据:使用JVM启动参数-XX:NativeMemoryTracking=detail可以获取NMT的详细数据。它会追踪分配内存最多的方法。下面的样例是 使用 jcmd <pid> VM.native_memory detail 获取的输出数据。

      Virtual memory map:
    
    [0x8f1c1000 - 0x8f467000] reserved 2712KB for Thread Stack
                    from [Thread::record_stack_base_and_size()+0xca]
            [0x8f1c1000 - 0x8f467000] committed 2712KB from [Thread::record_stack_base_and_size()+0xca]
    
    [0x8f585000 - 0x8f729000] reserved 1680KB for Thread Stack
                    from [Thread::record_stack_base_and_size()+0xca]
            [0x8f585000 - 0x8f729000] committed 1680KB from [Thread::record_stack_base_and_size()+0xca]
    
    [0x8f930000 - 0x90100000] reserved 8000KB for GC
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x555]
            [0x8f930000 - 0x90100000] committed 8000KB from [PSVirtualSpace::expand_by(unsigned int)+0x95]
    
    [0x902dd000 - 0x9127d000] reserved 16000KB for GC
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x555]
            [0x902dd000 - 0x9127d000] committed 16000KB from [os::pd_commit_memory(char*, unsigned int, unsigned int, bool)+0x36]
    
    [0x9127d000 - 0x91400000] reserved 1548KB for Thread Stack
                    from [Thread::record_stack_base_and_size()+0xca]
            [0x9127d000 - 0x91400000] committed 1548KB from [Thread::record_stack_base_and_size()+0xca]
    
    [0x91400000 - 0xb0c00000] reserved 516096KB for Java Heap                                                                            <--- reserved memory range
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x190]                  <--- callsite that reserves the memory
            [0x91400000 - 0x93400000] committed 32768KB from [VirtualSpace::initialize(ReservedSpace, unsigned int)+0x3e8]               <--- committed memory range and its callsite
            [0xa6400000 - 0xb0c00000] committed 172032KB from [PSVirtualSpace::expand_by(unsigned int)+0x95]                             <--- committed memory range and its callsite
    
    [0xb0c61000 - 0xb0ce2000] reserved 516KB for Thread Stack
                    from [Thread::record_stack_base_and_size()+0xca]
            [0xb0c61000 - 0xb0ce2000] committed 516KB from [Thread::record_stack_base_and_size()+0xca]
    
    [0xb0ce2000 - 0xb0e83000] reserved 1668KB for GC
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x555]
            [0xb0ce2000 - 0xb0cf0000] committed 56KB from [PSVirtualSpace::expand_by(unsigned int)+0x95]
            [0xb0d88000 - 0xb0d96000] committed 56KB from [CardTableModRefBS::resize_covered_region(MemRegion)+0xebf]
            [0xb0e2e000 - 0xb0e83000] committed 340KB from [CardTableModRefBS::resize_covered_region(MemRegion)+0xebf]
    
    [0xb0e83000 - 0xb7003000] reserved 99840KB for Code
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x555]
            [0xb0e83000 - 0xb0e92000] committed 60KB from [VirtualSpace::initialize(ReservedSpace, unsigned int)+0x3e8]
            [0xb1003000 - 0xb139b000] committed 3680KB from [VirtualSpace::initialize(ReservedSpace, unsigned int)+0x37a]
    
    [0xb7003000 - 0xb7603000] reserved 6144KB for Class
                    from [ReservedSpace::initialize(unsigned int, unsigned int, bool, char*, unsigned int, bool)+0x555]
            [0xb7003000 - 0xb73a4000] committed 3716KB from [VirtualSpace::initialize(ReservedSpace, unsigned int)+0x37a]
    
    [0xb7603000 - 0xb760b000] reserved 32KB for Internal
                    from [PerfMemory::create_memory_region(unsigned int)+0x8ba]
    
    [0xb770b000 - 0xb775c000] reserved 324KB for Thread Stack
                    from [Thread::record_stack_base_and_size()+0xca]
            [0xb770b000 - 0xb775c000] committed 324KB from [Thread::record_stack_base_and_size()+0xca]
    
    
            Get diff from NMT baseline: For both summary and detail level tracking, you can set baseline once the application is up and running. Do this by running jcmd <pid> VM.native_memory baseline after some warm up of the application. Then, you can run: jcmd <pid> VM.native_memory summary.diff or jcmd <pid> VM.native_memory detail.diff.
    
  • 基于NMT基准线获取变化:为了便于统计和详细数据的追踪,可以在程序启动时使用jcmd <pid> VM.native_memory baseline 设置一个基准线。然后就可以使用jcmd <pid> VM.native_memory summary.diff 或者 jcmd <pid> VM.native_memory detail.diff 获取变化信息了。下面是一个统计信息的diff样输出:

        Total:  reserved=664624KB  -20610KB, committed=254344KB -20610KB                         <--- total memory changes vs. earlier baseline. '+'=increase '-'=decrease
    
        -                 Java Heap (reserved=516096KB, committed=204800KB)
                                    (mmap: reserved=516096KB, committed=204800KB)
    
        -                     Class (reserved=6578KB +3KB, committed=4530KB +3KB)
                                    (classes #668 +3)                                            <--- 3 more classes loaded
                                    (malloc=434KB +3KB, #930 -7)                                 <--- malloc'd memory increased by 3KB, but number of malloc count decreased by 7
                                    (mmap: reserved=6144KB, committed=4096KB)
    
        -                    Thread (reserved=60KB -1129KB, committed=60KB -1129KB)
                                    (thread #16 +1)                                              <--- one more thread
                                    (stack: reserved=7104KB +324KB, committed=7104KB +324KB)
                                    (malloc=29KB +2KB, #70 +4)
                                    (arena=31KB -1131KB, #32 +2)                                 <--- 2 more arenas (one more resource area and one more handle area)
    
        -                      Code (reserved=102328KB +133KB, committed=6640KB +133KB)
                                    (malloc=2488KB +133KB, #72694 +4287)
                                    (mmap: reserved=99840KB, committed=4152KB)
    
        -                        GC (reserved=26154KB, committed=24938KB)
                                    (malloc=486KB, #110)
                                    (mmap: reserved=25668KB, committed=24452KB)
    
        -                  Compiler (reserved=106KB, committed=106KB)
                                    (malloc=7KB, #93)
                                    (arena=99KB, #3)
    
        -                  Internal (reserved=590KB +35KB, committed=558KB +35KB)
                                    (malloc=558KB +35KB, #1699 +20)
                                    (mmap: reserved=32KB, committed=0KB)
    
        -                    Symbol (reserved=911KB +5KB, committed=911KB +5KB)
                                    (malloc=519KB +5KB, #2921 +180)
                                    (arena=392KB, #1)
    
        -           Memory Tracking (reserved=2073KB -887KB, committed=2073KB -887KB)
                                    (malloc=2073KB -887KB, #84 -210)
    
        -        Pooled Free Chunks (reserved=2624KB -15876KB, committed=2624KB -15876KB)
                                    (malloc=2624KB -15876KB)
    

下面是detail详细信息的diff样例输出:

                                  Details:

                                  [0x01195652] ChunkPool::allocate(unsigned int)+0xe2
                                                              (malloc=482KB -481KB, #8 -8)

                                  [0x01195652] ChunkPool::allocate(unsigned int)+0xe2
                                                              (malloc=2786KB -19742KB, #134 -618)

                                  [0x013bd432] CodeBlob::set_oop_maps(OopMapSet*)+0xa2
                                                              (malloc=591KB +6KB, #681 +37)

                                  [0x013c12b1] CodeBuffer::block_comment(int, char const*)+0x21                <--- [callsite address] method name + offset
                                                              (malloc=562KB +33KB, #35940 +2125)               <--- malloc'd amount, increased by 33KB #malloc count, increased by 2125

                                  [0x0145f172] ConstantPool::ConstantPool(Array<unsigned char>*)+0x62
                                                              (malloc=69KB +2KB, #610 +15)

                                  ...

                                  [0x01aa3ee2] Thread::allocate(unsigned int, bool, unsigned short)+0x122
                                                              (malloc=21KB +2KB, #13 +1)

                                  [0x01aa73ca] Thread::record_stack_base_and_size()+0xca
                                                              (mmap: reserved=7104KB +324KB, committed=7104KB +324KB)