Raunak Ramakrishnan

JVM Primer Part 2 - Debugging memory issues

This is part 2 of my series on JVM Memory management and debugging.
Read part 1 here:

In this post, we will cover symptoms of memory issues for JVM-based applications, which tools we can use to diagnose them and how we can fix them.

Symptoms

Here are a few symptoms of memory issues:

  1. Poor application performance
  2. Abnormal memory usage
  3. OutOfMemory errors (OOME)

Poor Application Performance

  1. Application not performing to expected level
  2. Long response times
  3. Dropping client requests
  4. Stuck threads
  5. Service unavailability
  6. Large gaps in timestamps in application logs

Causes of memory problems:

  1. Misconfigured memory
    • Old generation memory space is sized smaller than the live set of objects. This triggers frequent major garbage collections (GCs), resulting in longer pauses.
    • Code cache is smaller than the footprint of the generated compiled code
    • Young generation is not sized appropriately, leading to premature promotion of objects
    • PermGen / Metaspace not sized correctly, leading to full GCs
  2. Memory leaks - Unintentional retention of objects in memory spaces
    • Unintentional references to set of objects in heap
    • Not dereferencing classloader instances appropriately
    • Not releasing native resources appropriately
  3. Excessive use of finalizers
    • Objects with finalizers may delay their own GC
    • The Finalizer thread needs to invoke the finalize() method of the instances before they can be reclaimed
    • There is only 1 Finalizer thread. If it does not keep up with the rate at which objects become eligible for finalization, the JVM fails with OOME
    • Objects pending finalization are essentially accumulated garbage
    • Finalizers (Object.finalize()) have been deprecated since Java 9
  4. Explicit GC calls
    • System.gc() and diagnostic data collections can cause long pauses
    • -XX:+DisableExplicitGC can disable System.gc() calls
    • With -XX:+PrintClassHistogram, a kill -3 (SIGQUIT) signal also triggers an explicit GC
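
To make the "unintentional references" cause more concrete, here is a minimal sketch of one common shape of leak: a static map that is only ever added to, next to a bounded variant. The class, field and method names are hypothetical and the 1 KiB payload is just a stand-in for per-request data; real leaks are usually less obvious than this.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: names are illustrative, not taken from a real application.
class LeakSketch {
    // Leaky: a static map lives as long as its class loader, so every value
    // stored here stays reachable from a GC root until it is removed.
    static final Map<Integer, byte[]> LEAKY_CACHE = new HashMap<>();

    static final int MAX_ENTRIES = 100;

    // Bounded: drops the eldest entry once MAX_ENTRIES is exceeded,
    // so evicted payloads become eligible for GC.
    static final Map<Integer, byte[]> BOUNDED_CACHE =
        new LinkedHashMap<Integer, byte[]>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    // Simulates handling `count` requests that each cache a payload.
    static void handleRequests(int count) {
        for (int i = 0; i < count; i++) {
            byte[] payload = new byte[1024]; // stand-in for per-request data
            LEAKY_CACHE.put(i, payload);
            BOUNDED_CACHE.put(i, payload);
        }
    }

    public static void main(String[] args) {
        handleRequests(1_000);
        System.out.println("leaky entries:   " + LEAKY_CACHE.size());
        System.out.println("bounded entries: " + BOUNDED_CACHE.size());
    }
}
```

In a heap dump, the leaky variant would show up as a single map holding a reference chain to every payload, which is the kind of pattern the tools discussed later are meant to surface.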

OutOfMemoryError

  • Hierarchy : Throwable -> Error -> VirtualMachineError -> OutOfMemoryError (Unchecked exception)
  • Thrown when JVM runs out of space in various memory spaces or cannot proceed further with process execution. Some of the possibilities:
    • Heap space full
      • JVM already invoked full GC but could not free up space
      • Heap may be sized smaller than app footprint or app is unnecessarily holding on to some set of objects in heap
    • GC overhead limit exceeded
      • Too many GCs with very little space reclaimed
      • Application threads not getting any CPU cycles
    • Requested array size exceeds VM limit
    • PermGen space / Metaspace / compressed class space
      • Full GC invoked but unable to free space in Metaspace and application is attempting to load more classes
      • Metaspace is "unlimited" by default but can be capped with -XX:MaxMetaspaceSize. By default, 1 GB is reserved for compressed class space
      • Make sure that -Xnoclassgc is not in use as it prevents unloading of classes
    • Native memory - out of swap space / stack trace with native method
      • Native space used for Java thread stacks, loaded jars, zips, native libraries, native resources like files; mem allocated from native code
      • Unable to allocate more native memory or to create new threads or native memory leaks
      • Running 32 bit JVM on 64 bit machine puts 4 GB limit on process size
      • Position of the Java heap can put a cap on the max size of the native heap. This can be controlled with the option -XX:HeapBaseMinAddress=n, which specifies the address the Java heap should be based at
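
The hierarchy above has a practical consequence: OutOfMemoryError is an Error, so a catch (Exception e) block never sees it. The following is a small sketch, with a hypothetical class name, that provokes an OOME with an oversized array request and checks where the error sits in the hierarchy. It assumes a HotSpot-style JVM where a long array of Integer.MAX_VALUE elements cannot be allocated; the exact message (array size limit or heap space) depends on the JVM and heap settings.

```java
// Sketch only: OomeSketch is an illustrative name.
class OomeSketch {
    // Attempts an allocation that is expected to fail and returns the error,
    // or null if the allocation unexpectedly succeeded.
    static OutOfMemoryError tryHugeArray() {
        try {
            long[] huge = new long[Integer.MAX_VALUE]; // roughly 16 GiB of longs
            System.out.println("allocated " + huge.length + " elements");
            return null;
        } catch (OutOfMemoryError e) {
            // No memory was actually committed for the array, so it is safe
            // to keep running here. That is not true for OOMEs in general.
            return e;
        }
    }

    public static void main(String[] args) {
        OutOfMemoryError error = tryHugeArray();
        if (error != null) {
            System.out.println("message: " + error.getMessage());
            System.out.println("is VirtualMachineError: " + (error instanceof VirtualMachineError));
        }
        // Error and Exception are separate branches under Throwable.
        System.out.println("is an Exception: "
            + Exception.class.isAssignableFrom(OutOfMemoryError.class));
    }
}
```

Catching the error is done here purely for demonstration. In a real service an OOME usually means the process state is suspect, and collecting a heap dump and restarting is the safer response.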

CodeCache warnings

  • The JVM prints a warning saying "CodeCache is full. Compiler has been disabled."
  • No OOME is thrown when the code cache is full
  • Emergency cleanup is undertaken by the Sweeper. This may discard compiled code, and the JIT may need to perform its optimizations again
  • Ensure an appropriate code cache size using the -XX:ReservedCodeCacheSize option
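
Since a full code cache does not raise an OOME, it helps to watch its usage directly. Below is a minimal sketch that reads the code cache memory pools through the standard MemoryPoolMXBean API. The pool-name matching is an assumption: HotSpot typically reports either a single "Code Cache" pool or several "CodeHeap '...'" pools when the code cache is segmented, and other JVMs may use different names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Sketch only: CodeCacheUsage is an illustrative name.
class CodeCacheUsage {
    // Assumption: HotSpot pool names contain "Code Cache" or "CodeHeap".
    static boolean isCodeCachePool(String name) {
        return name.contains("Code Cache") || name.contains("CodeHeap");
    }

    // Returns the memory pools that look like code cache segments.
    static List<MemoryPoolMXBean> codeCachePools() {
        List<MemoryPoolMXBean> pools = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (isCodeCachePool(pool.getName())) {
                pools.add(pool);
            }
        }
        return pools;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : codeCachePools()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() is -1 when the pool has no defined maximum.
            System.out.printf("%s: used=%d bytes, max=%d bytes%n",
                pool.getName(), usage.getUsed(), usage.getMax());
        }
    }
}
```

If the used value keeps approaching the maximum, raising -XX:ReservedCodeCacheSize is the knob mentioned above.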

Direct Buffer Memory

  • ByteBuffer.allocateDirect(N) : Direct buffers which are garbage collected using phantom references and a reference queue
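  • The limit can be set with -XX:MaxDirectMemorySize=n. When it is not set, the JVM chooses the limit itself (in HotSpot, roughly the maximum heap size) rather than leaving it unlimited
  • Used by Java NIO. I/O on a heap ByteBuffer goes through a temporary direct ByteBuffer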

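
Direct buffer memory lives outside the Java heap, so heap metrics will not show it. A minimal sketch, using the standard BufferPoolMXBean API, of how one might observe it from inside the process follows. The class name is hypothetical, and the pool name "direct" is what HotSpot reports; the helper returns -1 if no such pool is exposed.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Sketch only: DirectBufferUsage is an illustrative name.
class DirectBufferUsage {
    // Bytes currently held by direct buffers, or -1 if the JVM
    // does not expose a buffer pool named "direct".
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // 8 MiB allocated off-heap; it is released only after the
        // ByteBuffer object itself becomes unreachable and is collected.
        ByteBuffer buffer = ByteBuffer.allocateDirect(8 * 1024 * 1024);
        long after = directMemoryUsed();
        System.out.println("direct memory before: " + before + " bytes");
        System.out.println("direct memory after:  " + after + " bytes");
        System.out.println("buffer is direct: " + buffer.isDirect());
    }
}
```

Because the native memory is freed only when the small on-heap ByteBuffer object is collected, a heap that rarely needs GC can hold on to a lot of direct memory.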
Diagnostic Data, Data Collection and Analysis Tools

Troubleshooting Memory leaks

  1. Confirm memory leak

    • Monitor heap usage over time
    • If full GCs are unable to reclaim space in OldGen, it could be a config issue
    • Heap size may be too small -> Increase heap size and monitor! If the issue persists, it could be a memory leak
    • -XX:GCTimeLimit=n sets an upper limit on the time spent in GC, as a percentage of total time. Default is 98%
    • -XX:GCHeapFreeLimit=n sets a lower limit on the space that should be freed by a GC, as a percentage of max heap. Default is 2%
    • OutOfMemoryError is thrown after a full GC if previous 5 consecutive GCs were not able to keep the GC cost below GCTimeLimit or free up at least GCHeapFreeLimit space
    • PermGen/Metaspace may be too small if frequent full GCs do not reclaim any space
  2. Diagnostic data and analysis

    • GC logs are helpful for determining heap requirements, finding out excessive GCs and long GC pauses and in configuration of memory spaces
      • For Java 9+ with G1: -Xlog:gc*,gc+phases=debug:file=gc.log. For other collectors: -Xlog:gc*:file=gc.log. For older JVMs: -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps, -XX:+PrintGCDateStamps, -Xloggc:gc.log
      • For checking metaspace, -verbose:class or -XX:+TraceClassLoading, -XX:+TraceClassUnloading (on Java 9+, -Xlog:class+load and -Xlog:class+unload)
      • We can analyse logs through manual inspection, GCViewer, GCHisto, gceasy.io
    • Heap dumps help determine unexpected memory growth and memory leaks.
      • We can take heap dumps in the following ways:
        • jcmd pid GC.heap_dump heapdump.dmp
        • jmap -dump:format=b,file=snapshot.jmap pid
        • JConsole or Java Mission Control using MBean HotSpotDiagnostic
        • JVM option heap dump on OOM error : -XX:+HeapDumpOnOutOfMemoryError . Frequent full GCs can delay collection of heap dump and restart of the process
      • Eclipse Memory Analyzer Tool (MAT) shows leak suspects, histograms, unreachable objects, duplicate classes, reference chains to GC roots, allows using OQL to explore heap dumps.
      • JOverflow for JMC, Java VisualVM and YourKit (a commercial profiler) can all be used to take or analyse heap dumps.
    • Heap histograms - quick view of objects in heap
      • Collect using:
        • -XX:+PrintClassHistogram and SIGQUIT on Posix and SIGBREAK on Windows
        • jcmd pid GC.class_histogram filename=histo
        • jmap -histo pid for a live process, or jmap -histo java_executable core_file for a core file
        • jhsdb jmap (Java 9)
    • Java flight recordings - unexpected memory growth and memory leaks, GC events
      • Enable Heap Statistics. Can introduce additional performance overhead
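      • To create a flight recording: -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:StartFlightRecording=delay=20s,duration=60s,name=Rec,filename=lol.jfr,settings=profile. On JDK 11+, where Flight Recorder is open source, the -XX:StartFlightRecording option alone is enough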
      • Flight recordings can find out type of leaking objects but you need heap dumps to find out what is causing the objects to leak
    • Finalizers
      • Collect data using JConsole, jmap
      • Analyse using Eclipse MAT / Visual VM using heap dumps
    • Native Memory
      • Native Memory Tracking (NMT) output - tracks native memory used internally by the JVM, not by external libraries. Start the JVM with -XX:NativeMemoryTracking=summary (or =detail) and read the report with jcmd pid VM.native_memory
      • pmap, libumem, valgrind, core files
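
The HotSpotDiagnostic MBean mentioned above can also be called from inside the application, which is handy when attaching jcmd or jmap is not an option. This is a minimal sketch under a few assumptions: it runs on a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available, the class and file names are made up for illustration, and the dump goes to a fresh temporary directory because dumpHeap refuses to overwrite an existing file (recent JDKs also expect a .hprof suffix).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: HeapDumpSketch and snapshot.hprof are illustrative names.
class HeapDumpSketch {
    // Current heap usage in bytes, as reported by the platform MemoryMXBean.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Writes a heap dump into a new temporary directory and returns its path.
    static Path dumpToTempDir() {
        try {
            Path target = Files.createTempDirectory("heapdump").resolve("snapshot.hprof");
            HotSpotDiagnosticMXBean diagnostic =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // live=true restricts the dump to reachable objects, which
            // typically triggers a full GC before the dump is written.
            diagnostic.dumpHeap(target.toString(), true);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("heap used before dump: " + heapUsedBytes() + " bytes");
        Path dump = dumpToTempDir();
        System.out.println("wrote " + dump + " (" + Files.size(dump) + " bytes)");
    }
}
```

The resulting .hprof file can be opened in Eclipse MAT or VisualVM like any dump produced by jcmd or jmap. Keep in mind that dumping pauses the application, so on large heaps this is best reserved for moments when a pause is acceptable.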

Conclusion

In this series, we have taken a look at how the JVM manages memory and how the garbage collection process works. We have also gone through how to diagnose memory issues, which tools to use to collect and analyze diagnostic information and some JVM options which can affect application performance.
