Darius Juodokas

Posted on Apr 9, 2021 • Edited on Jun 8, 2022

JVM. Memory management

#jvm #java #middleware #memory

Indirect memory access

If you are familiar with low-level programming languages, like C or C++ or ASM, you might recall how memory management is done there. Basically, your application code has to estimate how much memory it will require for a data structure, ask the OS to allocate that memory as a contiguous memory block, access "cells" of that block separately (prepare/read/write) and, when no longer needed, release (free) that memory block explicitly. If you forget to release that memory block, you have a memory leak, which will eventually consume all the RAM you have.

On the JVM, life is a lot easier. When writing Java code, you only have to worry about the WHATs.

I want to create WHAT? A String. (I don't care HOW)
I want to copy WHAT? A List of Objects. (I don't care HOW)
I want to pass a WHAT to this function? An array. (I don't care HOW)
...

The JVM takes care of all of the HOWs (except for arrays - you still need to know how many items (not bytes though) they are to contain). It will estimate how many bytes you require and allocate those memory blocks for you. Will it allocate contiguous memory blocks? You don't know. Will it allocate memory in a copy-on-write fashion? You don't know. And you don't need to worry about those details. Worry about WHAT your code will do next instead.

Memory pools

JVM has its own understanding of what the memory layout should look like. JVM divides all the memory it requires into pools. Note the highlighted "requires". It does not take up all the memory or most of the memory. It only consumes the memory it requires. And demands for each pool are different. Some are accessed more often, others - less often. Some change their contents completely several times per second, others - keep the same content until the application exits. Some contain gigabytes of memory, while others only a few megabytes.

At runtime, your code will ask the JVM to access all those pools for different reasons: calling a method (function), creating a new object, running the same algorithm over and over, some pools' saturation reached 100%, and so on. It's a somewhat complex system, doing its best to keep your code comfortable, not worrying about HOW (or WHEN) to get or release memory.

Memory pools are divided into 2 groups: Heap and non-Heap.

Heap is where all your application objects are stored. It is the larger of the two and it's usually the Heap that causes you memory trouble.
Non-Heap memory is everything else. You can think of Heap as the stage in the theatre, and non-Heap (also called off-heap) as all the maintenance rooms and corridors, staircases, cupboards and other places you need to prepare for the show and keep it going.
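
The Heap / non-Heap split can be observed from inside a running application. Below is a minimal sketch using the standard java.lang.management API; the class name HeapVsNonHeap and the helper mb() are made up for this illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapVsNonHeap {

    // Formats a byte count as whole megabytes; MemoryUsage reports -1 when a value is undefined.
    static String mb(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024 * 1024)) + " MB";
    }

    // One-line summary of a MemoryUsage snapshot.
    static String describe(MemoryUsage usage) {
        return "used=" + mb(usage.getUsed())
                + ", committed=" + mb(usage.getCommitted())
                + ", max=" + mb(usage.getMax());
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Heap:     " + describe(memory.getHeapMemoryUsage()));
        System.out.println("Non-Heap: " + describe(memory.getNonHeapMemoryUsage()));
    }
}
```

The non-Heap figure reported here only covers the pools the JVM exposes through this API, so it should be read as a lower bound of the real off-heap footprint.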

Memory pools' layout

Consider the image below. It displays the principal layout of JVM memory pools.

Labels hint in what cases data is allocated in each pool.

Heap

Heap is where your data structures (objects) are stored. It is usually the largest portion of the JVM memory. Heap itself is divided into 2 main sections:

Young generation, which is also divided into
- Eden
- survivor0
- survivor1
Old generation (also called Tenured space)
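
To see how these sections are named on a particular JVM, the memory pools can be listed at runtime. This is a small sketch, assuming a HotSpot-like JVM; the class name ListMemoryPools is hypothetical, and the pool names printed depend on the garbage collector in use (for example, G1 reports "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class ListMemoryPools {

    // Returns the names of all memory pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

Some collectors report a single survivor pool even though two survivor regions exist internally, so the list may look shorter than the layout described here.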

YoungGen

Eden

Eden is where objects are initially created. They stay in Eden for some time - for as long as there is room in this pool - and then the decision is made to either keep them or discard them releasing their memory. If the object is discarded - poof - the memory is no longer allocated and it's freed (in a way). If, however, the object is still used in your java code, it will be tagged with the "survivor" label and moved over (promoted) to the next memory pool - Survivors.

Survivors

There are 2 survivor regions: s0 and s1. One of them is usually empty, while another one is being filled up, and when it fills up, they switch who is empty and who is not. Suppose objects from Eden were promoted to S1. All the other objects that survive Eden will come to stay at S1 for as long as there is any room in S1 pool. When S1 fills up, an inspector comes and checks each and every object in the S1, labelling objects that are no longer required in your java code. All the objects that got tagged are immediately released. The remaining objects get a medal for surviving the S1 and are moved over to the S0 region. In the meantime, all the objects that survive Eden are now promoted to S0 too. When S0 becomes full, the same inspection takes place, and then survivors are moved to S1 with one more medal. When a survivor gets enough medals, it's ranked as a long-living survivor and it is promoted to the ultimate memory pool, where only long-living objects are stored: the OldGen.
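
The "medals" idea can be sketched as a toy model. This is not how the JVM implements it: the class TenuringToyModel, the Obj type and the reachable flag are invented for illustration, and Eden and both survivors are lumped into one list. In HotSpot the real threshold is governed by -XX:MaxTenuringThreshold.

```java
import java.util.ArrayList;
import java.util.List;

class TenuringToyModel {

    // Hypothetical stand-in for a heap object: we only track whether the
    // application still references it and how many collections it survived.
    static final class Obj {
        final String name;
        boolean reachable = true;
        int age = 0; // the "medals" from the text

        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> young = new ArrayList<>(); // Eden + survivors, simplified
    final List<Obj> old = new ArrayList<>();   // OldGen

    TenuringToyModel(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    // One young-generation collection: unreachable objects are dropped,
    // survivors get one more medal, and old enough survivors are promoted.
    void minorCollection() {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!o.reachable) {
                continue; // garbage: simply not copied anywhere
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);
            } else {
                survivors.add(o);
            }
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        TenuringToyModel heap = new TenuringToyModel(3);
        heap.allocate("cache");                  // stays referenced
        heap.allocate("temp").reachable = false; // becomes garbage right away
        for (int i = 0; i < 3; i++) {
            heap.minorCollection();
        }
        System.out.println("young=" + heap.young.size() + ", old=" + heap.old.size());
        // prints "young=0, old=1"
    }
}
```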

OldGen

OldGen is usually the largest pool in the Heap. It contains long-living objects, veterans, if you will, that have been serving for a long time. They are so-called mature objects. As more and more objects mature, the OldGen will become more and more populated.

Non-Heap

This is the mysterious part of the JVM memory. It's difficult to monitor it, it's small and people usually don't even know it exists, not to mention knowing what it's used for.

PermGen / Metaspace

PermGen (Permanent Generation) is a region that was used up to java7. In java8 it was removed and replaced by Metaspace, which got its limits lifted. PermGen stores classes' metadata (hence the name Metaspace). Not the java.lang.Class instances (they are stored in Heap, along with other instances), but just the metadata. Class metadata describes what the class is called, what its memory layout is, and what methods and fields it has. In the early PermGen days everything static was stored here as well; in current HotSpot versions the values of static fields live in the Heap, next to the Class instance. When a class loader is created, it gets a memory region allocated in PermGen/Metaspace for that classloader's classes' metadata. Each classloader has its own region, which is why classloaders do not share classes.
PermGen is an area that is fixed in size. You have to specifically tell the JVM (unless you are satisfied with the defaults) in advance how large a PermGen you want, and, once the limit is reached, an OutOfMemoryError (java.lang.OutOfMemoryError: PermGen space) is thrown. Also, it was meant to be permanent - classes, once loaded in, were seldom unloaded. This became a problem when more and more applications, libraries and frameworks began to rely on bytecode manipulation. Since Java8 this region has been replaced by Metaspace and the limit was lifted. Metaspace can grow as large as it likes (or for as long as there is memory on the host platform). This growth can be limited with JVM parameters.

JIT Code Cache

As the JVM runs, it keeps on executing the same parts of the code over and over again - some parts of the code are hotter than others (hot spots, hence the name HotSpot). Over time, the JVM notes down the sets of instructions that keep on recurring and are a good fit for optimization. These sets of instructions might be compiled into native machine code, and native code no longer runs on the bytecode interpreter - it now runs directly on the CPU. That is a great performance boost, speeding those methods up by a factor of tens or hundreds or, in some cases, even more. The compilation is done by the JIT (Just-in-Time) compiler, and the compiled machine code is stored in the JIT code cache region.
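
A rough way to watch the JIT at work is to run a hot method many times and then ask the JVM how much time it spent compiling. The sketch below uses the standard CompilationMXBean; the class JitWarmup and the method sumOfSquares are made up for this example, and the iteration counts are arbitrary.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitWarmup {

    // A deliberately hot method: called often enough, it becomes a JIT candidate.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long result = 0;
        for (int i = 0; i < 50_000; i++) {
            result += sumOfSquares(1_000);
        }
        System.out.println("result = " + result);

        // The bean is null on JVMs that have no compilation system at all.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " spent "
                    + jit.getTotalCompilationTime() + " ms compiling so far");
        } else {
            System.out.println("Compilation time is not reported by this JVM");
        }
    }
}
```

Running the same class with -Xint (interpreter only) is one way to compare timings, although a loop this small is not a proper benchmark.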

GC

This is a tiny region that the Garbage Collector uses for its own needs.

Symbol

This space contains field names, method signatures, cached numeric values and interned (cached) Strings. Numeric compile-time literals (5, 50L, 3.14D, etc.) are cached and reused by the JVM in order to preserve memory (literals are immutable, remember? They are static final). A similar thing happens with Strings too. Moreover, Strings can be interned manually, at runtime: if the String.intern() method is called, it will be cached too. Next time a string with the same contents is referenced, the interned String instance will be used instead of creating a new String object. If this region starts growing too large, it might mean that your java code is interning too many different strings.
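
String interning is easy to observe by comparing references instead of contents. A minimal sketch (the class InternDemo and the helper sameInstance are hypothetical names):

```java
class InternDemo {

    // True only when both references point at the very same String object.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "jvm-memory";           // compile-time constant, interned automatically
        String built = new String("jvm-memory"); // explicitly creates a second object
        String interned = built.intern();        // returns the cached instance

        System.out.println(sameInstance(literal, built));    // false
        System.out.println(sameInstance(literal, interned)); // true
        System.out.println(literal.equals(built));           // true
    }
}
```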

Shared class space

This is a small memory region that stores .jsa files - java classes' data files, prepared for fast loading. This region is used to speed up JVM startup, particularly the part of the startup where system libraries are loaded. It doesn't have much of an impact during runtime.

Compiler

This region is used by the JIT Compiler. It's a working area for the compiler - it does not store the compiled machine code.

Logging

Unfortunately, Java docs are not very wordy when it comes to this region. They only say that this region is used for logging. We can only assume it's used actively during runtime, but it's unclear what problems can occur when this region is overutilized.

Arguments

This is also a tiny region that stores the command-line arguments of the java (java.exe) command. This memory does not play any significant role at runtime, as it's mainly populated at boot time.

Internal

Quoting the Java docs: "Memory that does not fit the previous categories, such as the memory used by the command line parser, JVMTI, properties, and so on"

Other

Quoting the java docs: "Memory not covered by another category"

Thread (JVM Stack)

This region can potentially cause problems, especially in heavy-duty applications. This area contains threads' meta info. Whenever a thread calls a method, the JVM pushes a stack frame for the called method onto that thread's Stack, located in the Thread (JVM Stack) area. References to all the passed method parameters are also pushed along. The more threads you have and the deeper their stacks are, the more memory they will require in this region. Stack size can be limited per thread with JVM parameters, but there is no JVM parameter to limit the number of threads. This effectively means that uncontrolled proliferation of threads might exhaust your system memory (it's off-heap, remember?).
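
The relation between stack size and stack depth can be probed with a small experiment: recurse until the stack overflows, on threads created with different requested stack sizes. This is a sketch only; the class StackDepthProbe is made up, and the stack size passed to the Thread constructor is a hint that some platforms round up or ignore.

```java
class StackDepthProbe {

    private int depth;

    // Unbounded recursion: every call pushes one more frame onto the thread's stack.
    private void recurse() {
        depth++;
        recurse();
    }

    // Runs the recursion on a new thread with the requested stack size and
    // returns how many frames fit before StackOverflowError was thrown.
    static int maxDepth(long stackSizeBytes) {
        StackDepthProbe probe = new StackDepthProbe();
        Thread t = new Thread(null, () -> {
            try {
                probe.recurse();
            } catch (StackOverflowError expected) {
                // The stack is full; probe.depth holds the number of calls made.
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: " + maxDepth(256 * 1024) + " frames");
        System.out.println("  4 MB stack: " + maxDepth(4L * 1024 * 1024) + " frames");
    }
}
```

The exact numbers vary between runs, JVM versions and JIT states, so they are only useful as a relative comparison.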

NMT

This is a tiny region used by java's NativeMemoryTracking mechanism for its internal needs. NMT is a feature you want enabled if you have memory usage concerns. It's a feature of the JVM that allows us to see what is actually happening off-heap, as there are no other ways to reliably observe off-heap memory usage. However, enabling NMT adds a 5-10% performance penalty, so it is not something you might want to use in a live production system on a daily basis.

Native allocations

If off-heap is a memory that is not stored in Heap regions (and not restricted by Heap limits), the Native Allocations region can be seen as off-off-heap. It is a part of the memory that the JVM does not manage at all. At all. There are very few ways to reach this memory from your Java code (DirectByteBuffer or ByteBuffer.allocateDirect()). This part of memory is extensively utilized when developing hybrid Java applications, using JNI - java applications, that are also using components written in C/C++. This is often the case in high-throughput java applications and Android development, where some components are developed in native code to boost performance.
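
A direct buffer is the simplest way to touch this memory from plain Java. The sketch below allocates one and reads the JVM's "direct" buffer pool statistics before and after; the class DirectBufferDemo and its helpers are hypothetical names, and the pool name "direct" is what HotSpot-based JVMs report.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectBufferDemo {

    // Allocates a buffer whose contents live in native memory, outside the Heap.
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Bytes currently held by the "direct" buffer pool, or -1 if the JVM does not report it.
    static long directPoolBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directPoolBytes();
        ByteBuffer offHeap = allocateOffHeap(16 * 1024 * 1024);    // native memory
        ByteBuffer onHeap = ByteBuffer.allocate(16 * 1024 * 1024); // ordinary byte[] on the Heap
        long after = directPoolBytes();

        System.out.println("offHeap.isDirect() = " + offHeap.isDirect());
        System.out.println("onHeap.isDirect()  = " + onHeap.isDirect());
        System.out.println("direct pool grew by " + (after - before) + " bytes");
    }
}
```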

Memory pools' sizes and limits

Heap

Default MAX size
- jdk1.1: 16MB
- jdk1.2: 64MB
- jdk1.5: Math.min(1GB, totalRam/4)
- jdk6u18:
  - if total RAM is <192MB: totalRam/2
  - if total RAM is >192MB: totalRam/4 (some systems: Math.max(256MB, totalRam/4))
- jdk11 [up until jdk16]: totalRam/4
Configuration (algorithm, verification: java -XX:+PrintFlagsFinal -version)
- -Xmx (e.g. -Xmx10g) can be used to set custom maximum heap size
- -XX:MaxRAMPercentage (e.g. -XX:MaxRAMPercentage=75) (since jdk8u191; default: 25) can be used to adjust max Heap size in percent-of-total-ram manner
- -XX:MaxRAMFraction (e.g. -XX:MaxRAMFraction=2) (since jdk8u131 up to jdk8u191; default: 4) is another way to configure what part of total RAM the Heap can allocate. It accepts whole numbers only. It's basically an x in a formula: maxHeap = totalRam / x. On a machine with totalRam=16GB, MaxRAMFraction=1 is equal to setting -Xmx16g, MaxRAMFraction=2 is equal to -Xmx8g, MaxRAMFraction=8 is equal to -Xmx2g, and so on.
- -XX:MaxRAM (e.g. -XX:MaxRAM=1073741824): normally the JVM asks the OS (or cgroups) what the totalRam on the machine is. MaxRAM can override this - with this flag you can make the JVM think there are 1073741824 bytes (in the given example) available in the system. The JVM will use this value to calculate memory pools' sizes dynamically. If -Xmx is passed, MaxRAM has no effect.
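
The two RAM-relative formulas can be worked through as plain arithmetic. This is only a sketch of the math described in the list above (class and method names are made up); the real JVM also aligns and clamps the result, so actual values may differ slightly.

```java
class MaxHeapMath {

    static final long GB = 1024L * 1024 * 1024;

    // maxHeap = totalRam / MaxRAMFraction
    static long byFraction(long totalRamBytes, int maxRamFraction) {
        return totalRamBytes / maxRamFraction;
    }

    // maxHeap = totalRam * MaxRAMPercentage / 100
    static long byPercentage(long totalRamBytes, double maxRamPercentage) {
        return (long) (totalRamBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long totalRam = 16 * GB; // assumed machine size, matching the example in the text

        System.out.println("MaxRAMFraction=2    -> " + byFraction(totalRam, 2) / GB + " GB");
        System.out.println("MaxRAMFraction=8    -> " + byFraction(totalRam, 8) / GB + " GB");
        System.out.println("MaxRAMPercentage=75 -> " + byPercentage(totalRam, 75) / GB + " GB");

        // What this particular JVM actually settled on:
        System.out.println("Runtime.maxMemory() = "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
    }
}
```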

YoungGen

Some configurations might not work out of the box, because the Adaptive Size Policy (ASP) might be overriding them. To disable ASP use -XX:-UseAdaptiveSizePolicy.

Default MAX size
- NewRatio=2 (2/3 of Heap is OldGen, 1/3 is YoungGen)
Configuration
- -Xmn (e.g. -Xmn2g) sets the size (both, min and max) of the YoungGen to some particular value.
- -XX:NewRatio (e.g. -XX:NewRatio=3) defines the youngGen:oldGen ratio. For example, setting -XX:NewRatio=3 means that the ratio between the young and tenured generation is 1:3. In other words, the YoungGen (combined size of the eden and survivor spaces) will be 1/4 of the total heap size. This parameter is ignored if either NewSize or MaxNewSize is used.
- -XX:MaxNewSize (e.g. -XX:MaxNewSize=100m) sets the maximum size of the YoungGen.

Eden and Survivors

Default MAX size
- SurvivorRatio=8
Configuration
- -XX:SurvivorRatio (e.g. -XX:SurvivorRatio=6) defines the ratio between eden and a single survivor space. In this example, the survivor:eden ratio is 1:6. In other words, each survivor space will be 1/6 the size of eden, and thus 1/8 the size of the young generation (not one-seventh, because there are two survivor spaces). Survivor size can be calculated with this formula: singleSurvivorSize = youngGenSize / (SurvivorRatio + 2)
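
Chaining the NewRatio and SurvivorRatio formulas gives the size of every Heap pool. The sketch below just does the arithmetic in megabytes; the class YoungGenMath is a made-up name, the 3000 MB heap is an arbitrary example, and the real JVM rounds these sizes to its own alignment.

```java
class YoungGenMath {

    // NewRatio is old:young, so young = heap / (NewRatio + 1).
    static long youngGen(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // SurvivorRatio is eden:survivor and there are two survivors,
    // so one survivor = young / (SurvivorRatio + 2).
    static long singleSurvivor(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    // Eden is whatever is left of the young generation after both survivors.
    static long eden(long young, int survivorRatio) {
        return young - 2 * singleSurvivor(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 3000;  // example heap size in MB
        int newRatio = 2;    // default
        int survivorRatio = 8; // default

        long young = youngGen(heapMb, newRatio);
        System.out.println("OldGen   = " + (heapMb - young) + " MB");                  // 2000 MB
        System.out.println("YoungGen = " + young + " MB");                             // 1000 MB
        System.out.println("Eden     = " + eden(young, survivorRatio) + " MB");        // 800 MB
        System.out.println("Survivor = " + singleSurvivor(young, survivorRatio) + " MB (x2)"); // 100 MB
    }
}
```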

Off-Heap

Most of the regions are uncapped, meaning they can grow without any limits. Usually, it's not a problem, as most of those regions are used by internal JVM mechanisms and the memory is very unlikely to leak. However, the Native Memory pool, used by JNI and JNA as well as by direct buffers in the Java code, is more likely to cause memory leaks.

PermGen

Up to jdk7 (PermGen was removed in jdk8)

Default Max Size
- MaxPermSize=64m on 32-bit systems, and 85.33m (85 and 1/3 MB) on 64-bit machines
Configuration
- -XX:MaxPermSize (e.g. -XX:MaxPermSize=2g) sets the max size of the PermGen memory pool

Metaspace

From jdk8 onwards

Default Max Size
- unlimited
Configuration
- -XX:MaxMetaspaceSize (e.g. -XX:MaxMetaspaceSize=500m) sets the max size of the Metaspace region.

JIT Code Cache

Default Max Size
- jdk1.7 and below: ReservedCodeCacheSize=48MB
- jdk1.8 and above: ReservedCodeCacheSize=240MB with TieredCompilation enabled (by default). When -XX:-TieredCompilation (disabled), ReservedCodeCacheSize drops to 48MB
Configuration
- -XX:ReservedCodeCacheSize (e.g. -XX:ReservedCodeCacheSize=100m) can set max size of the JIT code cache.
- -XX:UseCodeCacheFlushing (i.e. -XX:+UseCodeCacheFlushing or -XX:-UseCodeCacheFlushing) enables or disables JIT cache flushing when certain conditions are met (a full cache is one of them).

GC

Uncapped

Symbol

Uncapped

Shared Class Space

Uncapped

Compiler

Uncapped

Logging

Uncapped

Arguments

Uncapped

Internal

Uncapped

Threads' stacks

Default Max Size
- -Xss1024k on 64-bit VMs and -Xss320k on 32-bit VMs (since jdk1.6)
Configuration
- -Xss (e.g. -Xss200m) or -XX:ThreadStackSize (e.g. -XX:ThreadStackSize=204800; note that this value is given in kilobytes) will limit the stack size of a single thread. However, avoid using ThreadStackSize. Setting ThreadStackSize to 0 will make the VM use the system (OS) defaults. There is no easy way to calculate how large a stack you may need, so you may want to adjust Xss when the default is not enough.
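
Besides java -XX:+PrintFlagsFinal, the effective values of these limits can be read from inside a running application. The sketch below assumes a HotSpot-based JVM, because it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; the class EffectiveFlags and the "n/a" fallback are choices made for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class EffectiveFlags {

    // Returns the current value of a HotSpot -XX flag, or "n/a" if this JVM
    // does not know the flag or does not expose the HotSpot diagnostic bean.
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return hotspot.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "n/a";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "MaxHeapSize", "NewRatio", "SurvivorRatio",
            "MaxMetaspaceSize", "ReservedCodeCacheSize", "ThreadStackSize"
        };
        for (String name : names) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```

Values are printed in the units the JVM stores them in, which are not the same for every flag, so compare them against the PrintFlagsFinal output before drawing conclusions.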

Monitoring

Heap

Heap is rather easy to monitor. Heap usage is tracked closely by default, you just need tools to access that information. See here for more info.

jmap -heap <pid> or jhsdb jmap --heap --pid <pid> displays sizes of each Heap region and PermGen.
jmap -histo <pid> (warning: might slow down the JVM for some time; the output might be lengthy, so dump it to a file) takes a histogram of all the classes in the Heap. It's like a summary of a HeapDump. Here you can find all the classes, the number of their instances and how much heap each utilizes.
jstat -gc -t <pid> 1000 will print usage of each Heap region every 1000 milliseconds (i.e. every second) - useful for live monitoring.

Off-Heap

It's difficult to monitor off-heap memory. It's advised to monitor this 'dark side of the JVM' only when required, i.e. when there are memory-related problems you are trying to debug. That is because NMT, the method that gives you the best visibility of off-heap memory, adds 2 words to each malloc(), which ends up as an approximately 5-10% overall performance penalty.

NMT can be enabled upon JVM startup by passing either the -XX:NativeMemoryTracking=summary or the -XX:NativeMemoryTracking=detail parameter to the java program. Unless you need very detailed information, summary should suffice (the amount of detail output might be unnecessarily overwhelming). When the JVM is started with NMT enabled, you can use jcmd to

see current off-heap statistics' summary (jcmd <pid> VM.native_memory summary),
see detailed current off-heap statistics (jcmd <pid> VM.native_memory detail),
establish a baseline of NMT records (jcmd <pid> VM.native_memory baseline),
track diffs over periods of time (jcmd <pid> VM.native_memory detail.diff).
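
The same NMT report can also be requested from inside the application through JMX. The sketch below is an assumption-heavy illustration: it relies on the HotSpot-specific DiagnosticCommand MBean and its vmNativeMemory operation (the JMX counterpart of jcmd's VM.native_memory), and the class name NmtSummary is made up. If the JVM was started without -XX:NativeMemoryTracking, the report only says that tracking is not enabled.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class NmtSummary {

    // Asks the running JVM for the report that `jcmd <pid> VM.native_memory <mode>` would print.
    // mode is e.g. "summary", "detail" or "baseline".
    static String nativeMemory(String mode) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName diag = new ObjectName("com.sun.management:type=DiagnosticCommand");
            Object report = server.invoke(
                    diag,
                    "vmNativeMemory",                            // HotSpot-specific operation name
                    new Object[] { new String[] { mode } },      // arguments, as on the jcmd command line
                    new String[] { String[].class.getName() });
            return String.valueOf(report);
        } catch (Exception e) {
            // Non-HotSpot JVMs (or restricted environments) may not offer this MBean.
            return "NMT report unavailable: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(nativeMemory("summary"));
    }
}
```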

Top comments (2)

soylomass • Sep 14 '21 • Edited

Hi! Congrats for your article, it's the only one I found that explains Java memory usage this clearly and detailedly.

Is there any way for multiple JVMs using the same AppCDS .jsa to only load it to RAM once? I need to run multiple equal .jars and I notice they all load the .jsa to memory (using NMT i see that the memory that was previously used by "Class" is now used by "Shared class space"), making faster start-up the only benefit of class sharing.

Is there a way to them share classes without having each of them load them to RAM separately?

Thanks in advance.

PS: I use OpenJDK 16