Compressed oops in the HotSpot JVM
What's an oop, and why should oops be compressed?
An "oop" in HotSpot parlance is a managed pointer to an object. An oop is normally the same size as a native machine pointer, which means 64 bits on an LP64 system. On an ILP32 system, the maximum heap size is about 4 GB, which is not enough for many applications. On an LP64 system, though, the heap needed by a given application can be almost twice as big, because managed pointers are twice as wide. Memory is fairly cheap, but these days memory bandwidth and cache are in short supply, so nearly doubling the size of the heap just to get past the 4 GB limit is painful.
(Additionally, on x86 chips, the ILP32 mode provides half the usable registers that the LP64 mode does. SPARC is not affected this way; RISC chips start out with lots of registers and just widen them for LP64 mode.)
Compressed oops represent managed pointers (in many but not all places in the JVM) as 32-bit values that must be scaled by a factor of 8 and added to a 64-bit base address to find the object they refer to. This allows applications to address up to four billion objects (not bytes), or a heap size of up to about 32 GB (2^32 objects at 8-byte granularity). At the same time, data structure compactness is competitive with ILP32 mode.
We use the term decode to express the operation by which a 32-bit compressed oop is converted into a 64-bit native address into the managed heap. The inverse operation is encoding.
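The decode and encode operations can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not HotSpot's actual code: the names NarrowOop, Address, kShift, decode and encode are hypothetical, and the sketch ignores how a null reference is represented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; HotSpot's real code differs in detail.
using NarrowOop = std::uint32_t;  // 32-bit compressed oop
using Address   = std::uint64_t;  // 64-bit native address

constexpr unsigned kShift = 3;                        // scale factor of 8 == 1 << 3
constexpr Address  kAlignment = Address{1} << kShift; // objects sit on 8-byte boundaries
// Largest range reachable from the base: 2^32 slots of 8 bytes each.
constexpr Address  kMaxHeapBytes = (Address{1} << 32) << kShift;

// Decode: scale the 32-bit value by 8 and add the 64-bit heap base.
constexpr Address decode(Address heap_base, NarrowOop narrow) {
    return heap_base + (static_cast<Address>(narrow) << kShift);
}

// Encode: the inverse of decode. The address must lie inside the
// encodable range and be 8-byte aligned, otherwise bits would be lost.
inline NarrowOop encode(Address heap_base, Address addr) {
    assert(addr >= heap_base);
    const Address offset = addr - heap_base;
    assert(offset % kAlignment == 0);
    assert(offset < kMaxHeapBytes);
    return static_cast<NarrowOop>(offset >> kShift);
}
```

With a heap base of 0x100000000, the address 0x100000040 encodes to the 32-bit value 8, and decoding 8 gives back 0x100000040. The constant kMaxHeapBytes works out to 32 GB, which is where the heap size limit quoted above comes from.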
Which oops are compressed?
In an ILP32-mode JVM, or if the UseCompressedOops flag is turned off in LP64 mode, all oops are the native machine word size.
If UseCompressedOops is true, the following oops in the heap will be compressed:
- the _klass field of every object
- every instance field
- every element of an oop array (objArray)
- in a constant pool... (what is the rule here??)
- (what else is compressed??)
The following oops in the heap are never compressed:
- type profile information (methodDataOops, "MDOs")
- (what else is native??)
In the interpreter, oops are never compressed. These include JVM locals and stack elements, outgoing call arguments, and return values. The interpreter eagerly decodes oops loaded from the heap, and encodes them before storing them to the heap.
Likewise, method calling sequences, either interpreted or compiled, do not deal with compressed oops.
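The interpreter's eager decode-on-load and encode-on-store behaviour can be illustrated with a toy model. Everything here is an assumption made for the sketch: ToyHeap, its byte-buffer layout, and the function names are hypothetical and are not taken from HotSpot. The heap stores reference fields as 32-bit values, while the caller (standing in for the interpreter's locals and stack) only ever sees full 64-bit addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using NarrowOop = std::uint32_t;  // form stored in heap fields
using Address   = std::uint64_t;  // form held in locals and on the stack

// Toy heap (hypothetical): a byte buffer whose first byte sits at address `base`.
struct ToyHeap {
    Address base;
    std::vector<std::uint8_t> bytes;

    Address decode(NarrowOop n) const {
        return base + (static_cast<Address>(n) << 3);
    }
    NarrowOop encode(Address a) const {
        assert((a - base) % 8 == 0);  // only 8-byte-aligned addresses are encodable
        return static_cast<NarrowOop>((a - base) >> 3);
    }

    // Load a reference field: read 32 bits from the heap and decode at once,
    // so the caller never holds a compressed value.
    Address load_ref_field(Address obj, std::uint64_t field_offset) const {
        NarrowOop n;
        std::memcpy(&n, &bytes[obj - base + field_offset], sizeof n);
        return decode(n);
    }

    // Store a reference field: encode the full address just before writing.
    void store_ref_field(Address obj, std::uint64_t field_offset, Address value) {
        const NarrowOop n = encode(value);
        std::memcpy(&bytes[obj - base + field_offset], &n, sizeof n);
    }
};
```

In this model, storing the address base + 40 into a field writes the 32-bit value 5 into the buffer, and loading the same field returns base + 40 again.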
In compiled code, oops are compressed or not according to the outcome of various optimizations. Optimized code may succeed in moving a compressed oop from one location in the managed heap to another without ever decoding it. Likewise, if the chip (e.g., x86) supports addressing modes that can perform the decode operation, compressed oops might not be decoded explicitly even when they are used to address object fields or array elements.
Therefore, the following structures in compiled code can refer to either compressed oops or native heap addresses:
- register or spill slot contents
- oop maps (GC maps)
- debugging information (linked to oop maps)
- oops embedded directly in machine code (on non-RISC chips like x86 which allow this)
- nmethod constant section entries (including those used by relocations affecting machine code)
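The two compiled-code optimizations mentioned earlier, copying a compressed oop without decoding it and folding the decode into an addressing mode, can be sketched as follows. The function names are hypothetical and the code only models the arithmetic; it does not show what any particular HotSpot compiler emits.

```cpp
#include <cassert>
#include <cstdint>

using NarrowOop = std::uint32_t;
using Address   = std::uint64_t;

// Moving a reference from one heap slot to another: the 32-bit value is
// copied as is, with no decode on the way in and no encode on the way out.
inline void copy_ref_slot(const NarrowOop* from, NarrowOop* to) {
    assert(from != nullptr && to != nullptr);
    *to = *from;
}

// Address of a field at byte offset `disp` inside the object named by `narrow`.
// The expression has the shape base + index * 8 + displacement, which matches
// an x86 scaled-index memory operand, so the decode need not be a separate step.
constexpr Address field_address(Address heap_base, NarrowOop narrow, std::uint32_t disp) {
    return heap_base + static_cast<Address>(narrow) * 8 + disp;
}
```

For example, with a heap base of 0x100000000, a compressed oop of 2 and a field offset of 12, field_address yields 0x10000001C without ever materializing the decoded object address on its own.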
In the C++ code of the HotSpot JVM, the distinction between compressed and native oops is reflected in the C++ static type system. Most oops handled by C++ code are uncompressed. In particular, C++ member functions operate as usual on receivers (this) represented by native machine words. A few functions in the JVM are overloaded to handle either compressed or native oops.
Important C++ values which are never compressed:
- C++ object pointers (this)
- handles to managed pointers (type Handle, etc.)
- JNI handles (type jobject)
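One way to picture how a static type system can keep the two representations apart, and how a function can be overloaded for both, is the sketch below. The types NarrowRef, NativeRef and HeapLayout and the functions encode, decode and resolve are hypothetical stand-ins chosen for this example; they are not HotSpot's declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins. Distinct types mean the compiler rejects code
// that passes a compressed value where a native one is expected.
enum class NarrowRef : std::uint32_t {};   // compressed form
struct NativeRef { std::uint64_t addr; };  // native machine-word form

struct HeapLayout { std::uint64_t base; };

inline NativeRef decode(const HeapLayout& h, NarrowRef n) {
    return NativeRef{h.base + (static_cast<std::uint64_t>(n) << 3)};
}

inline NarrowRef encode(const HeapLayout& h, NativeRef r) {
    assert(r.addr >= h.base && (r.addr - h.base) % 8 == 0);
    return static_cast<NarrowRef>(static_cast<std::uint32_t>((r.addr - h.base) >> 3));
}

// Overloads: one function name that accepts either representation and
// always hands back the native form.
inline NativeRef resolve(const HeapLayout&, NativeRef r)   { return r; }
inline NativeRef resolve(const HeapLayout& h, NarrowRef n) { return decode(h, n); }
```

A caller can then write resolve(layout, ref) without caring which form ref has, while an accidental mix of the two forms elsewhere fails to compile instead of corrupting an address at run time.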