- Reduce the complexity of macros related to command-line flags (JDK-8243208)
- Avoid including large globals.hpp in every file. Just include the <mod>_globals.hpp file of the module you need (separate RFE: JDK-8243205)
- Improve footprint/performance
- The flag declaration is done with large X-macros. Although some people find this objectionable, this proposal does not change that.
Requirements for Command-line Flags
The flags implementation in HotSpot has evolved over 20 years without much documentation.
By reading the JDK 15 code, I can gather the following requirements – which lead to the complex implementation.
(These requirements cannot be implemented elegantly/efficiently using older C++ compilers, leading to the current messy code)
- [REQ1] Flags have 3 types
- PRODUCT - readable/writable in all builds
- DEVELOP - readable in all builds, writable only in debug builds
- NOTPRODUCT - readable/writable only in debug builds
- Using a flag of this type in a product build results in a C++ compiler error.
- [REQ2] Flags can have optional attributes
- MANAGEABLE, DIAGNOSTIC, EXPERIMENTAL
- [REQ3] Each flag can be in at most one of the following groups
- C1, C2, JVMCI, ARCH, LP64
- (many flags aren't in any of these groups)
- [REQ4] Each flag must have a default value
- Platform-defined default values are specified by XXX_PD macros
- [REQ5] Flag declarations should be concise (default flag attributes shouldn't need to be specified)
- [REQ6] Metadata for flags must be compile-time generated
- Good: stored in a statically initialized data section (e.g. .data or .rodata)
- Bad: initialized with global constructors
Problems with command-line flags in JDK 15
First, although types, attributes and groups are orthogonal, JDK 15 only allows a limited number of combinations to be specified. For example, all experimental flags must be of the PRODUCT type:
In a way, the [REQ5] conciseness requirement is the root cause of the messy implementation. If we add an extra "kind" parameter to the macros, we can convert the above to the following, which would allow the second flag to be both EXPERIMENTAL and DEVELOP:
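A minimal sketch of the extra "kind" parameter, using hypothetical flag, macro, and enum names rather than the actual HotSpot declarations. The macro name still selects the type, while the attribute becomes an ordinary argument, so any type can be combined with any attribute:

```cpp
// Hypothetical attribute bits; the names are illustrative, not HotSpot's.
enum DemoKind : unsigned {
  KIND_DEFAULT      = 0,
  KIND_DIAGNOSTIC   = 1u << 0,
  KIND_EXPERIMENTAL = 1u << 1,
  KIND_MANAGEABLE   = 1u << 2
};

enum DemoType { TYPE_PRODUCT, TYPE_DEVELOP, TYPE_NOTPRODUCT };

struct DemoFlagInfo {
  const char* name;
  DemoType    type;
  unsigned    kind;
};

// X-macro list: the second (hypothetical) flag is both DEVELOP and EXPERIMENTAL.
#define DEMO_FLAGS(product, develop)                                        \
  product(bool, DemoTracing,     KIND_DEFAULT,      false, "Trace events")  \
  develop(int,  DemoUnrollLimit, KIND_EXPERIMENTAL, 4,     "Unroll limit")

// Expanders that turn each declaration into one metadata entry.
#define DEMO_AS_PRODUCT(type, name, kind, value, doc) { #name, TYPE_PRODUCT, kind },
#define DEMO_AS_DEVELOP(type, name, kind, value, doc) { #name, TYPE_DEVELOP, kind },

constexpr DemoFlagInfo demo_flags[] = {
  DEMO_FLAGS(DEMO_AS_PRODUCT, DEMO_AS_DEVELOP)
};
```

The cost is that every declaration now carries a kind argument, which is exactly what [REQ5] tries to avoid.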
Similarly, groups could be implemented as something like the following (instead of the messy macros in here).
Another way to keep the code concise is to use constructor overloading in the flags definition, something like (simplified)
However, this runs into a problem with [REQ6]. Even though the effect of the above code can be completely decided at compile time, without constexpr GCC stubbornly insists on initializing the array using global constructors.
Another option is to use clever macros. However, there's no vararg macro that I can think of that can satisfy all of the above requirements.
Proposal - Use constexpr!
C++ constexpr makes things much easier:
New way of declaring flags, with an optional argument for attributes. The flag declaration macros now all take only 7 arguments (compare with the old version in globals.hpp):
The flags metadata is defined using overloaded constructors (around line 679 of jvmFlag.cpp)
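As a rough illustration of the overloaded-constructor idea (not the code in jvmFlag.cpp; `DemoFlag`, `DemoAttr`, and the flag names are hypothetical), the attribute argument can be made optional by providing one constexpr constructor with it and one without:

```cpp
// Hypothetical attribute values; illustrative only.
enum DemoAttr : int {
  ATTR_NONE         = 0,
  ATTR_DIAGNOSTIC   = 1,
  ATTR_EXPERIMENTAL = 2,
  ATTR_MANAGEABLE   = 4
};

struct DemoFlag {
  const char* name;
  int         attr;
  const char* doc;

  // Overload 1: attributes omitted, so the default applies.
  constexpr DemoFlag(const char* n, const char* d)
    : name(n), attr(ATTR_NONE), doc(d) {}

  // Overload 2: explicit attributes come before the doc string.
  constexpr DemoFlag(const char* n, int a, const char* d)
    : name(n), attr(a), doc(d) {}
};

// Because the constructors are constexpr, this table is constant-initialized
// and needs no global constructor at start-up.
constexpr DemoFlag demo_table[] = {
  DemoFlag("DemoVerbose",  "Print extra output"),
  DemoFlag("DemoFastPath", ATTR_EXPERIMENTAL, "Enable the fast path"),
};

// Compile-time proof that the table contents are known to the compiler.
static_assert(demo_table[1].attr == ATTR_EXPERIMENTAL, "evaluated at compile time");
```

Declaring the table itself `constexpr` forces the compiler to reject any initializer that would need run-time code, which is what [REQ6] asks for.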
Handling of Groups:
There's a very small number of groups. They don't seem very useful so I don't know if we will add many new groups in the future. So for the time being, I implemented groups by ordering the flags and counting the size of each group:
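The ordering-and-counting approach could look roughly like the sketch below. It assumes the flags of each group are listed together in separate X-macro lists; all macro, enum, and flag names here are hypothetical:

```cpp
// Hypothetical per-group flag lists, concatenated in a fixed order.
#define DEMO_C1_FLAGS(f)    f(DemoC1Inline) f(DemoC1Profile)
#define DEMO_C2_FLAGS(f)    f(DemoC2Unroll)
#define DEMO_OTHER_FLAGS(f) f(DemoHeapSize) f(DemoVerbose) f(DemoTrace)

// One enum value per flag, numbered in declaration order.
#define DEMO_ENUM(name) Flag_##name,
enum DemoFlagEnum : int {
  DEMO_C1_FLAGS(DEMO_ENUM)
  DEMO_C2_FLAGS(DEMO_ENUM)
  DEMO_OTHER_FLAGS(DEMO_ENUM)
  NUM_DemoFlags
};

// Count the size of each group by expanding every entry to "+ 1".
#define DEMO_COUNT(name) + 1
constexpr int num_c1 = 0 DEMO_C1_FLAGS(DEMO_COUNT);
constexpr int num_c2 = 0 DEMO_C2_FLAGS(DEMO_COUNT);

enum DemoGroup { GROUP_C1, GROUP_C2, GROUP_NONE };

// The group follows from where the flag's index falls in the ordering,
// so no per-flag group field has to be stored.
constexpr DemoGroup group_of(int flag) {
  return flag < num_c1          ? GROUP_C1
       : flag < num_c1 + num_c2 ? GROUP_C2
       :                          GROUP_NONE;
}
```

This stays cheap only while the number of groups is small, which matches the expectation stated above.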
Handling of types
Same as before, just fewer cases (old version here)
Range and Constraint Checking
The old code has 2 problems
- Builds list of range/constraint checking objects at VM start-up (lots of code and slow start)
- Range/constraint checking needs linear search (further slows down start-up)
The new design (see here for webrev):
- Builds the checker objects (JVMFlagLimit) at build-time using constexpr.
- JVMFlagLimit objects are indexed by each flag's enum (the entry is NULL if no limit exists), so lookup takes O(1) time.
Implementation of JVMFlagLimit
The range/constraint information for a flag of type T is described by a JVMTypedFlagLimit<T>:
Each flag is given a unique enum value in the range 0 to NUM_JVMFlagsEnum-1. We use this enum to find the JVMTypedFlagLimit<T> of this flag from an array:
Most flags have neither range nor constraint. For those flags, we want their flagLimits[Flag_flagname_Enum] entry to be NULL.
To do this, we first define a JVMTypedFlagLimit<T> variable for each flag (including the ones that don't have range/constraint). It's done by this macro:
To understand how the macros work, it's best to compile jvmFlagLimit.cpp with gcc -save-temps and look at the generated jvmFlagLimit.ii with the macros expanded. Here's an example:
We use overloaded constructors to fill out the necessary fields of the JVMTypedFlagLimit<T> variables. Note that the min/max parameters, as well as the constraint_func/phase parameters, can both be integer values. For disambiguation, we pass in a dummy next_two_args_are_constraint for the constraint_func/phase.
We also need to always pass in an initial dummy 0 parameter so that the macros can safely add a comma before passing the min/max or constraint_func/phase.
These dummy parameters are evaluated at compile time so they can be safely optimized away.
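The constructor scheme described above can be sketched as follows. This is a simplified stand-in, not the real JVMTypedFlagLimit<T>: the struct layout, the `DEMO_RANGE`/`DEMO_CONSTRAINT` helper macros, and the flag names are assumptions; only the `next_two_args_are_constraint` tag and the leading dummy 0 come from the description.

```cpp
// Tag whose only job is to select the constraint overloads.
enum DemoConstraintMarker { next_two_args_are_constraint };

template <typename T>
struct DemoTypedLimit {
  T     min_value;
  T     max_value;
  short constraint_func;  // index of the constraint function, -1 if none
  char  phase;
  bool  has_range;

  // Neither range nor constraint: only the dummy 0 is passed.
  constexpr DemoTypedLimit(int)
    : min_value(0), max_value(0), constraint_func(-1), phase(0), has_range(false) {}

  // Range only.
  constexpr DemoTypedLimit(int, T lo, T hi)
    : min_value(lo), max_value(hi), constraint_func(-1), phase(0), has_range(true) {}

  // Constraint only. Without the marker this would also be (int, x, y)
  // and could not be told apart from the range overload.
  constexpr DemoTypedLimit(int, DemoConstraintMarker, short func, int ph)
    : min_value(0), max_value(0), constraint_func(func),
      phase(static_cast<char>(ph)), has_range(false) {}

  // Range followed by constraint.
  constexpr DemoTypedLimit(int, T lo, T hi, DemoConstraintMarker, short func, int ph)
    : min_value(lo), max_value(hi), constraint_func(func),
      phase(static_cast<char>(ph)), has_range(true) {}
};

// Hypothetical helper macros: each starts with a comma, which is why the
// argument list must always begin with the dummy 0.
#define DEMO_RANGE(lo, hi)        , lo, hi
#define DEMO_CONSTRAINT(func, ph) , next_two_args_are_constraint, func, ph

constexpr DemoTypedLimit<int> limit_DemoNoLimit(0);
constexpr DemoTypedLimit<int> limit_DemoRanged (0 DEMO_RANGE(1, 64));
constexpr DemoTypedLimit<int> limit_DemoChecked(0 DEMO_CONSTRAINT(3, 1));
constexpr DemoTypedLimit<int> limit_DemoBoth   (0 DEMO_RANGE(0, 100) DEMO_CONSTRAINT(7, 2));
```

Both the dummy int and the marker are unnamed parameters, so they leave no trace in the constructed object.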
The next step is to fill out the flagLimitTable array:
For the flags shown in the example above, the following code is generated by the macros:
- If a flag has neither range nor constraint, we will call the LimitGetter<T>::get_limit() function with two parameters, which returns NULL.
- If a flag has range and/or constraint, we will call a LimitGetter<T>::get_limit() function with more than two parameters. These functions would return the same JVMTypedFlagLimit<T> as passed in.
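A minimal sketch of the two get_limit() cases, assuming a simplified non-template `DemoLimit` with a range only (the real LimitGetter<T> is templated and also handles constraints; every name below is hypothetical):

```cpp
struct DemoLimit {
  int min_value;
  int max_value;
  constexpr DemoLimit(int) : min_value(0), max_value(0) {}
  constexpr DemoLimit(int, int lo, int hi) : min_value(lo), max_value(hi) {}
};

struct DemoLimitGetter {
  // Two arguments: the flag has no range, so its table slot is null.
  static constexpr const DemoLimit* get_limit(const DemoLimit*, int) {
    return nullptr;
  }
  // More than two arguments: hand back the limit object that was passed in.
  static constexpr const DemoLimit* get_limit(const DemoLimit* p, int, int, int) {
    return p;
  }
};

enum DemoFlagEnum { Flag_DemoUseThing, Flag_DemoThreads, NUM_DemoFlags };

// One limit variable per flag, including flags that have no range.
constexpr DemoLimit limit_DemoUseThing(0);
constexpr DemoLimit limit_DemoThreads(0, 1, 64);

// The table is indexed by the flag enum and built entirely at compile time.
constexpr const DemoLimit* demo_limit_table[NUM_DemoFlags] = {
  DemoLimitGetter::get_limit(&limit_DemoUseThing, 0),
  DemoLimitGetter::get_limit(&limit_DemoThreads, 0, 1, 64),
};

// O(1) lookup: a single array access, no search.
constexpr const DemoLimit* limit_for(DemoFlagEnum flag) {
  return demo_limit_table[flag];
}
```

Overload resolution on the argument count is what decides, per flag, whether the slot holds a pointer or null.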
As a result, we will end up with this in the final output of the C++ compiler:
What happens to unreferenced flag limits
All the flag limits are defined with the constexpr keyword, which gives them internal linkage by default. If a flag has no range/constraint, its flag limit (e.g., limit_UseCompressedOops in the example above) will be unused, and will be eliminated by the C++ compiler from the object file. So we don't waste any space.
Why use enums for constraint_func
This is a small optimization: There are 120 flags that use a constraint function, but there are only 65 total constraint functions. By using a short index, we can:
- Reduce the size of the JVMFlagLimit object (the 2-byte index fits in otherwise unused space)
- Reduce the number of pointers relocated when libjvm.so is dynamically loaded (from 120 to 65).
The savings are not a big deal, but since we can do it, why not?
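The index-instead-of-pointer idea can be sketched like this. The constraint functions, enum, and struct layout are hypothetical, and the functions are marked constexpr only so the sketch can be checked at compile time:

```cpp
// Hypothetical constraint functions.
constexpr bool demo_is_power_of_two(int v) { return v > 0 && (v & (v - 1)) == 0; }
constexpr bool demo_is_even(int v)         { return v % 2 == 0; }

typedef bool (*DemoConstraintFunc)(int);

// One enum value per distinct constraint function.
enum DemoConstraintId : short {
  Constraint_none = -1,
  Constraint_is_power_of_two,
  Constraint_is_even,
  NUM_DemoConstraints
};

// The only place that stores function pointers: one slot per function,
// no matter how many flags share it.
constexpr DemoConstraintFunc demo_constraints[NUM_DemoConstraints] = {
  demo_is_power_of_two,
  demo_is_even,
};

struct DemoLimit {
  int   min_value;
  int   max_value;
  short constraint;  // a DemoConstraintId in 2 bytes instead of a pointer
  char  phase;
};

constexpr DemoLimit limit_DemoAlignment   = { 1, 4096, Constraint_is_power_of_two, 0 };
constexpr DemoLimit limit_DemoStripeCount = { 2, 64,   Constraint_is_even,         0 };
constexpr DemoLimit limit_DemoThreads     = { 1, 256,  Constraint_none,            0 };

// Range check first, then the constraint looked up through its index.
constexpr bool demo_check(const DemoLimit& limit, int value) {
  return value >= limit.min_value && value <= limit.max_value &&
         (limit.constraint == Constraint_none ||
          demo_constraints[limit.constraint](value));
}
```

The limit objects themselves contain no pointers, so only the small function table needs relocation entries.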
Is constexpr really working?
A good way to check is to build the .o with something like "gcc -save-temps" and look at the .s file. Here's an example of jvmFlagLimit.s. You can see that the content of the flagLimitTable is also completely determined at build time (it's in the "ro" section):