RFR: 8058176: [mlvm] tests should not allow code cache exhaustion

RFR: 8058176: [mlvm] tests should not allow code cache exhaustion

Evgeny Nikitin-2
Another approach to JDK-8058176 and #2440 - never allowing the tests to hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:

* Code cache size getters are added to WhiteBox;
* MH sequences are now built with the remaining Code cache size in mind (always leave a 2M clearance);
* Dependencies on WhiteBox added for all affected tests;
* The test cases in question un-problemlisted.

Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.

-------------

Commit messages:
 - Un-problemlist the OOME tests
 - Add CodeCache methods to the WhiteBox
 - 8058176: [mlvm] tests should not allow code cache exhaustion

Changes: https://git.openjdk.java.net/jdk/pull/2523/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8058176
  Stats: 102 lines in 13 files changed: 88 ins; 6 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2523.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Evgeny Nikitin-2
> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>
> * Code cache size getters are added to WhiteBox;
> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
> * Dependencies on WhiteBox added for all affected tests;
> * The test cases in question un-problemlisted.
>
> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.

Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision:

  Switch to ManagementBeans approach instead of the WhiteBox one

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2523/files
  - new: https://git.openjdk.java.net/jdk/pull/2523/files/4153edb1..71af7185

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=00-01

  Stats: 100 lines in 12 files changed: 12 ins; 80 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2523.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion

Evgeny Nikitin-2
In reply to this post by Evgeny Nikitin-2
On Thu, 11 Feb 2021 13:27:52 GMT, Evgeny Nikitin <[hidden email]> wrote:

> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>
> * Code cache size getters are added to WhiteBox;
> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
> * Dependencies on WhiteBox added for all affected tests;
> * The test cases in question un-problemlisted.
>
> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.

As suggested by @iignatev, I removed the WhiteBox changes in favour of JMX management beans, resulting in a much cleaner solution.
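A minimal sketch of what a JMX-based check could look like, assuming a non-segmented code cache exposed as a pool named `CodeCache`. The class and method names (`CodeCacheProbe`, `isNearlyFull`, `findPool`) are hypothetical and are not taken from the PR; only the 2M clearance comes from the description above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class CodeCacheProbe {
    // Clearance from the PR description: always leave 2M free.
    static final long CLEARANCE_BYTES = 2L * 1024 * 1024;

    /** True if fewer than 'clearance' bytes remain below a defined maximum. */
    static boolean isNearlyFull(long used, long max, long clearance) {
        // MemoryUsage.getMax() returns -1 when the maximum is undefined.
        return max >= 0 && max - used < clearance;
    }

    /** Finds a memory pool by exact name, or returns null if there is none. */
    static MemoryPoolMXBean findPool(String name) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().equals(name)) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumption: this pool name is only present with -XX:-SegmentedCodeCache.
        MemoryPoolMXBean pool = findPool("CodeCache");
        if (pool == null) {
            System.out.println("No 'CodeCache' pool found (segmented code cache?)");
            return;
        }
        MemoryUsage usage = pool.getUsage();
        System.out.println("Nearly full: "
                + isNearlyFull(usage.getUsed(), usage.getMax(), CLEARANCE_BYTES));
    }
}
```

Compared with WhiteBox, this needs no extra test dependencies or VM flags, which is presumably what makes the solution cleaner.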

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Igor Ignatyev-2
In reply to this post by Evgeny Nikitin-2
On Fri, 12 Feb 2021 18:22:52 GMT, Evgeny Nikitin <[hidden email]> wrote:

>> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>>
>> * Code cache size getters are added to WhiteBox;
>> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
>> * Dependencies on WhiteBox added for all affected tests;
>> * The test cases in question un-problemlisted.
>>
>> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.
>
> Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Switch to ManagementBeans approach instead of the WhiteBox one

test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 69:

> 67:     private static final boolean USE_THROW_CATCH = false; // Test bugs
> 68:
> 69:     private static final MemoryPoolMXBean CODE_CACHE_MX_BEAN = ManagementFactory

does it work w/ both `-XX:+SegmentedCodeCache` and `-XX:-SegmentedCodeCache`?
If I remember correctly (@TobiHartmann, please correct me if I'm wrong), the `CodeCache` pool exists when `SegmentedCodeCache` is disabled; when it's enabled, you will have 3 different pools (one for each "CodeHeap"), and here we would need to use the one for the `non-nmethod` codeheap.

-- Igor
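One way to check which pools a given JVM actually exposes is to list them under both flag settings. The sketch below assumes the pool is named `CodeCache` in the non-segmented case and that segmented pool names start with `CodeHeap`; the class name `CodePoolLister` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

final class CodePoolLister {
    /** Assumed naming: "CodeCache" (non-segmented) or "CodeHeap '...'" (segmented). */
    static boolean isCodePoolName(String name) {
        return name.equals("CodeCache") || name.startsWith("CodeHeap");
    }

    /** Names of all memory pools that look like code cache pools. */
    static List<String> codePoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (isCodePoolName(pool.getName())) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Run once with -XX:+SegmentedCodeCache and once with -XX:-SegmentedCodeCache.
        codePoolNames().forEach(System.out::println);
    }
}
```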

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Igor Ignatyev-2
In reply to this post by Evgeny Nikitin-2
On Fri, 12 Feb 2021 18:22:52 GMT, Evgeny Nikitin <[hidden email]> wrote:

>> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>>
>> * Code cache size getters are added to WhiteBox;
>> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
>> * Dependencies on WhiteBox added for all affected tests;
>> * The test cases in question un-problemlisted.
>>
>> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.
>
> Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Switch to ManagementBeans approach instead of the WhiteBox one

test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 107:

> 105:             if (isCodeCacheEffectivelyFull()) {
> 106:                 Env.traceNormal("Not enought code cache to build up MH sequences anymore. " +
> 107:                         " Has only been able to achieve " + (MAX_CYCLES - i) + " out of " + MAX_CYCLES);

given `nextInt(x)` returns a random number from `[0; x)`, we might have achieved more (or less) than `MAX_CYCLES - i`, i.e. that part of the message is incorrect, I'd just remove it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v3]

Evgeny Nikitin-2
In reply to this post by Evgeny Nikitin-2
> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>
> * Code cache size getters are added to WhiteBox;
> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
> * Dependencies on WhiteBox added for all affected tests;
> * The test cases in question un-problemlisted.
>
> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.

Evgeny Nikitin has updated the pull request incrementally with two additional commits since the last revision:

 - Fix 'cycles to build' error output
 - Add support for segmented CodeCache

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2523/files
  - new: https://git.openjdk.java.net/jdk/pull/2523/files/71af7185..763d94b8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=01-02

  Stats: 31 lines in 1 file changed: 23 ins; 1 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2523.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Evgeny Nikitin-2
In reply to this post by Igor Ignatyev-2
On Fri, 12 Feb 2021 20:03:01 GMT, Igor Ignatyev <[hidden email]> wrote:

>> Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Switch to ManagementBeans approach instead of the WhiteBox one
>
> test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 69:
>
>> 67:     private static final boolean USE_THROW_CATCH = false; // Test bugs
>> 68:
>> 69:     private static final MemoryPoolMXBean CODE_CACHE_MX_BEAN = ManagementFactory
>
> does it work w/ both `-XX:+SegmentedCodeCache` and `-XX:-SegmentedCodeCache`?
> If I remember correctly (@TobiHartmann , please correct me if I'm wrong), `CodeCache` pool exists when `SegmentedCodeCache` is disabled, when it's enabled, you will have 3 different pools (one for each "CodeHeap"), and here we would need to use one for `non-nmethod` codeheap.
>
> -- Igor

Thanks for the info about the segmented code cache. I did some research and found that the opposite is true - it is the two nmethod pools ('profiled' and 'non-profiled') that grow along with the MH graph. This is supported by the documentation of the non-method code heap at:

https://docs.oracle.com/en/java/javase/15/vm/java-hotspot-virtual-machine-performance-enhancements.html#GUID-1D9B26AD-8E0A-4771-90DA-A81A2C1F5B55

Please check the fixed version.

> test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 107:
>
>> 105:             if (isCodeCacheEffectivelyFull()) {
>> 106:                 Env.traceNormal("Not enought code cache to build up MH sequences anymore. " +
>> 107:                         " Has only been able to achieve " + (MAX_CYCLES - i) + " out of " + MAX_CYCLES);
>
> given `nextInt(x)` returns a random number from `[0; x]`, we might have achieved more (or less) `MAX_CYCLES - i`, i.e. that part of the message is incorrect, I'd just remove it.

Fixed by extracting the generated random number first.
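A sketch of the idea, not the actual test code: draw the random cycle count once, keep it in a variable, and report progress against that value. `CycleBudget`, `buildCycles`, the value of `MAX_CYCLES`, and the message wording are all hypothetical; the code cache check is passed in as a `BooleanSupplier` so the snippet stays self-contained.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

final class CycleBudget {
    static final int MAX_CYCLES = 1000; // hypothetical limit

    /** Builds up to a random number of cycles and describes how far it got. */
    static String buildCycles(Random rng, BooleanSupplier codeCacheFull) {
        // Extract the random number first so the message can refer to it.
        int cyclesToBuild = rng.nextInt(MAX_CYCLES); // value in [0, MAX_CYCLES)
        for (int built = 0; built < cyclesToBuild; built++) {
            if (codeCacheFull.getAsBoolean()) {
                return "Not enough code cache; built " + built
                        + " of " + cyclesToBuild + " cycles";
            }
            // ... one MH transformation would be generated here ...
        }
        return "Built all " + cyclesToBuild + " cycles";
    }

    public static void main(String[] args) {
        System.out.println(buildCycles(new Random(), () -> false));
    }
}
```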

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Igor Ignatyev-2
On Tue, 16 Feb 2021 19:29:42 GMT, Evgeny Nikitin <[hidden email]> wrote:

>> test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 69:
>>
>>> 67:     private static final boolean USE_THROW_CATCH = false; // Test bugs
>>> 68:
>>> 69:     private static final MemoryPoolMXBean CODE_CACHE_MX_BEAN = ManagementFactory
>>
>> does it work w/ both `-XX:+SegmentedCodeCache` and `-XX:-SegmentedCodeCache`?
>> If I remember correctly (@TobiHartmann , please correct me if I'm wrong), `CodeCache` pool exists when `SegmentedCodeCache` is disabled, when it's enabled, you will have 3 different pools (one for each "CodeHeap"), and here we would need to use one for `non-nmethod` codeheap.
>>
>> -- Igor
>
> Thanks for the info about the segmented code cache. I did some research and found that the opposite is true - both nmethod pools ('profiled' and 'non-profiled') are growing along with the MH graph growth. This is supported by the specification for non-method code heap at:
>
> https://docs.oracle.com/en/java/javase/15/vm/java-hotspot-virtual-machine-performance-enhancements.html#GUID-1D9B26AD-8E0A-4771-90DA-A81A2C1F5B55
>
> Please check the the fixed version.

o/c they grow, b/c we use them for compiled code *and* if there is no space in non-nmethod heap, we use them for adapters as well, so I guess that the growth that you see is already after non-nmethod heap got exhausted. I'd recommend you simply use the sum of all available code-heaps (this will increase the possibility of false-positive results due to segmentation, but I don't think it matters much here).
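A rough sketch of the "sum of all available code heaps" suggestion, assuming the code pools can be recognized by the names `CodeCache` or `CodeHeap ...`. The class `CodeHeapSum` and its method are hypothetical, not part of the PR.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class CodeHeapSum {
    /** Sums (max - used) over every pool that looks like a code heap. */
    static long totalFreeCodeBytes() {
        long free = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumed naming convention for code cache pools.
            if (!name.equals("CodeCache") && !name.startsWith("CodeHeap")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            // getUsage() may return null for an invalid pool; getMax() may be -1.
            if (usage == null || usage.getMax() < 0) {
                continue;
            }
            free += usage.getMax() - usage.getUsed();
        }
        return free;
    }

    public static void main(String[] args) {
        System.out.println("Free code cache bytes: " + totalFreeCodeBytes());
    }
}
```

The same loop covers both the segmented and the non-segmented configuration, at the cost of ignoring which individual heap is running out.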

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Evgeny Nikitin-2
On Tue, 16 Feb 2021 19:49:02 GMT, Igor Ignatyev <[hidden email]> wrote:

>> Thanks for the info about the segmented code cache. I did some research and found that the opposite is true - both nmethod pools ('profiled' and 'non-profiled') are growing along with the MH graph growth. This is supported by the specification for non-method code heap at:
>>
>> https://docs.oracle.com/en/java/javase/15/vm/java-hotspot-virtual-machine-performance-enhancements.html#GUID-1D9B26AD-8E0A-4771-90DA-A81A2C1F5B55
>>
>> Please check the the fixed version.
>
> o/c they grow, b/c we use them for compiled code *and* if there is no space in non-nmethod heap, we use them for adapters as well, so I guess that the growth that you see is already after non-nmethod heap got exhausted. I'd recommend you simply use the sum of all available code-heaps (this will increase the possibility of false-positive results due to segmentation, but I don't think it matters much here).

Well, it seems like rebalancing doesn't work that well. Here's a sample failure with plenty of free space in the non-nmethods heap:

[8.230s][warning][codecache] CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
[8.230s][warning][codecache] Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
CodeHeap 'non-profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb   << Exhausted
CodeHeap 'profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb       << Exhausted
CodeHeap 'non-nmethods': size=102400Kb used=18343Kb max_used=18343Kb free=84056Kb    << 84Mb of free space

# ERROR: Caught exception in Thread[Thread-41,5,MainThreadGroup]
...
# ERROR: Caused by: java.lang.VirtualMachineError: Out of space in CodeCache for method handle intrinsic

The sum monitoring won't help here either. I've added non-nmethods heap to the monitoring, just to be sure.
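To illustrate with the free-space figures from the log above (0Kb, 0Kb and 84056Kb) and the 2M clearance: a check on the sum reports plenty of room, while a per-heap check flags the exhausted heaps. The class and method names in this sketch are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeapCheckDemo {
    /** "Full" if the free space summed over all heaps is below the clearance. */
    static boolean sumSaysFull(Map<String, Long> freeKb, long clearanceKb) {
        long total = 0;
        for (long free : freeKb.values()) {
            total += free;
        }
        return total < clearanceKb;
    }

    /** "Full" if any single heap has less free space than the clearance. */
    static boolean anyHeapSaysFull(Map<String, Long> freeKb, long clearanceKb) {
        for (long free : freeKb.values()) {
            if (free < clearanceKb) {
                return true;
            }
        }
        return false;
    }

    /** Free space per heap in Kb, copied from the failure log. */
    static Map<String, Long> freeFromLog() {
        Map<String, Long> freeKb = new LinkedHashMap<>();
        freeKb.put("non-profiled nmethods", 0L);
        freeKb.put("profiled nmethods", 0L);
        freeKb.put("non-nmethods", 84056L);
        return freeKb;
    }

    public static void main(String[] args) {
        long clearanceKb = 2048; // 2M clearance
        System.out.println("sum check full:      " + sumSaysFull(freeFromLog(), clearanceKb));
        System.out.println("per-heap check full: " + anyHeapSaysFull(freeFromLog(), clearanceKb));
    }
}
```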

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Igor Ignatyev-2
On Wed, 17 Feb 2021 15:34:41 GMT, Evgeny Nikitin <[hidden email]> wrote:

>> o/c they grow, b/c we use them for compiled code *and* if there is no space in non-nmethod heap, we use them for adapters as well, so I guess that the growth that you see is already after non-nmethod heap got exhausted. I'd recommend you simply use the sum of all available code-heaps (this will increase the possibility of false-positive results due to segmentation, but I don't think it matters much here).
>
> Well, seems like rebalancing doesn't works that good. Here's a sample failure with plenty of free space in the non-nmethods heap:
>
> [8.230s][warning][codecache] CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
> [8.230s][warning][codecache] Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
> Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
> Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
> CodeHeap 'non-profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb   << Exhausted
> CodeHeap 'profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb       << Exhausted
> CodeHeap 'non-nmethods': size=102400Kb used=18343Kb max_used=18343Kb free=84056Kb    << 84Mb of free space
>
> # ERROR: Caught exception in Thread[Thread-41,5,MainThreadGroup]
> ...
> # ERROR: Caused by: java.lang.VirtualMachineError: Out of space in CodeCache for method handle intrinsic
> The sum monitoring won't help here either. I've added non-nmethods heap to the monitoring, just to be sure.

hm... that can mean that there is a product bug (or my recollections about code heaps aren't as good as I thought).

@TobiHartmann , @iwanowww, could you please take a look? Evgeny's observations suggest that method handle intrinsics use `non-profiled nmethods` and `profiled nmethods` heaps and not `non-nmethods` heap despite the fact that the last one has plenty of free space. my understanding is/was that we should have used `non-nmethods` heap for MH intrinsic 1st and if it's exhausted start to use the other heaps.

Thanks,
-- Igor

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v4]

Evgeny Nikitin-2
In reply to this post by Evgeny Nikitin-2
> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes:
>
> * Code cache size getters are added to WhiteBox;
> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance);
> * Dependencies on WhiteBox added for all affected tests;
> * The test cases in question un-problemlisted.
>
> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86.

Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision:

  Add non-nmethods pool to the monitoring

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2523/files
  - new: https://git.openjdk.java.net/jdk/pull/2523/files/763d94b8..6a3c4785

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=02-03

  Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2523.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523

PR: https://git.openjdk.java.net/jdk/pull/2523

Re: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2]

Evgeny Nikitin-2
In reply to this post by Igor Ignatyev-2
On Wed, 17 Feb 2021 15:46:44 GMT, Igor Ignatyev <[hidden email]> wrote:

>> Well, seems like rebalancing doesn't works that good. Here's a sample failure with plenty of free space in the non-nmethods heap:
>>
>> [8.230s][warning][codecache] CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
>> [8.230s][warning][codecache] Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
>> Java HotSpot(TM) 64-Bit Server VM warning: CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
>> Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
>> CodeHeap 'non-profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb   << Exhausted
>> CodeHeap 'profiled nmethods': size=8192Kb used=8191Kb max_used=8191Kb free=0Kb       << Exhausted
>> CodeHeap 'non-nmethods': size=102400Kb used=18343Kb max_used=18343Kb free=84056Kb    << 84Mb of free space
>>
>> # ERROR: Caught exception in Thread[Thread-41,5,MainThreadGroup]
>> ...
>> # ERROR: Caused by: java.lang.VirtualMachineError: Out of space in CodeCache for method handle intrinsic
>> The sum monitoring won't help here either. I've added non-nmethods heap to the monitoring, just to be sure.
>
> hm... that can mean that there is a product bug (or my recollections about code heaps aren't as good as I thought).
>
> @TobiHartmann , @iwanowww, could you please take a look? Evgeny's observations suggest that method handle intrinsics use `non-profiled nmethods` and `profiled nmethods` heaps and not `non-nmethods` heap despite the fact that the last one has plenty of free space. my understanding is/was that we should have used `non-nmethods` heap for MH intrinsic 1st and if it's exhausted start to use the other heaps.
>
> Thanks,
> -- Igor

I inspected a sample built-up cache with the `Compiler.CodeHeap_Analytics` diagnostic command. The vast majority of the 'non-profiled nmethods' heap consists of zillions of `invokeBasic`, `linkToStatic` and similar entries with different signatures. The dump shows something like this:

nMethod (active)    invokeBasic(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
nMethod (active)    invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJD)Ljava/lang/Object;
nMethod (active)    invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;)Ljava/lang/Object;
nMethod (active)    invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;

... with their signatures marching to the right screen border and beyond. Given that their arguments are mish-mashed in all possible combinations, there are really many of them (I've been able to build up caches of up to 300MB without a pair of signatures repeating). They are nmethods and should be in the nmethods heaps, shouldn't they?
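A back-of-the-envelope sketch of why so many distinct signatures are possible. It assumes, as a simplification, that every argument slot can independently take one of a fixed number of kinds (five here: reference, int, long, float, double) and ignores the return type. `SignatureCount` and its methods are hypothetical; only `MethodType` is a real JDK API.

```java
import java.lang.invoke.MethodType;

final class SignatureCount {
    /** Number of distinct parameter lists with 0..maxArity slots over 'kinds' kinds. */
    static long distinctParameterLists(int kinds, int maxArity) {
        long total = 0;
        long perArity = 1; // kinds^0: the empty parameter list
        for (int arity = 0; arity <= maxArity; arity++) {
            total += perArity;
            perArity *= kinds;
        }
        return total;
    }

    /** Descriptor in the same notation as the dump, for two Object arguments. */
    static String twoObjectDescriptor() {
        return MethodType.methodType(Object.class, Object.class, Object.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(twoObjectDescriptor());
        System.out.println(distinctParameterLists(5, 10));
    }
}
```

Under these assumptions, arities up to 10 already allow 12207031 different parameter lists, so a generator that mixes argument types freely can go a long way without repeating a signature.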

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523