RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader

Aleksey Shipilev-5
`JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).

The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.

Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.

On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.

Additional testing:
 - [x] Ad-hoc benchmarks
 - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`

---------
### Progress
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
- [ ] Change must be properly reviewed



### Download
`$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
`$ git checkout pull/2485`

-------------

Commit messages:
 - 8261449: Micro-optimize JVM_LatestUserDefinedLoader

Changes: https://git.openjdk.java.net/jdk/pull/2485/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2485&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261449
  Stats: 19 lines in 3 files changed: 3 ins; 13 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2485.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader

David Holmes-2
On Tue, 9 Feb 2021 15:40:03 GMT, Aleksey Shipilev <[hidden email]> wrote:

> `JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).
>
> The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.
>
> Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.
>
> On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.
>
> Additional testing:
>  - [x] Ad-hoc benchmarks
>  - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`
>
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed
>
>
>
> ### Download
> `$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
> `$ git checkout pull/2485`

Hi Aleksey,

This seems reasonable to me. The generated reflection classes are loaded by a temporary loader (so they can be unloaded) and so have to be skipped.

I've added core-libs to the PR as this is the VM side of their code and I want to make sure nothing has been overlooked.

Thanks,
David

src/hotspot/share/prims/jvm.cpp line 3293:

> 3291:     oop loader = ik->class_loader();
> 3292:     if (loader != NULL && !SystemDictionary::is_platform_class_loader(loader)) {
> 3293:       if (!ik->is_subclass_of(vmClasses::reflect_MethodAccessorImpl_klass()) &&

Please add a comment:
// Skip reflection related frames

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Aleksey Shipilev-5
In reply to this post by Aleksey Shipilev-5
> `JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).
>
> The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.
>
> Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.
>
> On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.
>
> Additional testing:
>  - [x] Ad-hoc benchmarks
>  - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`
>
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed
>
>
>
> ### Download
> `$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
> `$ git checkout pull/2485`

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Added a comment

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2485/files
  - new: https://git.openjdk.java.net/jdk/pull/2485/files/fc333037..72e830a8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2485&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2485&range=00-01

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2485.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Aleksey Shipilev-5
In reply to this post by David Holmes-2
On Wed, 10 Feb 2021 01:29:51 GMT, David Holmes <[hidden email]> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Added a comment
>
> src/hotspot/share/prims/jvm.cpp line 3293:
>
>> 3291:     oop loader = ik->class_loader();
>> 3292:     if (loader != NULL && !SystemDictionary::is_platform_class_loader(loader)) {
>> 3293:       if (!ik->is_subclass_of(vmClasses::reflect_MethodAccessorImpl_klass()) &&
>
> Please add a comment:
> // Skip reflection related frames

Added!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Thomas Stuefe
In reply to this post by Aleksey Shipilev-5
On Wed, 10 Feb 2021 07:34:59 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> `JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).
>>
>> The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.
>>
>> Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.
>>
>> On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.
>>
>> Additional testing:
>>  - [x] Ad-hoc benchmarks
>>  - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`
>>
>> ---------
>> ### Progress
>> - [x] Change must not contain extraneous whitespace
>> - [x] Commit message must refer to an issue
>> - [ ] Change must be properly reviewed
>>
>>
>>
>> ### Download
>> `$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
>> `$ git checkout pull/2485`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added a comment

Looks good. I find it simpler too.

You could run the tests with sun.reflect.inflationThreshold=0. Should increase the chance of encountering reflection delegator loaders a lot.

Cheers, Thomas

-------------

Marked as reviewed by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Aleksey Shipilev-5
On Wed, 10 Feb 2021 15:26:50 GMT, Thomas Stuefe <[hidden email]> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Added a comment
>
> Looks good. I find it simpler too.
>
> You could run the tests with sun.reflect.inflationThreshold=0. Should increase the chance of encountering reflection delegator loaders a lot.
>
> Cheers, Thomas

Thanks! Any reviewers from JDK side? @AlanBateman?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Alan Bateman-2
In reply to this post by Aleksey Shipilev-5
On Wed, 10 Feb 2021 07:34:59 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> `JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).
>>
>> The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.
>>
>> Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.
>>
>> On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.
>>
>> Additional testing:
>>  - [x] Ad-hoc benchmarks
>>  - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`
>>
>> ---------
>> ### Progress
>> - [x] Change must not contain extraneous whitespace
>> - [x] Commit message must refer to an issue
>> - [ ] Change must be properly reviewed
>>
>>
>>
>> ### Download
>> `$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
>> `$ git checkout pull/2485`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added a comment

Marked as reviewed by alanb (Reviewer).

src/hotspot/share/prims/jvm.cpp line 3297:

> 3295:           !ik->is_subclass_of(vmClasses::reflect_ConstructorAccessorImpl_klass())) {
> 3296:         return JNIHandles::make_local(THREAD, loader);
> 3297:       }

This okay looks but surprised it has a measurable (or micro) difference.

There has been several proposals over the years to improve latestUserDefinedLoader in the common case that OIS.resolveClass is not overridden. It may need to be looked at again. Ideally JVM_LastestUSerDefinedLoader would go away and there would be a solution based on StackWalker (but work would be required there to match the current performance).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261449: Micro-optimize JVM_LatestUserDefinedLoader [v2]

Aleksey Shipilev-5
On Thu, 11 Feb 2021 08:04:55 GMT, Alan Bateman <[hidden email]> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Added a comment
>
> Marked as reviewed by alanb (Reviewer).

Thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2485
Reply | Threaded
Open this post in threaded view
|

Integrated: 8261449: Micro-optimize JVM_LatestUserDefinedLoader

Aleksey Shipilev-5
In reply to this post by Aleksey Shipilev-5
On Tue, 9 Feb 2021 15:40:03 GMT, Aleksey Shipilev <[hidden email]> wrote:

> `JVM_LatestUserDefinedLoader` is called normally from `ObjectInputStream.resolveClass` -> `VM.latestUserDefinedLoader0`. And it takes a measurable time to walk the stack. There is JDK-8173368 that wants to replace it with `StackWalker`, but we can tune up the `JVM_LatestUserDefinedLoader` itself without changing the semantics of it (thus providing the backportability, including the releases that do not have `StackWalker`) and improving performance (thus providing a more aggressive baseline for `StackWalker` rewrite).
>
> The key is to recognize that out of two checks: 1) checking for two special subclasses; 2) checking for user classloader -- the first one usually passes, and second one fails much more frequently. First check also requires traversing the superclasses upwards looking for match. Reversing the order of the checks, plus inlining the helper method improves performance without changing the semantics.
>
> Out of curiosity, my previous patch dropped the first check completely, replacing it by asserts, and we definitely run into situation where that check is needed on some tests.
>
> On my machine, `VM.latestUserDefinedLoader` invocation time drops from 115 to 100 ns/op. Single-threaded SPECjvm2008:serial improves about 3% with this patch.
>
> Additional testing:
>  - [x] Ad-hoc benchmarks
>  - [x] Linux x86_64 fastdebug, `tier1`, `tier2`, `tier3`
>
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed
>
>
>
> ### Download
> `$ git fetch https://git.openjdk.java.net/jdk pull/2485/head:pull/2485`
> `$ git checkout pull/2485`

This pull request has now been integrated.

Changeset: 49cf13d2
Author:    Aleksey Shipilev <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/49cf13d2
Stats:     20 lines in 3 files changed: 4 ins; 13 del; 3 mod

8261449: Micro-optimize JVM_LatestUserDefinedLoader

Reviewed-by: dholmes, stuefe, alanb

-------------

PR: https://git.openjdk.java.net/jdk/pull/2485