RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi-3
8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

-------------

Commit messages:
 - 8261483: Try to eliminate flakiness of the test by extending its allowed runtime and reducing the memory

Changes: https://git.openjdk.java.net/jdk/pull/2617/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261483
  Stats: 6 lines in 3 files changed: 2 ins; 2 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2617.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2617/head:pull/2617

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Attila Szegedi-3
> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  8261483: Try to eliminate flakiness of the test by extending its allowed runtime and reducing the memory

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2617/files
  - new: https://git.openjdk.java.net/jdk/pull/2617/files/989dfb64..3afec32b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=00-01

  Stats: 20 lines in 2 files changed: 5 ins; 8 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2617.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2617/head:pull/2617

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi-3
In reply to this post by Attila Szegedi-3
On Wed, 17 Feb 2021 19:44:35 GMT, Attila Szegedi <[hidden email]> wrote:

> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

@plevart can I bother you for a follow-up review of my original issue? (Alternatively, @shipilev?) Unfortunately the tests as I wrote them are flaking on somewhat loaded CI servers. Shame on me for using timing-based tests… Thanks for the consideration.

I changed the flaky tests to use an upper limit on number of iterations instead of a maximum duration, so they won't be time sensitive anymore. I'm also running them with a separate VM using a very small max heap (4M) so the GC condition triggers quickly. Finally, I measured the number of iterations they need to succeed with this heap size; the numbers are deterministic across executions, but I also added a generous padding (least iterations needed by a test is 30, most needed is 219, I'm allowing the tests to run until 5000.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Peter Levart-3
On Thu, 18 Feb 2021 19:41:21 GMT, Attila Szegedi <[hidden email]> wrote:

>> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"
>
> @plevart can I bother you for a follow-up review of my original issue? (Alternatively, @shipilev?) Unfortunately the tests as I wrote them are flaking on somewhat loaded CI servers. Shame on me for using timing-based tests… Thanks for the consideration.
>
> I changed the flaky tests to use an upper limit on number of iterations instead of a maximum duration, so they won't be time sensitive anymore. I'm also running them with a separate VM using a very small max heap (4M) so the GC condition triggers quickly. Finally, I measured the number of iterations they need to succeed with this heap size; the numbers are deterministic across executions, but I also added a generous padding (least iterations needed by a test is 30, most needed is 219, I'm allowing the tests to run until 5000.)

> Unfortunately the tests as I wrote them are flaking on somewhat loaded CI servers...
>
> I changed the flaky tests to use an upper limit on number of iterations instead of a maximum duration, so they won't be time sensitive anymore. I'm also running them with a separate VM using a very small max heap (4M) so the GC condition triggers quickly. Finally, I measured the number of iterations they need to succeed with this heap size; the numbers are deterministic across executions, but I also added a generous padding (least iterations needed by a test is 30, most needed is 219, I'm allowing the tests to run until 5000.)

I played a little with TypeConverterFactoryMemoryLeakTest. If I add `System.gc()` call and a 10ms sleep into the loop, the test succeeds after 12 iterations even if I increase the heap size considerably.
I would like 1st to understand why the MH created in the TestLinker.convertToType is actually GCed at all. And after that, why it is not GCed before 12 invocations to makeOne() are made.

What would be more interesting to test is to create a converter between a type loaded by a custom (child) ClassLoader and a system type. After that, release all hard references to the custom ClassLoader and see if the converter gets GC-ed as a result. WDYT?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi-3
On Fri, 19 Feb 2021 11:22:57 GMT, Peter Levart <[hidden email]> wrote:

>I would like 1st to understand why the MH created in the TestLinker.convertToType is actually GCed at all.

The whole original issue was about allowing these objects to be GCd 😄.
Short story: because all data is scoped to objects created within `makeOne` method.

Longer story: It is GC'd because its reachability is dominated by the `DynamicLinker` object created on [line 96](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java#L96). It and everything it dominates (`LinkerServicesImpl` object holding the `TypeConverterFactory` object holding the `BiClassValue` objects holding the method handles) become garbage as soon as the method exits. This works because of the pains I took in `BiClassValue` to ensure we don't leak any of the infrastructural objects through implicit `this$0` outer-class references into the `ClassValue`s. That allows the `ClassValue` objects to be GCd and removed from the classes' class-value mapping. This ordinarily doesn't happen as most `ClassValue` objects are bound to static fields, but in `TypeConverterFactory` the `BiClassValue` instances are specific to the `TypeConverterFactory` instance, so they are expected to be reclaimable by G
 C if the converter factory itself goes away.

 > And after that, why it is not GCed before 12 invocations to makeOne() are made.

Because GC is at liberty to delay recognizing an object is phantom reachable? I don't think we need to read too much into this? Correct me if I don't see something.

> What would be more interesting to test is to create a converter between a type loaded by a custom (child) ClassLoader and a system type. After that, release all hard references to the custom ClassLoader and see if the converter gets GC-ed as a result. WDYT?

That does sound like [TypeConverterFactoryRetentionTests.testSystemLoaderToOtherLoader](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryRetentionTests.java#L118) True, it tests whether the _class loader_ itself gets released, not the converter, but the loader would hardly get released while there's a converter with the class from that loader bound in the parameter signature of its method handle, wouldn't it?

Regardless, if you think there's a valid use case for this additional test, we can discuss it. I'd rather scope this issue to fixing the flakiness of the tests, so they don't cause trouble with builds and can be backported/added to ongoing backports.

Thanks for taking time to look into this!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Aleksey Shipilev-5
In reply to this post by Attila Szegedi-3
On Thu, 18 Feb 2021 19:31:02 GMT, Attila Szegedi <[hidden email]> wrote:

>> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"
>
> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

The good test would be trying with all current GCs (Serial, Parallel, G1, Shenandoah, ZGC):

make run-test TEST=jdk/dynalink TEST_VM_OPTS=-XX:+UseSerialGC

Also, make sure that GH actions are able to run this test. You probably need to enable the workflow here -- https://github.com/szegedi/jdk/actions -- and then trigger the workflow manually on your branch (or push another commit). Then "Checks" tab would show the results.

test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 28:

> 26:  * @bug 8198540
> 27:  * @summary Test TypeConverterFactory is not leaking method handles
> 28:  * @run main/othervm -Xmx4M TypeConverterFactoryMemoryLeakTest

I think `-Xmx4m` is risking it on some platforms that cannot go that low heap. Maybe do 128M, and bulk up the test allocations, so that GC definitely triggers?

test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 79:

> 77:
> 78:     public static void main(String[] args) {
> 79:         for (int count = 0; count < MAX_ITERATIONS; ++count) {

Here and later: use postfix `count++`, regular style?

-------------

Changes requested by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Peter Levart-3
On Thu, 25 Feb 2021 15:37:39 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
>
> The good test would be trying with all current GCs (Serial, Parallel, G1, Shenandoah, ZGC):
>
> make run-test TEST=jdk/dynalink TEST_VM_OPTS=-XX:+UseSerialGC
>
> Also, make sure that GH actions are able to run this test. You probably need to enable the workflow here -- https://github.com/szegedi/jdk/actions -- and then trigger the workflow manually on your branch (or push another commit). Then "Checks" tab would show the results.

> > I would like 1st to understand why the MH created in the TestLinker.convertToType is actually GCed at all.
>
> The whole original issue was about allowing these objects to be GCd smile.
> Short story: because all data is scoped to objects created within `makeOne` method.

Right, below description helped me understand the "chain of domination - i.e. hard references that keep the BiClassValueRoot (extends ClassValue) reachable". That doesn't explain fully why converters are not GC-ed as soon as the associated BiClassValueRoot is eligible for GC. The secret lies in the implementation of ClassValue...

>
> Longer story: It is GC'd because its reachability is dominated by the `DynamicLinker` object created on [line 96](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java#L96). It and everything it dominates (`LinkerServicesImpl` object holding the `TypeConverterFactory` object holding the `BiClassValue` objects holding the method handles) become garbage as soon as the method exits.

That is not entirely true. What becomes garbage is the objects leading to the BiClassValueRoot (extends ClassValue) instance including the BiClassValueRoot instance. But it is not the ClassValue<T> instance that keeps the associated T value(s) reachable. The associated values are strongly reachable from the associated Class<?> instances. The Class class contains an instance field of type ClassValueMap which is responsible for keeping the associated values. This answers my question why the associated converter is GCed only after 12 new converters are created for the same source/target type pair. It is not because creating new converters would cause memory pressure and consequently trigger GC (I tried adding explicit System.gc() calls and they did not expedite the GCing of the converter). It is because ClassValue<T> logic expunges associated T values from Class.ClassValueMap only at times when it has to re-arrange the map's internal structures and that eventually happens on new inserti
 ons for the same Class instance and different ClassValue instance.

That doesn't mean type converter caching is flawed in any way. It just means that some converters will be retained even though they are practically not "reachable" any more through the mechanism of caching. That doesn't mean that any ClassLoader leaks are possible either.

> This works because of the pains I took in `BiClassValue` to ensure we don't leak any of the infrastructural objects through implicit `this$0` outer-class references into the `ClassValue`s. That allows the `ClassValue` objects to be GCd and removed from the classes' class-value mapping. This ordinarily doesn't happen as most `ClassValue` objects are bound to static fields, but in `TypeConverterFactory` the `BiClassValue` instances are specific to the `TypeConverterFactory` instance, so they are expected to be reclaimable by GC if the converter factory itself goes away.
>
> > And after that, why it is not GCed before 12 invocations to makeOne() are made.
>
> Because GC is at liberty to delay recognizing an object is phantom reachable? I don't think we need to read too much into this? Correct me if I don't see something.

As described above, GC has nothing to do with this "delay". It is caused by the way ClassValue keeps associated values reachable from the associated Class instances (that appear semantically as keys, but in fact it's the other way arround - ClassValue instances are weakly reachable keys and Class instances hold the maps).

>
> > What would be more interesting to test is to create a converter between a type loaded by a custom (child) ClassLoader and a system type. After that, release all hard references to the custom ClassLoader and see if the converter gets GC-ed as a result. WDYT?
>
> That does sound like [TypeConverterFactoryRetentionTests.testSystemLoaderToOtherLoader](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryRetentionTests.java#L118) True, it tests whether the _class loader_ itself gets released, not the converter, but the loader would hardly get released while there's a converter with the class from that loader bound in the parameter signature of its method handle, wouldn't it?

Yes, that test is exactly what I was thinking about.

>
> Regardless, if you think there's a valid use case for this additional test, we can discuss it. I'd rather scope this issue to fixing the flakiness of the tests, so they don't cause trouble with builds and can be backported/added to ongoing backports.
>
> Thanks for taking time to look into this!

I think the tests cover what needs to be covered. I would just iterate the same test over all GC implementations (as Aleksey already suggested) and instead of minimizing the heap size (which could cause problems in some GC implementations as they evolve), insert explicit System.gc() calls in the loops to trigger GC cycles that also process Reference(s). I tried that with all current CG implementations (SerialGC, G1GC, ParallelGC, ZGC and ShenandoahGC) and they all respect System.gc() calls. Even though I increased heap size to 128M, the 1st associated converter always got GCed after 12 new converters were created for the same source/target class pair.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v3]

Attila Szegedi-3
In reply to this post by Attila Szegedi-3
> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2617/files
  - new: https://git.openjdk.java.net/jdk/pull/2617/files/3afec32b..f5922c48

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=01-02

  Stats: 25 lines in 2 files changed: 11 ins; 3 del; 11 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2617.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2617/head:pull/2617

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Attila Szegedi-3
In reply to this post by Aleksey Shipilev-5
On Thu, 25 Feb 2021 15:34:30 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
>
> test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 79:
>
>> 77:
>> 78:     public static void main(String[] args) {
>> 79:         for (int count = 0; count < MAX_ITERATIONS; ++count) {
>
> Here and later: use postfix `count++`, regular style?

Old habits die hard; I think I started doing this about 30 years ago when writing C code against a compiler on Atari ST that emitted more efficient MC68000 code for `++i` than for `i++`. I guess it's time to unlearn this :-) Of course, if you wanted to get the fastest code, you would've counted _decrementing to zero_ to let the compiler use the DBRA (decrement and branch) instruction.

It's funny how I _don't_ miss those days.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Attila Szegedi-3
In reply to this post by Aleksey Shipilev-5
On Thu, 25 Feb 2021 15:34:10 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
>
> test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 28:
>
>> 26:  * @bug 8198540
>> 27:  * @summary Test TypeConverterFactory is not leaking method handles
>> 28:  * @run main/othervm -Xmx4M TypeConverterFactoryMemoryLeakTest
>
> I think `-Xmx4m` is risking it on some platforms that cannot go that low heap. Maybe do 128M, and bulk up the test allocations, so that GC definitely triggers?

I removed the `-Xmx` option altogether on @plevart's suggestion to invoke `System.gc` explicitly instead. I added multiple @run directives, with all current GCs. (BTW, do I need to add `@requires vm.gc.Shenandoah` to the tests if I include Shenandoah?)

It's funny how ZGC and Shenandoah need 1 iteration less than all other GCs. If a test needs 12 iterations with most GCs, ZGC and Shenandoah need 11. If a test needs 2 iterations with others, these two will get it done in 1.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Attila Szegedi-3
In reply to this post by Peter Levart-3
On Fri, 26 Feb 2021 08:03:26 GMT, Peter Levart <[hidden email]> wrote:

>> The good test would be trying with all current GCs (Serial, Parallel, G1, Shenandoah, ZGC):
>>
>> make run-test TEST=jdk/dynalink TEST_VM_OPTS=-XX:+UseSerialGC
>>
>> Also, make sure that GH actions are able to run this test. You probably need to enable the workflow here -- https://github.com/szegedi/jdk/actions -- and then trigger the workflow manually on your branch (or push another commit). Then "Checks" tab would show the results.
>
>> > I would like 1st to understand why the MH created in the TestLinker.convertToType is actually GCed at all.
>>
>> The whole original issue was about allowing these objects to be GCd smile.
>> Short story: because all data is scoped to objects created within `makeOne` method.
>
> Right, below description helped me understand the "chain of domination - i.e. hard references that keep the BiClassValueRoot (extends ClassValue) reachable". That doesn't explain fully why converters are not GC-ed as soon as the associated BiClassValueRoot is eligible for GC. The secret lies in the implementation of ClassValue...
>
>>
>> Longer story: It is GC'd because its reachability is dominated by the `DynamicLinker` object created on [line 96](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java#L96). It and everything it dominates (`LinkerServicesImpl` object holding the `TypeConverterFactory` object holding the `BiClassValue` objects holding the method handles) become garbage as soon as the method exits.
>
> That is not entirely true. What becomes garbage is the objects leading to the BiClassValueRoot (extends ClassValue) instance including the BiClassValueRoot instance. But it is not the ClassValue<T> instance that keeps the associated T value(s) reachable. The associated values are strongly reachable from the associated Class<?> instances. The Class class contains an instance field of type ClassValueMap which is responsible for keeping the associated values. This answers my question why the associated converter is GCed only after 12 new converters are created for the same source/target type pair. It is not because creating new converters would cause memory pressure and consequently trigger GC (I tried adding explicit System.gc() calls and they did not expedite the GCing of the converter). It is because ClassValue<T> logic expunges associated T values from Class.ClassValueMap only at times when it has to re-arrange the map's internal structures and that eventually happens on new inser
 tions for the same Class instance and different ClassValue instance.

>
> That doesn't mean type converter caching is flawed in any way. It just means that some converters will be retained even though they are practically not "reachable" any more through the mechanism of caching. That doesn't mean that any ClassLoader leaks are possible either.
>
>> This works because of the pains I took in `BiClassValue` to ensure we don't leak any of the infrastructural objects through implicit `this$0` outer-class references into the `ClassValue`s. That allows the `ClassValue` objects to be GCd and removed from the classes' class-value mapping. This ordinarily doesn't happen as most `ClassValue` objects are bound to static fields, but in `TypeConverterFactory` the `BiClassValue` instances are specific to the `TypeConverterFactory` instance, so they are expected to be reclaimable by GC if the converter factory itself goes away.
>>
>> > And after that, why it is not GCed before 12 invocations to makeOne() are made.
>>
>> Because GC is at liberty to delay recognizing an object is phantom reachable? I don't think we need to read too much into this? Correct me if I don't see something.
>
> As described above, GC has nothing to do with this "delay". It is caused by the way ClassValue keeps associated values reachable from the associated Class instances (that appear semantically as keys, but in fact it's the other way arround - ClassValue instances are weakly reachable keys and Class instances hold the maps).
>
>>
>> > What would be more interesting to test is to create a converter between a type loaded by a custom (child) ClassLoader and a system type. After that, release all hard references to the custom ClassLoader and see if the converter gets GC-ed as a result. WDYT?
>>
>> That does sound like [TypeConverterFactoryRetentionTests.testSystemLoaderToOtherLoader](https://github.com/openjdk/jdk/blob/3afec32bdf49703bb3537b36248f0550caae6d87/test/jdk/jdk/dynalink/TypeConverterFactoryRetentionTests.java#L118) True, it tests whether the _class loader_ itself gets released, not the converter, but the loader would hardly get released while there's a converter with the class from that loader bound in the parameter signature of its method handle, wouldn't it?
>
> Yes, that test is exactly what I was thinking about.
>
>>
>> Regardless, if you think there's a valid use case for this additional test, we can discuss it. I'd rather scope this issue to fixing the flakiness of the tests, so they don't cause trouble with builds and can be backported/added to ongoing backports.
>>
>> Thanks for taking time to look into this!
>
> I think the tests cover what needs to be covered. I would just iterate the same test over all GC implementations (as Aleksey already suggested) and instead of minimizing the heap size (which could cause problems in some GC implementations as they evolve), insert explicit System.gc() calls in the loops to trigger GC cycles that also process Reference(s). I tried that with all current CG implementations (SerialGC, G1GC, ParallelGC, ZGC and ShenandoahGC) and they all respect System.gc() calls. Even though I increased heap size to 128M, the 1st associated converter always got GCed after 12 new converters were created for the same source/target class pair.

> … It is because ClassValue logic expunges associated T values from Class.ClassValueMap only at times when it has to re-arrange the map's internal structures and that eventually happens on new insertions for the same Class instance and different ClassValue instance.

When I diagnosed the original issue by taking the heap dump and tracking the undesirable reachability from GC roots, I fixed it, then wrote the tests, and could see that before my changes the tests failed, and after the changes they passed, so stuff is no longer retained by class values – good enough for me, done! You on the other hand, need to understand _why 12 iterations_. I sincerely admire how you're methodically going after the details and need to understand the whole picture.

> I think the tests cover what needs to be covered. I would just iterate the same test over all GC implementations (as Aleksey already suggested) and instead of minimizing the heap size (which could cause problems in some GC implementations as they evolve), insert explicit System.gc() calls in the loops to trigger GC cycles that also process Reference(s). I tried that with all current CG implementations (SerialGC, G1GC, ParallelGC, ZGC and ShenandoahGC) and they all respect System.gc() calls. Even though I increased heap size to 128M, the 1st associated converter always got GCed after 12 new converters were created for the same source/target class pair.

Done: added `@run` directives for all GCs, and in fact removed the `-Xmx` as it is not relevant; memory leak test succeeds after 12 iterations (11 with Shenandoah and ZGC) and the retention tests succeed after 2 iterations (or 1 with with Shenandoah and ZGC).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v3]

Aleksey Shipilev-5
In reply to this post by Attila Szegedi-3
On Sat, 27 Feb 2021 20:01:10 GMT, Attila Szegedi <[hidden email]> wrote:

>> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"
>
> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
>   8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC

Changes requested by shade (Reviewer).

test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 32:

> 30:  * @run main/othervm -XX:+UseParallelGC TypeConverterFactoryMemoryLeakTest
> 31:  * @run main/othervm -XX:+UseZGC TypeConverterFactoryMemoryLeakTest
> 32:  * @run main/othervm -XX:+UseShenandoahGC TypeConverterFactoryMemoryLeakTest

Ah, here is a test trivia. Some configurations do not have either ZGC or Shenandoah. So you need to check GC availabilty before adding `@run`. For consistency, checking every GC availability is even better. Usually done by splitting the `@test` blocks, and adding `@requires` tags: https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpCompressedTest.java#L37-L107

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v2]

Aleksey Shipilev-5
In reply to this post by Attila Szegedi-3
On Sat, 27 Feb 2021 20:10:15 GMT, Attila Szegedi <[hidden email]> wrote:

>> test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 79:
>>
>>> 77:
>>> 78:     public static void main(String[] args) {
>>> 79:         for (int count = 0; count < MAX_ITERATIONS; ++count) {
>>
>> Here and later: use postfix `count++`, regular style?
>
> Old habits die hard; I think I started doing this about 30 years ago when writing C code against a compiler on Atari ST that emitted more efficient MC68000 code for `++i` than for `i++`. I guess it's time to unlearn this :-) Of course, if you wanted to get the fastest code, you would've counted _decrementing to zero_ to let the compiler use the DBRA (decrement and branch) instruction.
>
> It's funny how I _don't_ miss those days.

Good!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v4]

Attila Szegedi-3
In reply to this post by Attila Szegedi-3
> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:

  8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC

-------------

Changes: https://git.openjdk.java.net/jdk/pull/2617/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2617&range=03
  Stats: 94 lines in 3 files changed: 71 ins; 10 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2617.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2617/head:pull/2617

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v3]

Attila Szegedi-3
In reply to this post by Aleksey Shipilev-5
On Sun, 28 Feb 2021 07:20:19 GMT, Aleksey Shipilev <[hidden email]> wrote:

>> Attila Szegedi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
>
> test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 61:
>
>> 59:     // With explicit GC calls succeeds in 11-12 iterations depending on GC used.
>> 60:     // 1000 should be a safe upper limit after which we can consider it failed.
>> 61:     private static final int MAX_ITERATIONS = 1000;
>
> Ah, here is a test trivia. Some configurations do not have either ZGC or Shenandoah. So you need to check GC availabilty before adding `@run`. For consistency, checking every GC availability is even better. Usually done by splitting the `@test` blocks, and adding `@requires` tags: https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpCompressedTest.java#L37-L107

Thanks, that's an excellent suggestion! Implemented it now.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v4]

Aleksey Shipilev-5
In reply to this post by Attila Szegedi-3
On Sun, 28 Feb 2021 10:28:56 GMT, Attila Szegedi <[hidden email]> wrote:

>> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"
>
> Attila Szegedi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision:
>
>   8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC

Good. Minor suggestion follows.

test/jdk/jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java line 25:

> 23:
> 24: /*
> 25:  * @test id=with_SerialGC

No need to be that explicit ("with_SerialGC"). "id=serial" should be enough.

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v4]

Peter Levart-3
In reply to this post by Attila Szegedi-3
On Sun, 28 Feb 2021 10:28:56 GMT, Attila Szegedi <[hidden email]> wrote:

>> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"
>
> Attila Szegedi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>
>   8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC

I think this looks good to go, Attila.

-------------

Marked as reviewed by plevart (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now" [v4]

Attila Szegedi-3
On Mon, 1 Mar 2021 11:09:31 GMT, Peter Levart <[hidden email]> wrote:

>> Attila Szegedi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision:
>>
>>   8261483: Eliminate flakiness of the tests by using iteration number limit and explicitly running GC
>
> I think this looks good to go, Attila.

Thank you both @plevart and @shipilev for the help and suggestions!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617
Reply | Threaded
Open this post in threaded view
|

Integrated: 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Attila Szegedi-3
In reply to this post by Attila Szegedi-3
On Wed, 17 Feb 2021 19:44:35 GMT, Attila Szegedi <[hidden email]> wrote:

> 8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

This pull request has now been integrated.

Changeset: d185a6c5
Author:    Attila Szegedi <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/d185a6c5
Stats:     94 lines in 3 files changed: 71 ins; 10 del; 13 mod

8261483: jdk/dynalink/TypeConverterFactoryMemoryLeakTest.java failed with "AssertionError: Should have GCd a method handle by now"

Reviewed-by: shade, plevart

-------------

PR: https://git.openjdk.java.net/jdk/pull/2617