RFR: 8264760: JVM crashes when two threads encounter the same resolution error

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8264760: JVM crashes when two threads encounter the same resolution error

Wang Huang
As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.

-------------

Commit messages:
 - 8264760: JVM crashes when two threads encounter the same resolution error

Changes: https://git.openjdk.java.net/jdk/pull/3392/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3392&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264760
  Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3392.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3392/head:pull/3392

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v2]

Wang Huang
> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.

Wang Huang has updated the pull request incrementally with one additional commit since the last revision:

  add test cases

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3392/files
  - new: https://git.openjdk.java.net/jdk/pull/3392/files/4774b25f..7d7c7442

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3392&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3392&range=00-01

  Stats: 196 lines in 2 files changed: 196 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3392.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3392/head:pull/3392

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

Wang Huang
In reply to this post by Wang Huang
> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.

Wang Huang has updated the pull request incrementally with one additional commit since the last revision:

  add throw exception

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3392/files
  - new: https://git.openjdk.java.net/jdk/pull/3392/files/7d7c7442..b11b658a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3392&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3392&range=01-02

  Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3392.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3392/head:pull/3392

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

David Holmes-2
On Thu, 8 Apr 2021 04:38:37 GMT, Wang Huang <[hidden email]> wrote:

>> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   add throw exception

Hi Wang,

Thanks for reporting this and offering to fix it. I have a comment on the actual fix and a few issues with the testcase - though thanks for adding one. With a race condition like this it is hard to write a test that will fail reliably without the fix.

Thanks,
David

src/hotspot/share/classfile/systemDictionary.cpp line 1939:

> 1937:     ResolutionErrorEntry* entry = resolution_errors()->find_entry(index, hash, pool, which);
> 1938:     if (entry != NULL && entry->nest_host_error() == NULL) {
> 1939:       entry->set_nest_host_error(message);

You should never find an entry where the nest_host_error() is NULL. If there were such an entry then it implies a regular resolution error occurred, but that means we cannot have reached this code as `klass_at` in our caller would have thrown the exception. So this should simply be:
if (entry != NULL) {
  assert(entry->nest_host_error() != NULL, "cannot have a NULL error at this point");
}
You also need to delete part of the method comment:
//... If an entry already exists it will
// be updated with the nest host error message.
as we don't do that.

test/hotspot/jtreg/runtime/Nestmates/membership/HostNoNestMember.jcod line 1:

> 1: class HostNoNestMember {

This new file needs a copyright notice and GPL header.

Also please add a comment stating what this jcod file describes - see other jcod files in the directory for example.

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 38:

> 36: import java.util.concurrent.CountDownLatch;
> 37:
> 38: class HostNoNestMember {

Please add a comment describing how this class is replaced with the jcode version, and the difference in that version.

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 43:

> 41:   }
> 42:
> 43:   public int foo() {

Please call this "test".

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 53:

> 51:   public static void main(String args[]) throws Throwable {
> 52:     TestNestHostErrorWithMultiThread t = new TestNestHostErrorWithMultiThread();
> 53:     t.test();

There is no need to create an instance  - just use a static test method.

The test code could be placed directly in main in this case.

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 58:

> 56:   public void test() throws Throwable {
> 57:
> 58:     CountDownLatch latch = new CountDownLatch(1);

The use of CDL doesn't guarantee that both threads have reached the await() before the main thread does the countDown(). If you use a CyclicBarrier instead it will trip when the second thread arrives.

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 62:

> 60:     new Thread(() -> {
> 61:       try {
> 62:         latch.await();

Please add a comment:

// Try to have all threads trigger the nesthost check at the same time

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 64:

> 62:         latch.await();
> 63:         HostNoNestMember h = new HostNoNestMember();
> 64:         h.foo();

Please follow this by:
`throw new RuntimeException("Did not get expected IllegalAccessError");`

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 68:

> 66:         System.out.println("OK - got expected exception: " + expected);
> 67:       } catch (InterruptedException e) {}
> 68:     }).start();

Please extract the logic to a Runnable so that both threads can use that one runnable rather than duplicating the code.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

David Holmes-2
In reply to this post by Wang Huang
On Thu, 8 Apr 2021 04:38:37 GMT, Wang Huang <[hidden email]> wrote:

>> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   add throw exception

test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 51:

> 49: public class TestNestHostErrorWithMultiThread {
> 50:
> 51:   public static void main(String args[]) throws Throwable {

"throws Throwable" is only necessary if you want unlisted checked exceptions to propagate.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

Harold Seigel-2
In reply to this post by Wang Huang
On Thu, 8 Apr 2021 04:38:37 GMT, Wang Huang <[hidden email]> wrote:

>> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   add throw exception

Class HostNoNestMember could be removed entirely from class TestNestHostErrorWithMultiThread by doing the following:

Generate a jcod representation for class HostNoNestMember$Member and add it to HostNoNestMember.jcod.

Delete class HostNoNestMember from TestNestHostErrorWithMultiThread.java.

Change the test to compile HostNoNestMember.jcod before compiling TestNestHostErrorWithMultiThread.java.


Doing this removes the confusion of class HostNoNestMember being compiled twice.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

David Holmes-2
In reply to this post by David Holmes-2
On Thu, 8 Apr 2021 04:40:05 GMT, David Holmes <[hidden email]> wrote:

>> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   add throw exception
>
> src/hotspot/share/classfile/systemDictionary.cpp line 1939:
>
>> 1937:     ResolutionErrorEntry* entry = resolution_errors()->find_entry(index, hash, pool, which);
>> 1938:     if (entry != NULL && entry->nest_host_error() == NULL) {
>> 1939:       entry->set_nest_host_error(message);
>
> You should never find an entry where the nest_host_error() is NULL. If there were such an entry then it implies a regular resolution error occurred, but that means we cannot have reached this code as `klass_at` in our caller would have thrown the exception. So this should simply be:
> if (entry != NULL) {
>   assert(entry->nest_host_error() != NULL, "cannot have a NULL error at this point");
> }
> You also need to delete part of the method comment:
> //... If an entry already exists it will
> // be updated with the nest host error message.
> as we don't do that.

Sorry the above is not correct. I believe the code you have is correct. I will double-check some details then update my review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

David Holmes-2
In reply to this post by Wang Huang
On Thu, 8 Apr 2021 04:38:37 GMT, Wang Huang <[hidden email]> wrote:

>> As shown in JDK-8264760, I changed notes with @dholmes-ora and only fixed this issue by deleting the assert. The other whole bigger big will be fixed in the other issue.
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   add throw exception

I think the code you have is correct. I've suggested some additional commentary. There are a couple of races possible here, but I think the remaining ones are benign and we do not need to try and impose any stricter synchronization.

Thanks,
David

src/hotspot/share/classfile/systemDictionary.cpp line 1939:

> 1937:     ResolutionErrorEntry* entry = resolution_errors()->find_entry(index, hash, pool, which);
> 1938:     if (entry != NULL && entry->nest_host_error() == NULL) {
> 1939:       entry->set_nest_host_error(message);

Please add a comment before line 1939:
// An existing entry means we had a true resolution failure (LinkageError) with our nest host, but we
// still want to add the error message for the higher-level access checks to report. We should
// only reach here under the same error condition, so we can ignore the potential race with setting
// the message. If we see it is already set then we can ignore it.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

Wang Huang
In reply to this post by David Holmes-2
On Thu, 8 Apr 2021 05:02:27 GMT, David Holmes <[hidden email]> wrote:

>> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   add throw exception
>
> test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 58:
>
>> 56:   public void test() throws Throwable {
>> 57:
>> 58:     CountDownLatch latch = new CountDownLatch(1);
>
> The use of CDL doesn't guarantee that both threads have reached the await() before the main thread does the countDown(). If you use a CyclicBarrier instead it will trip when the second thread arrives.

I have tested `CyclicBarrier ` here and found that `CountDownLatch` could trigger this bug fast while `CyclicBarrier ` couldn't always trigger this bug.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

Wang Huang
In reply to this post by David Holmes-2
On Thu, 8 Apr 2021 05:02:27 GMT, David Holmes <[hidden email]> wrote:

>> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   add throw exception
>
> test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 58:
>
>> 56:   public void test() throws Throwable {
>> 57:
>> 58:     CountDownLatch latch = new CountDownLatch(1);
>
> The use of CDL doesn't guarantee that both threads have reached the await() before the main thread does the countDown(). If you use a CyclicBarrier instead it will trip when the second thread arrives.

@dholmes-ora

-------------

PR: https://git.openjdk.java.net/jdk/pull/3392
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264760: JVM crashes when two threads encounter the same resolution error [v3]

David Holmes
On 17/04/2021 5:42 pm, Wang Huang wrote:

> On Thu, 8 Apr 2021 05:02:27 GMT, David Holmes <[hidden email]> wrote:
>
>>> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>>>
>>>    add throw exception
>>
>> test/hotspot/jtreg/runtime/Nestmates/membership/TestNestHostErrorWithMultiThread.java line 58:
>>
>>> 56:   public void test() throws Throwable {
>>> 57:
>>> 58:     CountDownLatch latch = new CountDownLatch(1);
>>
>> The use of CDL doesn't guarantee that both threads have reached the await() before the main thread does the countDown(). If you use a CyclicBarrier instead it will trip when the second thread arrives.
>
> @dholmes-ora

Not sure what commentary you are looking for. CyclicBarrier should work
better for ensuring concurrent attempts in theory.

David

> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3392
>