RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive


RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

Yi Yang
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/qingfeng.yy/openjdk16_so_warning/jdk/src/hotspot/share/classfile/classLoaderData.cpp:316), pid=68929, tid=68930
#  assert(_keep_alive > 0) failed: Invalid keep alive decrement count
#
# JRE version: OpenJDK Runtime Environment (17.0) (slowdebug build 17-internal+0-adhoc.qingfengyy.jdk)
# Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.qingfengyy.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x781087]  ClassLoaderData::dec_keep_alive()+0x31

Stack: [0x00007f1593072000,0x00007f1593173000],  sp=0x00007f1593171c00,  free space=1023k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x781087]  ClassLoaderData::dec_keep_alive()+0x31
V  [libjvm.so+0xef19e7]  MetaspaceShared::link_and_cleanup_shared_classes(Thread*)+0x181
V  [libjvm.so+0x1260834]  JavaThread::invoke_shutdown_hooks()+0x46
V  [libjvm.so+0x12609e5]  Threads::destroy_vm()+0xe7
V  [libjvm.so+0xbb40ec]  jni_DestroyJavaVM_inner+0x91
V  [libjvm.so+0xbb4147]  jni_DestroyJavaVM+0x1f
C  [libjli.so+0x4b4f]  JavaMain+0xc61
C  [libjli.so+0xad93]  ThreadJavaMain+0x27

We observed a VM crash when dumping a dynamic archive in a simple Spring Boot application (see the JBS attachment for details). After some investigation, I found that in rare cases both of the following paths can be taken while dumping the dynamic archive:

1. SIGINT
at java.lang.Shutdown.beforeHalt(java.base@17-internal/Native Method)
at java.lang.Shutdown.exit(java.base@17-internal/Shutdown.java:172)
- locked <0x00000007fef02040> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Terminator$1.handle(java.base@17-internal/Terminator.java:51)
at jdk.internal.misc.Signal$1.run(java.base@17-internal/Signal.java:219)
at java.lang.Thread.run(java.base@17-internal/Thread.java:831)

2. Normal Exit
JavaThread::invoke_shutdown_hooks()+0x46
Threads::destroy_vm()+0xe7
jni_DestroyJavaVM_inner+0x91
jni_DestroyJavaVM+0x1f
JavaMain+0xc61
ThreadJavaMain+0x27

Both paths call MetaspaceShared::link_and_cleanup_shared_classes. CollectCLDClosure then collects duplicated CLDs into _loaded_cld, so _keep_alive is decremented twice for the same CLD. The count would go negative, which trips the assert shown above.
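
The double decrement can be reproduced in miniature. The following is a minimal sketch, not HotSpot code: `MockCLD`, `g_loaded_cld`, `link_and_cleanup_buggy`, and `second_call_underflows` are hypothetical names, the two shutdown paths are modeled as two sequential calls, and the assert is replaced by an `underflow` flag so the sketch can run to completion.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ClassLoaderData; only the keep-alive counter is modeled.
struct MockCLD {
  int keep_alive = 0;
  bool underflow = false;  // set where HotSpot would hit assert(_keep_alive > 0)
  void inc_keep_alive() { ++keep_alive; }
  void dec_keep_alive() {
    if (keep_alive <= 0) { underflow = true; return; }
    --keep_alive;
  }
};

// Models a list that outlives a single invocation (assumption for illustration).
static std::vector<MockCLD*> g_loaded_cld;

// Models one invocation: pin every CLD, then unpin everything in the list.
void link_and_cleanup_buggy(std::vector<MockCLD>& graph) {
  for (MockCLD& cld : graph) {
    cld.inc_keep_alive();
    g_loaded_cld.push_back(&cld);  // entries from an earlier call are still present
  }
  // ... linking would happen here ...
  for (MockCLD* cld : g_loaded_cld) {
    cld->dec_keep_alive();         // duplicated entries are decremented twice
  }
  // the list is not emptied before returning
}

// Calls the function twice, as the SIGINT path and the normal exit path would.
bool second_call_underflows() {
  g_loaded_cld.clear();
  std::vector<MockCLD> graph(2);
  link_and_cleanup_buggy(graph);   // first call: one inc, one dec per CLD
  link_and_cleanup_buggy(graph);   // second call: one inc, two decs per CLD
  return graph[0].underflow && graph[1].underflow;
}
```

In this model the first call is balanced; only the second call sees each CLD twice in the list, which matches the observation that the crash is rare and needs both paths to run.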

Testing(linux_x64):
[+] test/hotspot/jtreg/runtime/cds
[+] test/hotspot/jtreg/gc

-------------

Commit messages:
 - CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

Changes: https://git.openjdk.java.net/jdk/pull/3320/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3320&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264634
  Stats: 21 lines in 2 files changed: 8 ins; 5 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3320.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3320/head:pull/3320

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v2]

Yi Yang

Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3320/files
  - new: https://git.openjdk.java.net/jdk/pull/3320/files/bdc9c723..56a47fce

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3320&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3320&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3320.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3320/head:pull/3320

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v2]

Yumin Qi-3
On Fri, 2 Apr 2021 10:21:56 GMT, Yi Yang <[hidden email]> wrote:

> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
>   CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

Hi, Yi
  The _loaded_cld is a global list; in this case it looks like it contains duplicated CLDs.
  The duplication could come from the thread that runs the shutdown hook.
  Could you try

    if (!cld->is_unloading()) {
      cld->inc_keep_alive();
  +   if (!_loaded_cld->contains(cld)) {
        _loaded_cld->append(cld);
  +   }
    }

Please let us know whether this avoids the crash.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v2]

Ioi Lam-2
In reply to this post by Yi Yang
On Fri, 2 Apr 2021 10:21:56 GMT, Yi Yang <[hidden email]> wrote:

> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
>   CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

The fix looks reasonable. If MetaspaceShared::link_and_cleanup_shared_classes may be called twice, it's better to isolate the loaded_cld for each invocation. Allocating it locally will also avoid any potential threading issues.

I have some requests for cleaning up the code.

src/hotspot/share/memory/metaspaceShared.cpp line 569:

> 567:   ResourceMark rm;
> 568:   GrowableArray<ClassLoaderData*> loaded_cld;
> 569:   CollectCLDClosure collect_cld(&loaded_cld);

I think we should add a comment to say why it's necessary to collect the ClassLoaderDatas first:

// ClassLoaderDataGraph::loaded_cld_do requires ClassLoaderDataGraph_lock.
// We cannot link the classes while holding this lock (or else we may run into deadlock).
// Therefore, we need to first collect all the CLDs, and then link their classes after
// releasing the lock.
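
The collect-then-link pattern described in that comment can be sketched as follows. This is an illustration only: `MockLock`, `MockCLD`, `link_classes`, and `collect_then_link` are hypothetical names, and the idea that linking re-acquires the same lock is an assumption chosen to make the deadlock risk visible (the mock lock asserts instead of hanging).

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical non-reentrant lock standing in for ClassLoaderDataGraph_lock.
// Re-acquiring it while held is reported via assert instead of deadlocking.
struct MockLock {
  bool held = false;
  void lock()   { assert(!held && "would deadlock"); held = true; }
  void unlock() { held = false; }
};

struct MockCLD {
  bool linked = false;
};

static MockLock graph_lock;
static std::vector<MockCLD> graph(3);

// Assumption for illustration: linking may need the same lock internally.
void link_classes(MockCLD* cld) {
  std::lock_guard<MockLock> guard(graph_lock);
  cld->linked = true;
}

// Phase 1 collects under the lock; phase 2 links after the lock is released.
int collect_then_link() {
  std::vector<MockCLD*> collected;
  {
    std::lock_guard<MockLock> guard(graph_lock);  // iteration requires the lock
    for (MockCLD& cld : graph) collected.push_back(&cld);
  }                                               // lock released at end of scope
  for (MockCLD* cld : collected) link_classes(cld);
  return static_cast<int>(collected.size());
}
```

Calling `link_classes` inside the locked scope would trip the mock lock's assert, which is the sketch's equivalent of the deadlock the comment warns about.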

src/hotspot/share/memory/metaspaceShared.cpp line 600:

> 598:     cld->dec_keep_alive();
> 599:   }
> 600:   loaded_cld.trunc_to(0);

There's no need for the truncate -- `loaded_cld` is locally allocated and will be freed after this function returns.

Also, to improve modularity, I think we should move the dec_keep_alive loop into the destructor of CollectCLDClosure.

Also, `loaded_cld` can be moved as a field into CollectCLDClosure.
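
A rough sketch of what these two suggestions amount to, using standard C++ containers rather than HotSpot types. `MockCLD`, `CollectCLDSketch`, and `link_and_cleanup_sketch` are hypothetical names; the code in the actual change may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ClassLoaderData.
struct MockCLD {
  int keep_alive = 0;
  bool unloading = false;
  void inc_keep_alive() { ++keep_alive; }
  void dec_keep_alive() { assert(keep_alive > 0); --keep_alive; }
};

// The closure owns its list, so every invocation starts with an empty one,
// and the destructor releases exactly the keep-alives this closure took.
class CollectCLDSketch {
  std::vector<MockCLD*> _loaded_cld;
 public:
  ~CollectCLDSketch() {
    for (MockCLD* cld : _loaded_cld) cld->dec_keep_alive();
  }
  void do_cld(MockCLD* cld) {
    if (!cld->unloading) {
      cld->inc_keep_alive();
      _loaded_cld.push_back(cld);
    }
  }
  int size() const { return static_cast<int>(_loaded_cld.size()); }
  MockCLD* at(int i) const { return _loaded_cld[i]; }
};

// Models one invocation; returns how many CLDs were pinned during it.
int link_and_cleanup_sketch(std::vector<MockCLD>& graph) {
  CollectCLDSketch collect;
  for (MockCLD& cld : graph) collect.do_cld(&cld);
  // ... link the classes of collect.at(i) for each i here ...
  return collect.size();  // value is computed first, then the destructor unpins
}
```

Because increments and decrements are paired inside one object, calling the function a second time cannot touch entries left over from the first call.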

-------------

Changes requested by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v2]

Yi Yang
In reply to this post by Yumin Qi-3
On Fri, 2 Apr 2021 21:55:20 GMT, Yumin Qi <[hidden email]> wrote:

> Hi, Yi
> The _loaded_cld is a global list, in this case it looks contain duplicated CLD in it.
> The duplication could from the thread run shutdown hook.
> Could you try
> if (!cld->is_unloading()) {
> cld->inc_keep_alive();
> `+` if (!_loaded_cld->contains(cld)) {
> _loaded_cld->append(cld);
> `+` }
> }
> Please let us know if you can avoid the crash.

Hi Yumin, this fix still crashes. The CLDs collected during the first invocation of MetaspaceShared::link_and_cleanup_shared_classes are not cleared from the list, so the second invocation decrements their _keep_alive counts again, as before.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v3]

Yi Yang
In reply to this post by Yi Yang

Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  improve modularity

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3320/files
  - new: https://git.openjdk.java.net/jdk/pull/3320/files/56a47fce..fea3c4b6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3320&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3320&range=01-02

  Stats: 26 lines in 1 file changed: 12 ins; 7 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3320.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3320/head:pull/3320

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v2]

Yi Yang
In reply to this post by Yi Yang
On Mon, 5 Apr 2021 03:45:58 GMT, Yi Yang <[hidden email]> wrote:

>> Hi, Yi
>>   The _loaded_cld is a global list, in this case it looks contain duplicated CLD in it.
>>    The duplication could from the thread run shutdown hook.
>>    Could you try
>>   if (!cld->is_unloading()) {
>>       cld->inc_keep_alive();
>> `+`     if (!_loaded_cld->contains(cld)) {
>>       _loaded_cld->append(cld);
>> `+`  }
>>     }
>> Please let us know if you can avoid the crash.
>
> Hi Yumin, this fix still crashes because the CLDs collected at the first invocation of MetaspaceShared::link_and_cleanup_shared_classes are not cleaned, they will decrement their _keep_alives as before at the second invocation of MetaspaceShared::link_and_cleanup_shared_classes.

Hi Ioi,

> Also, to improve modularity, I think we should move the dec_keep_alive loop into the destructor of CollectCLDClosure.
> Also, loaded_cld can be moved as a field into CollectCLDClosure.

The suggestions make sense; I've made the changes. All tests under runtime/cds/ pass in slowdebug mode.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v3]

Ioi Lam-2
In reply to this post by Yi Yang
On Mon, 5 Apr 2021 04:33:24 GMT, Yi Yang <[hidden email]> wrote:

> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   improve modularity

Marked as reviewed by iklam (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v3]

Yumin Qi-3
In reply to this post by Yi Yang
On Mon, 5 Apr 2021 04:33:24 GMT, Yi Yang <[hidden email]> wrote:

> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   improve modularity

Making the CLD list local is a reasonable solution. LGTM.

-------------

Marked as reviewed by minqi (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3320

Re: RFR: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive [v3]

Yi Yang
On Mon, 5 Apr 2021 15:35:11 GMT, Yumin Qi <[hidden email]> wrote:

>> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   improve modularity
>
> Making the CLD list local is a reasonable solution. LGTM.

Thanks @yminqi @iklam for the reviews!

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320

Integrated: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

Yi Yang
In reply to this post by Yi Yang
On Fri, 2 Apr 2021 08:15:38 GMT, Yi Yang <[hidden email]> wrote:

> They would call MetaspaceShared::link_and_cleanup_shared_classes, and CollectCLDClosure collects duplicated CLDs into _loaded_cld, _keep_alive is decrementing twice, causing a negative _keep_alive.
>
> Testing(linux_x64):
> [+] test/hotspot/jtreg/runtime/cds
> [+] test/hotspot/jtreg/gc

This pull request has now been integrated.

Changeset: 54b4070d
Author:    Yi Yang <[hidden email]>
Committer: Yumin Qi <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/54b4070d
Stats:     27 lines in 1 file changed: 14 ins; 7 del; 6 mod

8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive

Reviewed-by: minqi, iklam

-------------

PR: https://git.openjdk.java.net/jdk/pull/3320