Quantcast

RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Greetings,

 

Kindly asking for reviews for the following changeset:

 

Bug: https://bugs.openjdk.java.net/browse/JDK-8175178 

Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/ 

 

Summary:

 

vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).

#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624, tid=3640
# assert(false) failed: invalid bci or invalid scope desc
#

There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.

 

Thanks

Markus
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Daniel D. Daugherty
On 2/17/17 5:29 AM, Markus Gronlund wrote:

> Greetings,
>
>  
>
> Kindly asking for reviews for the following changeset:
>
>  
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/

src/cpu/x86/vm/templateTable_x86.cpp
     (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  
// restore target bytecode
         I was a little concerned about not restoring the target
         bytecode into the 'rbx' register, but when I looked at
         dispatch code:

         L2196:     __ bind(dispatch);
         L2197:   }
         L2198:
         L2199:   // Pre-load the next target bytecode into rbx
         L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));

         so it looks like rbx gets the target bytecode just fine.

     L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
restore target bytecode
     L2214:       __ set_method_data_pointer_for_bcp();
     L2215:       __ jmp(dispatch);
         I think the "restore target bytecode" here is also redundant.

         src/cpu/x86/vm/interp_masm_x86.cpp:

InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
           <snip>

           push(rbx);

           get_method(rbx);

         src/cpu/x86/vm/interp_masm_x86.hpp:

         void get_method(Register reg) {
           movptr(reg, Address(rbp,
frame::interpreter_frame_method_offset * wordSize));
         }

         set_method_data_pointer_for_bcp() doesn't use the rbx
         value that we bothered to restore.

OK, so on 64-bit the code saved the nmethod before the call to
SharedRuntime::OSR_migration_begin() and it was using 'r13'
which is the register we use for 'bcp' in 64-bit. This use of
r13/bcp was visible to stack walkers (because of save_bcp()) and
caused the assert() failure. What's the failure mode in release
bits?

This is outstanding sleuthing!

You've switched the save to use 'rbx' on both 64-bit and 32-bit
and you've removed stale code that was using 'rbx' for saving
the target bytecode unnecessarily.

Thumbs up on this change!

Please don't forget to update the copyright before you push.

Dan

P.S.
It feels like we're missing some infrastructure to prevent
the accidental use of 'r13'. We have other "special" registers
that we guard against being used... Perhaps we need something
for 'r13'.


>
>  
>
> Summary:
>
>  
>
> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624, tid=3640
> # assert(false) failed: invalid bci or invalid scope desc
> #
>
> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>
>  
>
> Thanks
>
> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Thanks a lot Dan for taking a look at this.

The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.

Thanks also for pointing to the other bytecode restore location (@L2213).

I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).

In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.

Updated webrev:

http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/

Regarding:

"It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."

I agree with you, lets see what we can do to improve this.

Thanks again
Markus


-----Original Message-----
From: Daniel D. Daugherty
Sent: den 17 februari 2017 18:51
To: Markus Gronlund; [hidden email]
Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

On 2/17/17 5:29 AM, Markus Gronlund wrote:

> Greetings,
>
>  
>
> Kindly asking for reviews for the following changeset:
>
>  
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/

src/cpu/x86/vm/templateTable_x86.cpp
     (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  
// restore target bytecode
         I was a little concerned about not restoring the target
         bytecode into the 'rbx' register, but when I looked at
         dispatch code:

         L2196:     __ bind(dispatch);
         L2197:   }
         L2198:
         L2199:   // Pre-load the next target bytecode into rbx
         L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));

         so it looks like rbx gets the target bytecode just fine.

     L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
restore target bytecode
     L2214:       __ set_method_data_pointer_for_bcp();
     L2215:       __ jmp(dispatch);
         I think the "restore target bytecode" here is also redundant.

         src/cpu/x86/vm/interp_masm_x86.cpp:

InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
           <snip>

           push(rbx);

           get_method(rbx);

         src/cpu/x86/vm/interp_masm_x86.hpp:

         void get_method(Register reg) {
           movptr(reg, Address(rbp,
frame::interpreter_frame_method_offset * wordSize));
         }

         set_method_data_pointer_for_bcp() doesn't use the rbx
         value that we bothered to restore.

OK, so on 64-bit the code saved the nmethod before the call to
SharedRuntime::OSR_migration_begin() and it was using 'r13'
which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?

This is outstanding sleuthing!

You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.

Thumbs up on this change!

Please don't forget to update the copyright before you push.

Dan

P.S.
It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.


>
>  
>
> Summary:
>
>  
>
> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error
> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>
> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>
>  
>
> Thanks
>
> Markus

Loading...