Quantcast

RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Greetings,

 

Kindly asking for reviews for the following changeset:

 

Bug: https://bugs.openjdk.java.net/browse/JDK-8175178 

Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/ 

 

Summary:

 

vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).

#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624, tid=3640
# assert(false) failed: invalid bci or invalid scope desc
#

There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.

 

Thanks

Markus
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Daniel D. Daugherty
On 2/17/17 5:29 AM, Markus Gronlund wrote:

> Greetings,
>
>  
>
> Kindly asking for reviews for the following changeset:
>
>  
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/

src/cpu/x86/vm/templateTable_x86.cpp
     (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  
// restore target bytecode
         I was a little concerned about not restoring the target
         bytecode into the 'rbx' register, but when I looked at
         dispatch code:

         L2196:     __ bind(dispatch);
         L2197:   }
         L2198:
         L2199:   // Pre-load the next target bytecode into rbx
         L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));

         so it looks like rbx gets the target bytecode just fine.

     L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
restore target bytecode
     L2214:       __ set_method_data_pointer_for_bcp();
     L2215:       __ jmp(dispatch);
         I think the "restore target bytecode" here is also redundant.

         src/cpu/x86/vm/interp_masm_x86.cpp:

InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
           <snip>

           push(rbx);

           get_method(rbx);

         src/cpu/x86/vm/interp_masm_x86.hpp:

         void get_method(Register reg) {
           movptr(reg, Address(rbp,
frame::interpreter_frame_method_offset * wordSize));
         }

         set_method_data_pointer_for_bcp() doesn't use the rbx
         value that we bothered to restore.

OK, so on 64-bit the code saved the nmethod before the call to
SharedRuntime::OSR_migration_begin() and it was using 'r13'
which is the register we use for 'bcp' in 64-bit. This use of
r13/bcp was visible to stack walkers (because of save_bcp()) and
caused the assert() failure. What's the failure mode in release
bits?

This is outstanding sleuthing!

You've switched the save to use 'rbx' on both 64-bit and 32-bit
and you've removed stale code that was using 'rbx' for saving
the target bytecode unnecessarily.

Thumbs up on this change!

Please don't forget to update the copyright before you push.

Dan

P.S.
It feels like we're missing some infrastructure to prevent
the accidental use of 'r13'. We have other "special" registers
that we guard against being used... Perhaps we need something
for 'r13'.


>
>  
>
> Summary:
>
>  
>
> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624, tid=3640
> # assert(false) failed: invalid bci or invalid scope desc
> #
>
> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>
>  
>
> Thanks
>
> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Thanks a lot Dan for taking a look at this.

The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.

Thanks also for pointing to the other bytecode restore location (@L2213).

I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).

In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.

Updated webrev:

http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/

Regarding:

"It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."

I agree with you, lets see what we can do to improve this.

Thanks again
Markus


-----Original Message-----
From: Daniel D. Daugherty
Sent: den 17 februari 2017 18:51
To: Markus Gronlund; [hidden email]
Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

On 2/17/17 5:29 AM, Markus Gronlund wrote:

> Greetings,
>
>  
>
> Kindly asking for reviews for the following changeset:
>
>  
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/

src/cpu/x86/vm/templateTable_x86.cpp
     (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  
// restore target bytecode
         I was a little concerned about not restoring the target
         bytecode into the 'rbx' register, but when I looked at
         dispatch code:

         L2196:     __ bind(dispatch);
         L2197:   }
         L2198:
         L2199:   // Pre-load the next target bytecode into rbx
         L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));

         so it looks like rbx gets the target bytecode just fine.

     L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
restore target bytecode
     L2214:       __ set_method_data_pointer_for_bcp();
     L2215:       __ jmp(dispatch);
         I think the "restore target bytecode" here is also redundant.

         src/cpu/x86/vm/interp_masm_x86.cpp:

InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
           <snip>

           push(rbx);

           get_method(rbx);

         src/cpu/x86/vm/interp_masm_x86.hpp:

         void get_method(Register reg) {
           movptr(reg, Address(rbp,
frame::interpreter_frame_method_offset * wordSize));
         }

         set_method_data_pointer_for_bcp() doesn't use the rbx
         value that we bothered to restore.

OK, so on 64-bit the code saved the nmethod before the call to
SharedRuntime::OSR_migration_begin() and it was using 'r13'
which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?

This is outstanding sleuthing!

You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.

Thumbs up on this change!

Please don't forget to update the copyright before you push.

Dan

P.S.
It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.


>
>  
>
> Summary:
>
>  
>
> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error
> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>
> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>
>  
>
> Thanks
>
> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

coleen.phillimore

This change looks good to me.  Although I think this

        call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Should be:

        call_VM(rax, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Since this returns the OSR buffer in rax.


On 2/19/17 8:07 AM, Markus Gronlund wrote:

> Thanks a lot Dan for taking a look at this.
>
> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>
> Thanks also for pointing to the other bytecode restore location (@L2213).
>
> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>
> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
>
> Regarding:
>
> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."

We use r13 (and rlocals) as a scratch register in lots of places due to
not having enough registers on x86.  As long as there isn't a safepoint
(or call into the runtime as you found), this should be perfectly safe,
as long as there's a restore_bcp() or restore_locals() call.

Coleen

> I agree with you, lets see what we can do to improve this.
>
> Thanks again
> Markus
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 17 februari 2017 18:51
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>> Greetings,
>>
>>    
>>
>> Kindly asking for reviews for the following changeset:
>>
>>    
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> src/cpu/x86/vm/templateTable_x86.cpp
>       (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> // restore target bytecode
>           I was a little concerned about not restoring the target
>           bytecode into the 'rbx' register, but when I looked at
>           dispatch code:
>
>           L2196:     __ bind(dispatch);
>           L2197:   }
>           L2198:
>           L2199:   // Pre-load the next target bytecode into rbx
>           L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>
>           so it looks like rbx gets the target bytecode just fine.
>
>       L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> restore target bytecode
>       L2214:       __ set_method_data_pointer_for_bcp();
>       L2215:       __ jmp(dispatch);
>           I think the "restore target bytecode" here is also redundant.
>
>           src/cpu/x86/vm/interp_masm_x86.cpp:
>
> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>             <snip>
>
>             push(rbx);
>
>             get_method(rbx);
>
>           src/cpu/x86/vm/interp_masm_x86.hpp:
>
>           void get_method(Register reg) {
>             movptr(reg, Address(rbp,
> frame::interpreter_frame_method_offset * wordSize));
>           }
>
>           set_method_data_pointer_for_bcp() doesn't use the rbx
>           value that we bothered to restore.
>
> OK, so on 64-bit the code saved the nmethod before the call to
> SharedRuntime::OSR_migration_begin() and it was using 'r13'
> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>
> This is outstanding sleuthing!
>
> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>
> Thumbs up on this change!
>
> Please don't forget to update the copyright before you push.
>
> Dan
>
> P.S.
> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>
>
>>    
>>
>> Summary:
>>
>>    
>>
>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error
>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>
>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>
>>    
>>
>> Thanks
>>
>> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Thanks Coleen,

I will also update the "noreg" to "rax" before putback.

Thanks again
Markus

-----Original Message-----
From: Coleen Phillimore
Sent: den 22 februari 2017 22:21
To: [hidden email]
Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86


This change looks good to me.  Although I think this

        call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Should be:

        call_VM(rax, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Since this returns the OSR buffer in rax.


On 2/19/17 8:07 AM, Markus Gronlund wrote:

> Thanks a lot Dan for taking a look at this.
>
> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>
> Thanks also for pointing to the other bytecode restore location (@L2213).
>
> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>
> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
>
> Regarding:
>
> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."

We use r13 (and rlocals) as a scratch register in lots of places due to not having enough registers on x86.  As long as there isn't a safepoint (or call into the runtime as you found), this should be perfectly safe, as long as there's a restore_bcp() or restore_locals() call.

Coleen

> I agree with you, lets see what we can do to improve this.
>
> Thanks again
> Markus
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 17 februari 2017 18:51
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration
> asserts with invalid bci or invalid scope desc on x86
>
> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>> Greetings,
>>
>>    
>>
>> Kindly asking for reviews for the following changeset:
>>
>>    
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> src/cpu/x86/vm/templateTable_x86.cpp
>       (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> // restore target bytecode
>           I was a little concerned about not restoring the target
>           bytecode into the 'rbx' register, but when I looked at
>           dispatch code:
>
>           L2196:     __ bind(dispatch);
>           L2197:   }
>           L2198:
>           L2199:   // Pre-load the next target bytecode into rbx
>           L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>
>           so it looks like rbx gets the target bytecode just fine.
>
>       L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> restore target bytecode
>       L2214:       __ set_method_data_pointer_for_bcp();
>       L2215:       __ jmp(dispatch);
>           I think the "restore target bytecode" here is also redundant.
>
>           src/cpu/x86/vm/interp_masm_x86.cpp:
>
> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>             <snip>
>
>             push(rbx);
>
>             get_method(rbx);
>
>           src/cpu/x86/vm/interp_masm_x86.hpp:
>
>           void get_method(Register reg) {
>             movptr(reg, Address(rbp,
> frame::interpreter_frame_method_offset * wordSize));
>           }
>
>           set_method_data_pointer_for_bcp() doesn't use the rbx
>           value that we bothered to restore.
>
> OK, so on 64-bit the code saved the nmethod before the call to
> SharedRuntime::OSR_migration_begin() and it was using 'r13'
> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>
> This is outstanding sleuthing!
>
> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>
> Thumbs up on this change!
>
> Please don't forget to update the copyright before you push.
>
> Dan
>
> P.S.
> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>
>
>>    
>>
>> Summary:
>>
>>    
>>
>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error
>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>
>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>
>>    
>>
>> Thanks
>>
>> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Hi again Coleen,

I looked into changing "noreg" to "rax" for the call to SharedRuntime::OSR_migration_begin(), but this is not correct.

By explicitly passing rax, we are indicating the interest in receiving some kind of "managed" return value (like an oop in this example).

MacroAssembler::call_VM_base() will proceed with storing back the thread->vm_result() oop into rax, instead of leaving the platform specific return value holding the natively allocated buffer.

This is because CONSTANT_REGISTER_DECLARATION(rax) == 0 entails oop_result->is_valid() while CONSTANT_REGISTER_DECLARATION(noreg) == -1 entails !oop_result->is_valid().

Therefore I will keep the noreg in place as-is.

Thanks
Markus

-----Original Message-----
From: Markus Gronlund
Sent: den 23 februari 2017 00:55
To: Coleen Phillimore; Daniel Daugherty
Cc: [hidden email]
Subject: RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Thanks Coleen,

I will also update the "noreg" to "rax" before putback.

Thanks again
Markus

-----Original Message-----
From: Coleen Phillimore
Sent: den 22 februari 2017 22:21
To: [hidden email]
Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86


This change looks good to me.  Although I think this

        call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Should be:

        call_VM(rax, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));


Since this returns the OSR buffer in rax.


On 2/19/17 8:07 AM, Markus Gronlund wrote:

> Thanks a lot Dan for taking a look at this.
>
> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>
> Thanks also for pointing to the other bytecode restore location (@L2213).
>
> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>
> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
>
> Regarding:
>
> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."

We use r13 (and rlocals) as a scratch register in lots of places due to not having enough registers on x86.  As long as there isn't a safepoint (or call into the runtime as you found), this should be perfectly safe, as long as there's a restore_bcp() or restore_locals() call.

Coleen

> I agree with you, lets see what we can do to improve this.
>
> Thanks again
> Markus
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 17 februari 2017 18:51
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration
> asserts with invalid bci or invalid scope desc on x86
>
> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>> Greetings,
>>
>>    
>>
>> Kindly asking for reviews for the following changeset:
>>
>>    
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> src/cpu/x86/vm/templateTable_x86.cpp
>       (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> // restore target bytecode
>           I was a little concerned about not restoring the target
>           bytecode into the 'rbx' register, but when I looked at
>           dispatch code:
>
>           L2196:     __ bind(dispatch);
>           L2197:   }
>           L2198:
>           L2199:   // Pre-load the next target bytecode into rbx
>           L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>
>           so it looks like rbx gets the target bytecode just fine.
>
>       L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> restore target bytecode
>       L2214:       __ set_method_data_pointer_for_bcp();
>       L2215:       __ jmp(dispatch);
>           I think the "restore target bytecode" here is also redundant.
>
>           src/cpu/x86/vm/interp_masm_x86.cpp:
>
> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>             <snip>
>
>             push(rbx);
>
>             get_method(rbx);
>
>           src/cpu/x86/vm/interp_masm_x86.hpp:
>
>           void get_method(Register reg) {
>             movptr(reg, Address(rbp,
> frame::interpreter_frame_method_offset * wordSize));
>           }
>
>           set_method_data_pointer_for_bcp() doesn't use the rbx
>           value that we bothered to restore.
>
> OK, so on 64-bit the code saved the nmethod before the call to
> SharedRuntime::OSR_migration_begin() and it was using 'r13'
> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>
> This is outstanding sleuthing!
>
> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>
> Thumbs up on this change!
>
> Please don't forget to update the copyright before you push.
>
> Dan
>
> P.S.
> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>
>
>>    
>>
>> Summary:
>>
>>    
>>
>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error
>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>
>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>
>>    
>>
>> Thanks
>>
>> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Daniel D. Daugherty
In reply to this post by Markus Gronlund
On 2/19/17 6:07 AM, Markus Gronlund wrote:

> Thanks a lot Dan for taking a look at this.
>
> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>
> Thanks also for pointing to the other bytecode restore location (@L2213).
>
> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>
> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/

src/cpu/x86/vm/templateTable_x86.cpp
     No comments.

src/share/vm/runtime/vframe.cpp
     L467: void vframeStreamCommon::found_bad_method_frame() const {
     L468:   // 6379830 Cut point for an assertion that occasionally
fires when
     L469:   // we are using the performance analyzer.
     L470:   // Disable this assert when testing the analyzer with
fastdebug.
     L471:   // -XX:SuppressErrorAt=vframe.cpp:XXX (XXX=following line
number)
     L472:   assert(false, "invalid bci or invalid scope desc");
     L473: }
         The comment makes me wonder if the performance analyzer
         was periodically running into your bug.

         The "assert(false," should be:

             fatal("invalid bci or invalid scope desc");

         It's not your bug so feel free to ignore. :-)

src/share/vm/runtime/vframe.hpp
     No comments.

Thumbs up!

Dan


>
> Regarding:
>
> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."
>
> I agree with you, lets see what we can do to improve this.
>
> Thanks again
> Markus
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 17 februari 2017 18:51
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>> Greetings,
>>
>>    
>>
>> Kindly asking for reviews for the following changeset:
>>
>>    
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> src/cpu/x86/vm/templateTable_x86.cpp
>       (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> // restore target bytecode
>           I was a little concerned about not restoring the target
>           bytecode into the 'rbx' register, but when I looked at
>           dispatch code:
>
>           L2196:     __ bind(dispatch);
>           L2197:   }
>           L2198:
>           L2199:   // Pre-load the next target bytecode into rbx
>           L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>
>           so it looks like rbx gets the target bytecode just fine.
>
>       L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> restore target bytecode
>       L2214:       __ set_method_data_pointer_for_bcp();
>       L2215:       __ jmp(dispatch);
>           I think the "restore target bytecode" here is also redundant.
>
>           src/cpu/x86/vm/interp_masm_x86.cpp:
>
> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>             <snip>
>
>             push(rbx);
>
>             get_method(rbx);
>
>           src/cpu/x86/vm/interp_masm_x86.hpp:
>
>           void get_method(Register reg) {
>             movptr(reg, Address(rbp,
> frame::interpreter_frame_method_offset * wordSize));
>           }
>
>           set_method_data_pointer_for_bcp() doesn't use the rbx
>           value that we bothered to restore.
>
> OK, so on 64-bit the code saved the nmethod before the call to
> SharedRuntime::OSR_migration_begin() and it was using 'r13'
> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>
> This is outstanding sleuthing!
>
> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>
> Thumbs up on this change!
>
> Please don't forget to update the copyright before you push.
>
> Dan
>
> P.S.
> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>
>
>>    
>>
>> Summary:
>>
>>    
>>
>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error
>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>
>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>
>>    
>>
>> Thanks
>>
>> Markus
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Markus Gronlund
Thanks again Dan,

It is most likely the analyzer has run into this previously.
I will also update to use "fatal" instead, thanks for noticing.

Markus

-----Original Message-----
From: Daniel D. Daugherty
Sent: den 23 februari 2017 16:12
To: Markus Gronlund; [hidden email]
Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

On 2/19/17 6:07 AM, Markus Gronlund wrote:

> Thanks a lot Dan for taking a look at this.
>
> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>
> Thanks also for pointing to the other bytecode restore location (@L2213).
>
> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>
> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/

src/cpu/x86/vm/templateTable_x86.cpp
     No comments.

src/share/vm/runtime/vframe.cpp
     L467: void vframeStreamCommon::found_bad_method_frame() const {
     L468:   // 6379830 Cut point for an assertion that occasionally
fires when
     L469:   // we are using the performance analyzer.
     L470:   // Disable this assert when testing the analyzer with
fastdebug.
     L471:   // -XX:SuppressErrorAt=vframe.cpp:XXX (XXX=following line
number)
     L472:   assert(false, "invalid bci or invalid scope desc");
     L473: }
         The comment makes me wonder if the performance analyzer
         was periodically running into your bug.

         The "assert(false," should be:

             fatal("invalid bci or invalid scope desc");

         It's not your bug so feel free to ignore. :-)

src/share/vm/runtime/vframe.hpp
     No comments.

Thumbs up!

Dan


>
> Regarding:
>
> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."
>
> I agree with you, lets see what we can do to improve this.
>
> Thanks again
> Markus
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 17 februari 2017 18:51
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>> Greetings,
>>
>>    
>>
>> Kindly asking for reviews for the following changeset:
>>
>>    
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> src/cpu/x86/vm/templateTable_x86.cpp
>       (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> // restore target bytecode
>           I was a little concerned about not restoring the target
>           bytecode into the 'rbx' register, but when I looked at
>           dispatch code:
>
>           L2196:     __ bind(dispatch);
>           L2197:   }
>           L2198:
>           L2199:   // Pre-load the next target bytecode into rbx
>           L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>
>           so it looks like rbx gets the target bytecode just fine.
>
>       L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> restore target bytecode
>       L2214:       __ set_method_data_pointer_for_bcp();
>       L2215:       __ jmp(dispatch);
>           I think the "restore target bytecode" here is also redundant.
>
>           src/cpu/x86/vm/interp_masm_x86.cpp:
>
> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>             <snip>
>
>             push(rbx);
>
>             get_method(rbx);
>
>           src/cpu/x86/vm/interp_masm_x86.hpp:
>
>           void get_method(Register reg) {
>             movptr(reg, Address(rbp,
> frame::interpreter_frame_method_offset * wordSize));
>           }
>
>           set_method_data_pointer_for_bcp() doesn't use the rbx
>           value that we bothered to restore.
>
> OK, so on 64-bit the code saved the nmethod before the call to
> SharedRuntime::OSR_migration_begin() and it was using 'r13'
> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>
> This is outstanding sleuthing!
>
> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>
> Thumbs up on this change!
>
> Please don't forget to update the copyright before you push.
>
> Dan
>
> P.S.
> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>
>
>>    
>>
>> Summary:
>>
>>    
>>
>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error
>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>
>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>
>>    
>>
>> Thanks
>>
>> Markus
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

Daniel D. Daugherty
On 2/23/17 8:31 AM, Markus Gronlund wrote:
> Thanks again Dan,
>
> It is most likely the analyzer has run into this previously.
> I will also update to use "fatal" instead, thanks for noticing.

I used to use the "assert(false, ..." style until one of my recent
reviewers pointed out "fatal(...". I am trying to retrain my brain... :-)

Dan


>
> Markus
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: den 23 februari 2017 16:12
> To: Markus Gronlund; [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
> On 2/19/17 6:07 AM, Markus Gronlund wrote:
>> Thanks a lot Dan for taking a look at this.
>>
>> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>>
>> Thanks also for pointing to the other bytecode restore location (@L2213).
>>
>> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>>
>> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
> src/cpu/x86/vm/templateTable_x86.cpp
>       No comments.
>
> src/share/vm/runtime/vframe.cpp
>       L467: void vframeStreamCommon::found_bad_method_frame() const {
>       L468:   // 6379830 Cut point for an assertion that occasionally
> fires when
>       L469:   // we are using the performance analyzer.
>       L470:   // Disable this assert when testing the analyzer with
> fastdebug.
>       L471:   // -XX:SuppressErrorAt=vframe.cpp:XXX (XXX=following line
> number)
>       L472:   assert(false, "invalid bci or invalid scope desc");
>       L473: }
>           The comment makes me wonder if the performance analyzer
>           was periodically running into your bug.
>
>           The "assert(false," should be:
>
>               fatal("invalid bci or invalid scope desc");
>
>           It's not your bug so feel free to ignore. :-)
>
> src/share/vm/runtime/vframe.hpp
>       No comments.
>
> Thumbs up!
>
> Dan
>
>
>> Regarding:
>>
>> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."
>>
>> I agree with you, lets see what we can do to improve this.
>>
>> Thanks again
>> Markus
>>
>>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: den 17 februari 2017 18:51
>> To: Markus Gronlund; [hidden email]
>> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>>
>> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>>> Greetings,
>>>
>>>    
>>>
>>> Kindly asking for reviews for the following changeset:
>>>
>>>    
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>>
>>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
>> src/cpu/x86/vm/templateTable_x86.cpp
>>        (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
>> // restore target bytecode
>>            I was a little concerned about not restoring the target
>>            bytecode into the 'rbx' register, but when I looked at
>>            dispatch code:
>>
>>            L2196:     __ bind(dispatch);
>>            L2197:   }
>>            L2198:
>>            L2199:   // Pre-load the next target bytecode into rbx
>>            L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>>
>>            so it looks like rbx gets the target bytecode just fine.
>>
>>        L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
>> restore target bytecode
>>        L2214:       __ set_method_data_pointer_for_bcp();
>>        L2215:       __ jmp(dispatch);
>>            I think the "restore target bytecode" here is also redundant.
>>
>>            src/cpu/x86/vm/interp_masm_x86.cpp:
>>
>> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>>              <snip>
>>
>>              push(rbx);
>>
>>              get_method(rbx);
>>
>>            src/cpu/x86/vm/interp_masm_x86.hpp:
>>
>>            void get_method(Register reg) {
>>              movptr(reg, Address(rbp,
>> frame::interpreter_frame_method_offset * wordSize));
>>            }
>>
>>            set_method_data_pointer_for_bcp() doesn't use the rbx
>>            value that we bothered to restore.
>>
>> OK, so on 64-bit the code saved the nmethod before the call to
>> SharedRuntime::OSR_migration_begin() and it was using 'r13'
>> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>>
>> This is outstanding sleuthing!
>>
>> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>>
>> Thumbs up on this change!
>>
>> Please don't forget to update the copyright before you push.
>>
>> Dan
>>
>> P.S.
>> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>>
>>
>>>    
>>>
>>> Summary:
>>>
>>>    
>>>
>>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> # Internal Error
>>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>>
>>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>>
>>>    
>>>
>>> Thanks
>>>
>>> Markus

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86

coleen.phillimore
In reply to this post by Markus Gronlund


On 2/23/17 5:34 AM, Markus Gronlund wrote:

> Hi again Coleen,
>
> I looked into changing "noreg" to "rax" for the call to SharedRuntime::OSR_migration_begin(), but this is not correct.
>
> By explicitly passing rax, we are indicating the interest in receiving some kind of "managed" return value (like an oop in this example).
>
> MacroAssembler::call_VM_base() will proceed with storing back the thread->vm_result() oop into rax, instead of leaving the platform specific return value holding the natively allocated buffer.
>
> This is because CONSTANT_REGISTER_DECLARATION(rax) == 0 entails oop_result->is_valid() while CONSTANT_REGISTER_DECLARATION(noreg) == -1 entails !oop_result->is_valid().
>
> Therefore I will keep the noreg in place as-is.

Okay, thanks for the explanation!  This is reviewed by me.
Coleen

>
> Thanks
> Markus
>
> -----Original Message-----
> From: Markus Gronlund
> Sent: den 23 februari 2017 00:55
> To: Coleen Phillimore; Daniel Daugherty
> Cc: [hidden email]
> Subject: RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
> Thanks Coleen,
>
> I will also update the "noreg" to "rax" before putback.
>
> Thanks again
> Markus
>
> -----Original Message-----
> From: Coleen Phillimore
> Sent: den 22 februari 2017 22:21
> To: [hidden email]
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope desc on x86
>
>
> This change looks good to me.  Although I think this
>
>          call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));
>
>
> Should be:
>
>          call_VM(rax, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));
>
>
> Since this returns the OSR buffer in rax.
>
>
> On 2/19/17 8:07 AM, Markus Gronlund wrote:
>> Thanks a lot Dan for taking a look at this.
>>
>> The "failure mode" in a release build is that any unresolved bci will be unconditionally re-set to 0 (see vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that stacktraces are currently reporting erroneous bci information, especially in the context of OSR frames, which would normally have a bci > 0 if the OSR is triggered by backedge counter overflow.
>>
>> Thanks also for pointing to the other bytecode restore location (@L2213).
>>
>> I have updated the webrev with your inputs in light of the other redundant bytecode restore in addition to updates to the copyright header(s).
>>
>> In addition, I took the opportunity to turn vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional function.
>>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
>>
>> Regarding:
>>
>> "It feels like we're missing some infrastructure to prevent the accidental use of 'r13'."
> We use r13 (and rlocals) as a scratch register in lots of places due to not having enough registers on x86.  As long as there isn't a safepoint (or call into the runtime as you found), this should be perfectly safe, as long as there's a restore_bcp() or restore_locals() call.
>
> Coleen
>
>> I agree with you, lets see what we can do to improve this.
>>
>> Thanks again
>> Markus
>>
>>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: den 17 februari 2017 18:51
>> To: Markus Gronlund; [hidden email]
>> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration
>> asserts with invalid bci or invalid scope desc on x86
>>
>> On 2/17/17 5:29 AM, Markus Gronlund wrote:
>>> Greetings,
>>>
>>>    
>>>
>>> Kindly asking for reviews for the following changeset:
>>>
>>>    
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
>>>
>>> Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
>> src/cpu/x86/vm/templateTable_x86.cpp
>>        (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
>> // restore target bytecode
>>            I was a little concerned about not restoring the target
>>            bytecode into the 'rbx' register, but when I looked at
>>            dispatch code:
>>
>>            L2196:     __ bind(dispatch);
>>            L2197:   }
>>            L2198:
>>            L2199:   // Pre-load the next target bytecode into rbx
>>            L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
>>
>>            so it looks like rbx gets the target bytecode just fine.
>>
>>        L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
>> restore target bytecode
>>        L2214:       __ set_method_data_pointer_for_bcp();
>>        L2215:       __ jmp(dispatch);
>>            I think the "restore target bytecode" here is also redundant.
>>
>>            src/cpu/x86/vm/interp_masm_x86.cpp:
>>
>> InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
>>              <snip>
>>
>>              push(rbx);
>>
>>              get_method(rbx);
>>
>>            src/cpu/x86/vm/interp_masm_x86.hpp:
>>
>>            void get_method(Register reg) {
>>              movptr(reg, Address(rbp,
>> frame::interpreter_frame_method_offset * wordSize));
>>            }
>>
>>            set_method_data_pointer_for_bcp() doesn't use the rbx
>>            value that we bothered to restore.
>>
>> OK, so on 64-bit the code saved the nmethod before the call to
>> SharedRuntime::OSR_migration_begin() and it was using 'r13'
>> which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible to stack walkers (because of save_bcp()) and caused the assert() failure. What's the failure mode in release bits?
>>
>> This is outstanding sleuthing!
>>
>> You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've removed stale code that was using 'rbx' for saving the target bytecode unnecessarily.
>>
>> Thumbs up on this change!
>>
>> Please don't forget to update the copyright before you push.
>>
>> Dan
>>
>> P.S.
>> It feels like we're missing some infrastructure to prevent the accidental use of 'r13'. We have other "special" registers that we guard against being used... Perhaps we need something for 'r13'.
>>
>>
>>>    
>>>
>>> Summary:
>>>
>>>    
>>>
>>> vframeStream stack traversal can assert on x86 when trying to decode an interpreter frame that is in the process of being migrated for On-Stack Replacement (OSR). This is because the interpreter frame does not have a valid bcp, it instead has an nmethod in the bcp slot (the OSR nmethod).
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> # Internal Error
>>> (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
>>> tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
>>>
>>> There is currently a save operation that uses r13 for saving the OSR nmethod over the VM call into SharedRuntime::OSR_migration_begin(). This has the side-effect of installing the OSR nmethod into the interpreter frame bcp slot.
>>>
>>>    
>>>
>>> Thanks
>>>
>>> Markus

Loading...