RFR(s-ish): 8191101: Show register content in hs-err file on assert

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
Hi all,

may I please have reviews for the following enhancement:

issue: https://bugs.openjdk.java.net/browse/JDK-8191101
webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-registers-on-assert/webrev.00/webrev/


(Patch looks big but lot of it is os_cpu fluff.)

Prior email discussion at:
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/025018.html

----

Basically, this adds the ability to show register values in assert
situations into the error report. This can be useful in certain corner
cases, e.g. if you want to know the value of some local variables or how
deep your stack currently runs.

This works by triggering a fault right when an assert happens and
squirreling the context away for later error reporting.

When an assert happens, we touch a poison page. To preserve the context as
best as possible, we want to avoid running too much code after the assert
condition has been evaluated, so this is done in a very simple way:
directly in the assert macro, right after the condition is evaluated, we
dereference the content of a global poison page pointer. In my tests, even
with slowdebug, this only spoils one register, rax on x64.

In the signal handler, we recognize the assertion poison fault by the
faulting address. We disable the poison page and store the ucontext away.
Then we just return from signal handling. Poison page is now disarmed, the
load from it is retried and now goes through. Normal assertion handling is
then resumed - so, things like -XX:SuppressAt are unaffected and work fine.

Then, when an error report is generated due to this assert, we now also use
the stored context. So now we get registers and instructions at the assert
point.

Right now, this is implemented on (non-zero) Linux, though other posix
platforms should be no problem. Have not yet thought deeply about windows.
It is tested and - in debug - switched on by default on linux x86, ppc,
s390.

If implemented, it can be switched on and off with
-XX:+ShowRegistersOnAssert. This is a failsafe, in case the mechanism does
not work and we want to have clean asserts.

To test this, do a java -XX:ErrorHandlerTest=1 -XX:+ShowRegistersOnAssert
with a not-product VM. On Linux x64, ppc and s390 we should now see the
register output in the hs-err file:

  4 #  Internal Error
(/shared/projects/openjdk/jdk-hs/source/src/hotspot/share/utilities/vmError.cpp:1660),
pid=4022, tid=4023
  5 #  assert(str == NULL) failed: expected null
  6 #
  7 #
...
 59 Registers:
 60 RAX=0x00007f736c8f7000, RBX=0x0000000000000000, RCX=0x0000000000000000,
RDX=0xffffffffffb5ded4
 61 RSP=0x00007f736c8d8ce0, RBP=0x00007f736c8d8d30, RSI=0x0000000000000001,
RDI=0x0000000000000001
 62 R8 =0x0000000000000040, R9 =0x0000000000000001, R10=0x0000000000efb028,
R11=0x00040788a244f42a
 63 R12=0x0000000000000000, R13=0x00007fff3de46dbf, R14=0x00007f736c8d99c0,
R15=0x0000000000000000
 64 RIP=0x00007f736aec5529, EFLAGS=0x0000000000010202,
CSGSFS=0x002b000000000033, ERR=0x0000000000000006
 65   TRAPNO=0x000000000000000e
...
 72
 73 Instructions: (pc=0x00007f736aec5529)
 74 0x00007f736aec5509:   8d 05 31 21 4a 00 48 01 d0 ff e0 48 83 7d c8 00
 75 0x00007f736aec5519:   0f 84 7f 03 00 00 48 8d 05 82 fc b1 00 48 8b 00
 76 0x00007f736aec5529:   c6 00 58 e8 cb 17 62 ff 84 c0 74 11 48 8d 3d 2f
 77 0x00007f736aec5539:   1f 4a 00 b8 00 00 00 00 e8 ed 17 62 ff 48 8d 0d
 78

---

Implementation notes:

- when handling the poison fault, we need to copy the context away from the
signal handler stack. For posix, this means copying the ucontext_t. This is
undefined territory. On most platforms, this simply means copying the
ucontex_t as a flat structure. On some platforms more is needed, e.g. on
linux ppc, we need to patch up the context after copying (the context is
not position independent), and on MacOS, the context is not self-contained
but contains pointers to sub structures which need to be copied too and
whose size is unknown at compile time. Because of these platform
dependencies, I factored out the copying of ucontext_t to
os::Posix::copy_ucontext and its implementations are os_cpu specific.

- As an added precaution, when copying the context, we use a safe version
of memcpy (os::safe_memcpy) which I added to copy from potentially invalid
memory regions. The reason is that we have seen on some Unices - e.g. hpux
- that the size of the ucontext_t structure at runtime may be different
from the build machine, so we tread carefully. os::safe_memcpy() uses
SafeFetch to copy a range of memory.

Risk:

If this does not work, asserts will become segfaults, which can be
confusing. But the feature can be disabled with -XX:+ShowRegistersOnAssert
and for now on most platforms is disabled by default.

---

Limitations:
- as it is now implemented, this is a one-shot mechanism and only works for
the first assert.
- -XX:SuppressAt=... is not affected and works fine. However, if the first
assert is suppressed, follow-up asserts will not show register values.
- When multiple threads run into an assert, we may or may not see register
values depending on which thread is the first of finishing the poison page
handler.

I do not think these limitations are severe. They can be solved, but at the
cost of added complexity, which I preferred not to add.

Thank you,

Thomas
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
Small correction, see below...

On Wed 22. Nov 2017 at 19:01, Thomas Stüfe <[hidden email]> wrote:

> Hi all,
>
> may I please have reviews for the following enhancement:
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8191101
> webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-registers-on-assert/webrev.00/webrev/
>
>
> (Patch looks big but lot of it is os_cpu fluff.)
>
> Prior email discussion at:
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/025018.html
>
> ----
>
> Basically, this adds the ability to show register values in assert
> situations into the error report. This can be useful in certain corner
> cases, e.g. if you want to know the value of some local variables or how
> deep your stack currently runs.
>
> This works by triggering a fault right when an assert happens and
> squirreling the context away for later error reporting.
>
> When an assert happens, we touch a poison page. To preserve the context as
> best as possible, we want to avoid running too much code after the assert
> condition has been evaluated, so this is done in a very simple way:
> directly in the assert macro, right after the condition is evaluated, we
> dereference the content of a global poison page pointer. In my tests, even
> with slowdebug, this only spoils one register, rax on x64.
>
> In the signal handler, we recognize the assertion poison fault by the
> faulting address. We disable the poison page and store the ucontext away.
> Then we just return from signal handling. Poison page is now disarmed, the
> load from it is retried and now goes through. Normal assertion handling is
> then resumed - so, things like -XX:SuppressAt are unaffected and work fine.
>
> Then, when an error report is generated due to this assert, we now also
> use the stored context. So now we get registers and instructions at the
> assert point.
>
> Right now, this is implemented on (non-zero) Linux, though other posix
> platforms should be no problem. Have not yet thought deeply about windows.
> It is tested and - in debug - switched on by default on linux x86, ppc,
> s390.
>
> If implemented, it can be switched on and off with
> -XX:+ShowRegistersOnAssert. This is a failsafe, in case the mechanism does
> not work and we want to have clean asserts.
>
> To test this, do a java -XX:ErrorHandlerTest=1 -XX:+ShowRegistersOnAssert
> with a not-product VM. On Linux x64, ppc and s390 we should now see the
> register output in the hs-err file:
>
>   4 #  Internal Error
> (/shared/projects/openjdk/jdk-hs/source/src/hotspot/share/utilities/vmError.cpp:1660),
> pid=4022, tid=4023
>   5 #  assert(str == NULL) failed: expected null
>   6 #
>   7 #
> ...
>  59 Registers:
>  60 RAX=0x00007f736c8f7000, RBX=0x0000000000000000,
> RCX=0x0000000000000000, RDX=0xffffffffffb5ded4
>  61 RSP=0x00007f736c8d8ce0, RBP=0x00007f736c8d8d30,
> RSI=0x0000000000000001, RDI=0x0000000000000001
>  62 R8 =0x0000000000000040, R9 =0x0000000000000001,
> R10=0x0000000000efb028, R11=0x00040788a244f42a
>  63 R12=0x0000000000000000, R13=0x00007fff3de46dbf,
> R14=0x00007f736c8d99c0, R15=0x0000000000000000
>  64 RIP=0x00007f736aec5529, EFLAGS=0x0000000000010202,
> CSGSFS=0x002b000000000033, ERR=0x0000000000000006
>  65   TRAPNO=0x000000000000000e
> ...
>  72
>  73 Instructions: (pc=0x00007f736aec5529)
>  74 0x00007f736aec5509:   8d 05 31 21 4a 00 48 01 d0 ff e0 48 83 7d c8 00
>  75 0x00007f736aec5519:   0f 84 7f 03 00 00 48 8d 05 82 fc b1 00 48 8b 00
>  76 0x00007f736aec5529:   c6 00 58 e8 cb 17 62 ff 84 c0 74 11 48 8d 3d 2f
>  77 0x00007f736aec5539:   1f 4a 00 b8 00 00 00 00 e8 ed 17 62 ff 48 8d 0d
>  78
>
> ---
>
> Implementation notes:
>
> - when handling the poison fault, we need to copy the context away from
> the signal handler stack. For posix, this means copying the ucontext_t.
> This is undefined territory. On most platforms, this simply means copying
> the ucontex_t as a flat structure. On some platforms more is needed, e.g.
> on linux ppc, we need to patch up the context after copying (the context is
> not position independent), and on MacOS, the context is not self-contained
> but contains pointers to sub structures which need to be copied too and
> whose size is unknown at compile time. Because of these platform
> dependencies, I factored out the copying of ucontext_t to
> os::Posix::copy_ucontext and its implementations are os_cpu specific.
>
> - As an added precaution, when copying the context, we use a safe version
> of memcpy (os::safe_memcpy) which I added to copy from potentially invalid
> memory regions. The reason is that we have seen on some Unices - e.g. hpux
> - that the size of the ucontext_t structure at runtime may be different
> from the build machine, so we tread carefully. os::safe_memcpy() uses
> SafeFetch to copy a range of memory.
>
> Risk:
>
> If this does not work, asserts will become segfaults, which can be
> confusing. But the feature can be disabled with -XX:+ShowRegistersOnAssert
> and for now on most platforms is disabled by default.
>

Feature is disabled with -XX:-ShowRegistersOnAssert, of course. Sorry for
the typo.


> ---
>
> Limitations:
> - as it is now implemented, this is a one-shot mechanism and only works
> for the first assert.
> - -XX:SuppressAt=... is not affected and works fine. However, if the first
> assert is suppressed, follow-up asserts will not show register values.
> - When multiple threads run into an assert, we may or may not see register
> values depending on which thread is the first of finishing the poison page
> handler.
>
> I do not think these limitations are severe. They can be solved, but at
> the cost of added complexity, which I preferred not to add.
>
> Thank you,
>
> Thomas
>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
In reply to this post by Thomas Stüfe-2
Ping... Could I please have reviews? Thank you!
..Thomas

On Wed, Nov 22, 2017 at 7:01 PM, Thomas Stüfe <[hidden email]>
wrote:

> Hi all,
>
> may I please have reviews for the following enhancement:
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8191101
> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/
> 8191101-show-registers-on-assert/webrev.00/webrev/
>
> (Patch looks big but lot of it is os_cpu fluff.)
>
> Prior email discussion at: http://mail.openjdk.java.net/
> pipermail/hotspot-runtime-dev/2017-October/025018.html
>
> ----
>
> Basically, this adds the ability to show register values in assert
> situations into the error report. This can be useful in certain corner
> cases, e.g. if you want to know the value of some local variables or how
> deep your stack currently runs.
>
> This works by triggering a fault right when an assert happens and
> squirreling the context away for later error reporting.
>
> When an assert happens, we touch a poison page. To preserve the context as
> best as possible, we want to avoid running too much code after the assert
> condition has been evaluated, so this is done in a very simple way:
> directly in the assert macro, right after the condition is evaluated, we
> dereference the content of a global poison page pointer. In my tests, even
> with slowdebug, this only spoils one register, rax on x64.
>
> In the signal handler, we recognize the assertion poison fault by the
> faulting address. We disable the poison page and store the ucontext away.
> Then we just return from signal handling. Poison page is now disarmed, the
> load from it is retried and now goes through. Normal assertion handling is
> then resumed - so, things like -XX:SuppressAt are unaffected and work fine.
>
> Then, when an error report is generated due to this assert, we now also
> use the stored context. So now we get registers and instructions at the
> assert point.
>
> Right now, this is implemented on (non-zero) Linux, though other posix
> platforms should be no problem. Have not yet thought deeply about windows.
> It is tested and - in debug - switched on by default on linux x86, ppc,
> s390.
>
> If implemented, it can be switched on and off with
> -XX:+ShowRegistersOnAssert. This is a failsafe, in case the mechanism does
> not work and we want to have clean asserts.
>
> To test this, do a java -XX:ErrorHandlerTest=1 -XX:+ShowRegistersOnAssert
> with a not-product VM. On Linux x64, ppc and s390 we should now see the
> register output in the hs-err file:
>
>   4 #  Internal Error (/shared/projects/openjdk/jdk-
> hs/source/src/hotspot/share/utilities/vmError.cpp:1660), pid=4022,
> tid=4023
>   5 #  assert(str == NULL) failed: expected null
>   6 #
>   7 #
> ...
>  59 Registers:
>  60 RAX=0x00007f736c8f7000, RBX=0x0000000000000000,
> RCX=0x0000000000000000, RDX=0xffffffffffb5ded4
>  61 RSP=0x00007f736c8d8ce0, RBP=0x00007f736c8d8d30,
> RSI=0x0000000000000001, RDI=0x0000000000000001
>  62 R8 =0x0000000000000040, R9 =0x0000000000000001,
> R10=0x0000000000efb028, R11=0x00040788a244f42a
>  63 R12=0x0000000000000000, R13=0x00007fff3de46dbf,
> R14=0x00007f736c8d99c0, R15=0x0000000000000000
>  64 RIP=0x00007f736aec5529, EFLAGS=0x0000000000010202,
> CSGSFS=0x002b000000000033, ERR=0x0000000000000006
>  65   TRAPNO=0x000000000000000e
> ...
>  72
>  73 Instructions: (pc=0x00007f736aec5529)
>  74 0x00007f736aec5509:   8d 05 31 21 4a 00 48 01 d0 ff e0 48 83 7d c8 00
>  75 0x00007f736aec5519:   0f 84 7f 03 00 00 48 8d 05 82 fc b1 00 48 8b 00
>  76 0x00007f736aec5529:   c6 00 58 e8 cb 17 62 ff 84 c0 74 11 48 8d 3d 2f
>  77 0x00007f736aec5539:   1f 4a 00 b8 00 00 00 00 e8 ed 17 62 ff 48 8d 0d
>  78
>
> ---
>
> Implementation notes:
>
> - when handling the poison fault, we need to copy the context away from
> the signal handler stack. For posix, this means copying the ucontext_t.
> This is undefined territory. On most platforms, this simply means copying
> the ucontex_t as a flat structure. On some platforms more is needed, e.g.
> on linux ppc, we need to patch up the context after copying (the context is
> not position independent), and on MacOS, the context is not self-contained
> but contains pointers to sub structures which need to be copied too and
> whose size is unknown at compile time. Because of these platform
> dependencies, I factored out the copying of ucontext_t to
> os::Posix::copy_ucontext and its implementations are os_cpu specific.
>
> - As an added precaution, when copying the context, we use a safe version
> of memcpy (os::safe_memcpy) which I added to copy from potentially invalid
> memory regions. The reason is that we have seen on some Unices - e.g. hpux
> - that the size of the ucontext_t structure at runtime may be different
> from the build machine, so we tread carefully. os::safe_memcpy() uses
> SafeFetch to copy a range of memory.
>
> Risk:
>
> If this does not work, asserts will become segfaults, which can be
> confusing. But the feature can be disabled with -XX:+ShowRegistersOnAssert
> and for now on most platforms is disabled by default.
>
> ---
>
> Limitations:
> - as it is now implemented, this is a one-shot mechanism and only works
> for the first assert.
> - -XX:SuppressAt=... is not affected and works fine. However, if the first
> assert is suppressed, follow-up asserts will not show register values.
> - When multiple threads run into an assert, we may or may not see register
> values depending on which thread is the first of finishing the poison page
> handler.
>
> I do not think these limitations are severe. They can be solved, but at
> the cost of added complexity, which I preferred not to add.
>
> Thank you,
>
> Thomas
>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Andrew Dinn
On 27/11/17 10:33, Thomas Stüfe wrote:
> Ping... Could I please have reviews? Thank you!
> ..Thomas

This looks ok except that I'm a bit unsure about the changes to
VMError.cpp, specifically

  1) the message "Life goes on..." printed at the bottom of the loop
  2) the fact that this replaces the call to ShouldNotReachHere();

I understand that the message is there to give a positive for the test.
Also that it only gets printed if you suppress error reporting in this
code. However, to someone who doesn't know that the idea that one might
continue through to this point is rather misleading and, perhaps,
confusing. Could this be commented out? Or could the positive
confirmation just be omitted?

Anyway, I built and tested on AArch64 and it passed the test (for what
that indicates about register printing -- as you say a more stringent
test is needed that allows for different behaviour on different platforms.)

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
Hi Andrew,

thanks a lot for the quick review!

Pls find comments inline.

On Mon, Nov 27, 2017 at 1:43 PM, Andrew Dinn <[hidden email]> wrote:

> On 27/11/17 10:33, Thomas Stüfe wrote:
> > Ping... Could I please have reviews? Thank you!
> > ..Thomas
>
> This looks ok except that I'm a bit unsure about the changes to
> VMError.cpp, specifically
>
>   1) the message "Life goes on..." printed at the bottom of the loop
>   2) the fact that this replaces the call to ShouldNotReachHere();
>
> I understand that the message is there to give a positive for the test.
> Also that it only gets printed if you suppress error reporting in this
> code. However, to someone who doesn't know that the idea that one might
> continue through to this point is rather misleading and, perhaps,
> confusing. Could this be commented out? Or could the positive
> confirmation just be omitted?
>
>
Okay, I see your point. I will revert back to ShouldNotReachHere() and
change the jtreg test to recognize that as a positive test.


> Anyway, I built and tested on AArch64 and it passed the test (for what
> that indicates about register printing -- as you say a more stringent
> test is needed that allows for different behaviour on different platforms.)
>
>
Could you please quickly run

java -XX:+ShowRegistersOnAssert -XX:ErrorHandlerTest=1

on aarch64 and take a quick glance at the hs-err file, whether it contains
register output?


regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

Thanks again,

Best Regards, Thomas
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Andrew Dinn
Hi Thomas,

On 27/11/17 13:14, Thomas Stüfe wrote:
> thanks a lot for the quick review!

Thanks for the fix :-)

> Okay, I see your point. I will revert back to ShouldNotReachHere() and
> change the jtreg test to recognize that as a positive test.

Ok, thanks.

> Could you please quickly run 
>
> java -XX:+ShowRegistersOnAssert -XX:ErrorHandlerTest=1
>  
> on aarch64 and take a quick glance at the hs-err file, whether it
> contains register output?
Oh my! The relevant (beautiful) output is appended below. Very nice.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander


#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error
(/home/adinn/openjdk/jdkdev/hs/src/hotspot/share/utilities/vmError.cpp:1660),
pid=29190, tid=29191

  . . .

Registers:
R0 =0x000003ff83340000 is an unknown value
R1 =0x0000000000000058 is an unknown value
R2 =0x000003ff826da55c: <offset 0x00000000011ca55c> in
/home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-server-slowdebug/jdk/lib/server/libjvm.so
at 0x000003ff81510000
R3 =0x000003ff826da55c: <offset 0x00000000011ca55c> in
/home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-server-slowdebug/jdk/lib/server/libjvm.so
at 0x000003ff81510000
R4 =0x000003ff812fe748 is pointing into the stack for thread:
0x000003ff7c018000
R5 =0x000003ff7c016a28 is an unknown value
R6 =0x000003ff83360000 is an unknown value
R7 =0x0000000000000000 is an unknown value
R8 =0x00000000000c4c96 is an unknown value
R9 =0x00000000000c4c96 is an unknown value
R10=0x0000000000004438 is an unknown value
R11=0x000000000a0751f8 is an unknown value
R12=0x0000000000000018 is an unknown value
R13=0xffffffffa5e40b06 is an unknown value
R14=0x001b901602000000 is an unknown value
R15=0x003b9aca00000000 is an unknown value
R16=0x000003ff82fc6bc0: <offset 0x0000000001ab6bc0> in
/home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-server-slowdebug/jdk/lib/server/libjvm.so
at 0x000003ff81510000
R17=0x000003ff831548ac: memset+0x0000000000000000 in /lib64/libc.so.6 at
0x000003ff830d0000
R18=0x00000006cf9f754a is pointing into object: 0x00000006cf9f7538
[B
{0x00000006cf9f7538} - klass: {type array byte}
 - length: 21
 -   0: 73 s
 -   1: 75 u
 -   2: 6e n
 -   3: 2e .
 -   4: 62 b
 -   5: 6f o
 -   6: 6f o
 -   7: 74 t
 -   8: 2e .
 -   9: 6c l
 -  10: 69 i
 -  11: 62 b
 -  12: 72 r
 -  13: 61 a
 -  14: 72 r
 -  15: 79 y
 -  16: 2e .
 -  17: 70 p
 -  18: 61 a
 -  19: 74 t
 -  20: 68 h
R19=0x0000000000000000 is an unknown value
R20=0x000003ffe7d21468 is an unknown value
R21=0x000003ff832f0000: <offset 0x0000000000030000> in
/lib64/libpthread.so.0 at 0x000003ff832c0000
R22=0x000003ff812feb10 is pointing into the stack for thread:
0x000003ff7c018000
R23=0x000003ffe7d21468 is an unknown value
R24=0x0000000000000000 is an unknown value
R25=0x000003ffe7d21468 is an unknown value
R26=0x000003ff832f4000: <offset 0x0000000000034000> in
/lib64/libpthread.so.0 at 0x000003ff832c0000
R27=0x000003ff812ff2c0 is pointing into the stack for thread:
0x000003ff7c018000
R28=0x000003ff812ff8f0 is pointing into the stack for thread:
0x000003ff7c018000
R29=0x000003ff812fe750 is pointing into the stack for thread:
0x000003ff7c018000
R30=0x000003ff826da518: <offset 0x00000000011ca518> in
/home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-server-slowdebug/jdk/lib/server/libjvm.so
at 0x000003ff81510000

Top of Stack: (sp=0x000003ff812fe690)
  . . .
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
On Mon, Nov 27, 2017 at 5:18 PM, Andrew Dinn <[hidden email]> wrote:

> Hi Thomas,
>
> On 27/11/17 13:14, Thomas Stüfe wrote:
> > thanks a lot for the quick review!
>
> Thanks for the fix :-)
>
> > Okay, I see your point. I will revert back to ShouldNotReachHere() and
> > change the jtreg test to recognize that as a positive test.
>
> Ok, thanks.
>
> > Could you please quickly run
> >
> > java -XX:+ShowRegistersOnAssert -XX:ErrorHandlerTest=1
> >
> > on aarch64 and take a quick glance at the hs-err file, whether it
> > contains register output?
> Oh my! The relevant (beautiful) output is appended below. Very nice.
>
>
:) Cool! Thanks for testing!

I'll post an update with the corrected vmError.cpp soon

..Thomas


> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error
> (/home/adinn/openjdk/jdkdev/hs/src/hotspot/share/
> utilities/vmError.cpp:1660),
> pid=29190, tid=29191
>
>   . . .
>
> Registers:
> R0 =0x000003ff83340000 is an unknown value
> R1 =0x0000000000000058 is an unknown value
> R2 =0x000003ff826da55c: <offset 0x00000000011ca55c> in
> /home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-
> server-slowdebug/jdk/lib/server/libjvm.so
> at 0x000003ff81510000
> R3 =0x000003ff826da55c: <offset 0x00000000011ca55c> in
> /home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-
> server-slowdebug/jdk/lib/server/libjvm.so
> at 0x000003ff81510000
> R4 =0x000003ff812fe748 is pointing into the stack for thread:
> 0x000003ff7c018000
> R5 =0x000003ff7c016a28 is an unknown value
> R6 =0x000003ff83360000 is an unknown value
> R7 =0x0000000000000000 is an unknown value
> R8 =0x00000000000c4c96 is an unknown value
> R9 =0x00000000000c4c96 is an unknown value
> R10=0x0000000000004438 is an unknown value
> R11=0x000000000a0751f8 is an unknown value
> R12=0x0000000000000018 is an unknown value
> R13=0xffffffffa5e40b06 is an unknown value
> R14=0x001b901602000000 is an unknown value
> R15=0x003b9aca00000000 is an unknown value
> R16=0x000003ff82fc6bc0: <offset 0x0000000001ab6bc0> in
> /home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-
> server-slowdebug/jdk/lib/server/libjvm.so
> at 0x000003ff81510000
> R17=0x000003ff831548ac: memset+0x0000000000000000 in /lib64/libc.so.6 at
> 0x000003ff830d0000
> R18=0x00000006cf9f754a is pointing into object: 0x00000006cf9f7538
> [B
> {0x00000006cf9f7538} - klass: {type array byte}
>  - length: 21
>  -   0: 73 s
>  -   1: 75 u
>  -   2: 6e n
>  -   3: 2e .
>  -   4: 62 b
>  -   5: 6f o
>  -   6: 6f o
>  -   7: 74 t
>  -   8: 2e .
>  -   9: 6c l
>  -  10: 69 i
>  -  11: 62 b
>  -  12: 72 r
>  -  13: 61 a
>  -  14: 72 r
>  -  15: 79 y
>  -  16: 2e .
>  -  17: 70 p
>  -  18: 61 a
>  -  19: 74 t
>  -  20: 68 h
> R19=0x0000000000000000 is an unknown value
> R20=0x000003ffe7d21468 is an unknown value
> R21=0x000003ff832f0000: <offset 0x0000000000030000> in
> /lib64/libpthread.so.0 at 0x000003ff832c0000
> R22=0x000003ff812feb10 is pointing into the stack for thread:
> 0x000003ff7c018000
> R23=0x000003ffe7d21468 is an unknown value
> R24=0x0000000000000000 is an unknown value
> R25=0x000003ffe7d21468 is an unknown value
> R26=0x000003ff832f4000: <offset 0x0000000000034000> in
> /lib64/libpthread.so.0 at 0x000003ff832c0000
> R27=0x000003ff812ff2c0 is pointing into the stack for thread:
> 0x000003ff7c018000
> R28=0x000003ff812ff8f0 is pointing into the stack for thread:
> 0x000003ff7c018000
> R29=0x000003ff812fe750 is pointing into the stack for thread:
> 0x000003ff7c018000
> R30=0x000003ff826da518: <offset 0x00000000011ca518> in
> /home/adinn/openjdk/jdkdev/hs/build/linux-aarch64-normal-
> server-slowdebug/jdk/lib/server/libjvm.so
> at 0x000003ff81510000
>
> Top of Stack: (sp=0x000003ff812fe690)
>   . . .
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
In reply to this post by Thomas Stüfe-2
Updated webrev with Andrews request worked in:

http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-registers-on-assert/webrev.01/webrev/

The only difference to before is that I restored the ShouldNotReachHere()
at the end of VMError::controlled_crash(). Before that assert, I added a
hopefully clearer message than before to enable my test to scan for this
message. I also changed the jtreg test to scan for this new message.

Note: in the jtreg test, I test that assert supression works with
-XX:SupressErrorAt. But I just give it the file
(-XX:SuppressErrorAt=/vmError.cpp)
and omit the line number, because I do not want the test to stop working
when someone adds lines in front of VMError::controlled_crash(). But that
means that the ShouldNotReachHere() at the end of controlled_crash() does
not fire either, because now all asserts in vmError.cpp are disabled. So, I
need this new message.

Thanks, Thomas

On Mon, Nov 27, 2017 at 11:33 AM, Thomas Stüfe <[hidden email]>
wrote:

> Ping... Could I please have reviews? Thank you!
> ..Thomas
>
> On Wed, Nov 22, 2017 at 7:01 PM, Thomas Stüfe <[hidden email]>
> wrote:
>
>> Hi all,
>>
>> may I please have reviews for the following enhancement:
>>
>> issue: https://bugs.openjdk.java.net/browse/JDK-8191101
>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8191101-
>> show-registers-on-assert/webrev.00/webrev/
>>
>> (Patch looks big but lot of it is os_cpu fluff.)
>>
>> Prior email discussion at: http://mail.openjdk.java.net/p
>> ipermail/hotspot-runtime-dev/2017-October/025018.html
>>
>> ----
>>
>> Basically, this adds the ability to show register values in assert
>> situations into the error report. This can be useful in certain corner
>> cases, e.g. if you want to know the value of some local variables or how
>> deep your stack currently runs.
>>
>> This works by triggering a fault right when an assert happens and
>> squirreling the context away for later error reporting.
>>
>> When an assert happens, we touch a poison page. To preserve the context
>> as best as possible, we want to avoid running too much code after the
>> assert condition has been evaluated, so this is done in a very simple way:
>> directly in the assert macro, right after the condition is evaluated, we
>> dereference the content of a global poison page pointer. In my tests, even
>> with slowdebug, this only spoils one register, rax on x64.
>>
>> In the signal handler, we recognize the assertion poison fault by the
>> faulting address. We disable the poison page and store the ucontext away.
>> Then we just return from signal handling. Poison page is now disarmed, the
>> load from it is retried and now goes through. Normal assertion handling is
>> then resumed - so, things like -XX:SuppressAt are unaffected and work fine.
>>
>> Then, when an error report is generated due to this assert, we now also
>> use the stored context. So now we get registers and instructions at the
>> assert point.
>>
>> Right now, this is implemented on (non-zero) Linux, though other posix
>> platforms should be no problem. Have not yet thought deeply about windows.
>> It is tested and - in debug - switched on by default on linux x86, ppc,
>> s390.
>>
>> If implemented, it can be switched on and off with
>> -XX:+ShowRegistersOnAssert. This is a failsafe, in case the mechanism does
>> not work and we want to have clean asserts.
>>
>> To test this, do a java -XX:ErrorHandlerTest=1 -XX:+ShowRegistersOnAssert
>> with a not-product VM. On Linux x64, ppc and s390 we should now see the
>> register output in the hs-err file:
>>
>>   4 #  Internal Error (/shared/projects/openjdk/jdk-
>> hs/source/src/hotspot/share/utilities/vmError.cpp:1660), pid=4022,
>> tid=4023
>>   5 #  assert(str == NULL) failed: expected null
>>   6 #
>>   7 #
>> ...
>>  59 Registers:
>>  60 RAX=0x00007f736c8f7000, RBX=0x0000000000000000,
>> RCX=0x0000000000000000, RDX=0xffffffffffb5ded4
>>  61 RSP=0x00007f736c8d8ce0, RBP=0x00007f736c8d8d30,
>> RSI=0x0000000000000001, RDI=0x0000000000000001
>>  62 R8 =0x0000000000000040, R9 =0x0000000000000001,
>> R10=0x0000000000efb028, R11=0x00040788a244f42a
>>  63 R12=0x0000000000000000, R13=0x00007fff3de46dbf,
>> R14=0x00007f736c8d99c0, R15=0x0000000000000000
>>  64 RIP=0x00007f736aec5529, EFLAGS=0x0000000000010202,
>> CSGSFS=0x002b000000000033, ERR=0x0000000000000006
>>  65   TRAPNO=0x000000000000000e
>> ...
>>  72
>>  73 Instructions: (pc=0x00007f736aec5529)
>>  74 0x00007f736aec5509:   8d 05 31 21 4a 00 48 01 d0 ff e0 48 83 7d c8 00
>>  75 0x00007f736aec5519:   0f 84 7f 03 00 00 48 8d 05 82 fc b1 00 48 8b 00
>>  76 0x00007f736aec5529:   c6 00 58 e8 cb 17 62 ff 84 c0 74 11 48 8d 3d 2f
>>  77 0x00007f736aec5539:   1f 4a 00 b8 00 00 00 00 e8 ed 17 62 ff 48 8d 0d
>>  78
>>
>> ---
>>
>> Implementation notes:
>>
>> - when handling the poison fault, we need to copy the context away from
>> the signal handler stack. For posix, this means copying the ucontext_t.
>> This is undefined territory. On most platforms, this simply means copying
>> the ucontex_t as a flat structure. On some platforms more is needed, e.g.
>> on linux ppc, we need to patch up the context after copying (the context is
>> not position independent), and on MacOS, the context is not self-contained
>> but contains pointers to sub structures which need to be copied too and
>> whose size is unknown at compile time. Because of these platform
>> dependencies, I factored out the copying of ucontext_t to
>> os::Posix::copy_ucontext and its implementations are os_cpu specific.
>>
>> - As an added precaution, when copying the context, we use a safe version
>> of memcpy (os::safe_memcpy) which I added to copy from potentially invalid
>> memory regions. The reason is that we have seen on some Unices - e.g. hpux
>> - that the size of the ucontext_t structure at runtime may be different
>> from the build machine, so we tread carefully. os::safe_memcpy() uses
>> SafeFetch to copy a range of memory.
>>
>> Risk:
>>
>> If this does not work, asserts will become segfaults, which can be
>> confusing. But the feature can be disabled with -XX:+ShowRegistersOnAssert
>> and for now on most platforms is disabled by default.
>>
>> ---
>>
>> Limitations:
>> - as it is now implemented, this is a one-shot mechanism and only works
>> for the first assert.
>> - -XX:SuppressAt=... is not affected and works fine. However, if the
>> first assert is suppressed, follow-up asserts will not show register values.
>> - When multiple threads run into an assert, we may or may not see
>> register values depending on which thread is the first of finishing the
>> poison page handler.
>>
>> I do not think these limitations are severe. They can be solved, but at
>> the cost of added complexity, which I preferred not to add.
>>
>> Thank you,
>>
>> Thomas
>>
>>
>>
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Andrew Dinn
On 28/11/17 08:08, Thomas Stüfe wrote:
> Updated webrev with Andrews request worked in:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-registers-on-assert/webrev.01/webrev/
>
> The only difference to before is that I restored the
> ShouldNotReachHere() at the end of VMError::controlled_crash(). Before
> that assert, I added a hopefully clearer message than before to enable
> my test to scan for this message. I also changed the jtreg test to scan
> for this new message.

Ok, thanks. That makes it clear that this point is only arrived at
artificially.

> Note: in the jtreg test, I test that assert supression works with
> -XX:SupressErrorAt. But I just give it the file
> (-XX:SuppressErrorAt=/vmError.cpp) and omit the line number, because I
> do not want the test to stop working when someone adds lines in front of
> VMError::controlled_crash(). But that means that the
> ShouldNotReachHere() at the end of controlled_crash() does not fire
> either, because now all asserts in vmError.cpp are disabled. So, I need
> this new message.
Yes, and strictly this means the call to ShouldNotReachHere() is never
actually to execute. However, it still serves in a declarative role so
is good to keep it there. This version looks good.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

Thomas Stüfe-2
On Tue, Nov 28, 2017 at 11:29 AM, Andrew Dinn <[hidden email]> wrote:

> On 28/11/17 08:08, Thomas Stüfe wrote:
> > Updated webrev with Andrews request worked in:
> >
> > http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-
> registers-on-assert/webrev.01/webrev/
> >
> > The only difference to before is that I restored the
> > ShouldNotReachHere() at the end of VMError::controlled_crash(). Before
> > that assert, I added a hopefully clearer message than before to enable
> > my test to scan for this message. I also changed the jtreg test to scan
> > for this new message.
>
> Ok, thanks. That makes it clear that this point is only arrived at
> artificially.
>
> > Note: in the jtreg test, I test that assert supression works with
> > -XX:SupressErrorAt. But I just give it the file
> > (-XX:SuppressErrorAt=/vmError.cpp) and omit the line number, because I
> > do not want the test to stop working when someone adds lines in front of
> > VMError::controlled_crash(). But that means that the
> > ShouldNotReachHere() at the end of controlled_crash() does not fire
> > either, because now all asserts in vmError.cpp are disabled. So, I need
> > this new message.
> Yes, and strictly this means the call to ShouldNotReachHere() is never
> actually to execute. However, it still serves in a declarative role so
> is good to keep it there. This version looks good.
>
> regards,
>
>
> Andrew Dinn
>

Thanks, Andrew


> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s-ish): 8191101: Show register content in hs-err file on assert

David Holmes
In reply to this post by Thomas Stüfe-2
Hi Thomas,

I took a look at this and have a few concerns ...

But first, this is currently targeted to JDK 11 - is that what you are
aiming for? The JDK 10 gate closes on Friday for hotspot changes.

On 28/11/2017 6:08 PM, Thomas Stüfe wrote:
> Updated webrev with Andrews request worked in:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8191101-show-registers-on-assert/webrev.01/webrev/

I don't understand why a number of platforms define
os::Posix::copy_context as a no-op instead of using
os::Posix::default_copy_ucontext - this is either common posix code or
it is not. ?? Is it just a case of you not being able to test these
other "posix" platforms? But if the actual ucontext_t layout is OS
specific and not Posix specific then this should not be defined in the
os::Posix class. (os::Posix is supposed to be a place for standard Posix
conforming functionality, not a convenient place to share stuff on
non-windows OS.)

Any why the combination of build-time and run-time checking? This seems
overly complicated. Can't you just rely on runtime checks?

Your bug synopsis and preamble only mentions assert so I was going to
comment that:

!   diagnostic_pd(bool, ShowRegistersOnAssert,

should be a develop flag not diagnostic but then I saw you do this for
guarantees and all the other abort points too! In which case use of
"assert" in the flag name and CAN_SHOW_REGISTERS_ON_ASSERT, is
misleading - how about s/assert/abort/ ?

---

I can't really comment on the safe* functionality - I'm not sure what
that is all about. ??

---

src/hotspot/os/posix/os_posix.cpp

Style nit:  if (copy) -> if (copy != NULL)

+   ::free(uc); // [sic] see os::Posix::copy_context(): the ucontext
should have been copied with raw ::malloc,

I don't understand the comment - should it be referring to
os::Posix::default_copy_ucontext? Shouldn't the issue with os::malloc
use be documented in os::Posix::default_copy_ucontext? And the comment
here just needs to say:

::free(uc);  // Use raw free to match the raw malloc

---

src/hotspot/os/posix/os_posix.hpp

+   // Copy an ucontext.

an -> a

+   // (which is most of them).Do

Space before Do

---

src/hotspot/os_cpu/linux_ppc/os_linux_ppc.cpp

Style nit: if (uc) -> if (uc != NULL)

---
src/hotspot/share/utilities/debug.hpp

   38 #endif
   39
   40 #ifdef CAN_SHOW_REGISTERS_ON_ASSERT

You can delete these three lines as you know you can show registers on
assert.

  42   #define TOUCH_ASSERT_POISON (*(char*)g_assert_poison = 'X');

Why the casting? Why not declare g_assert_poison as intptr_t* and write
an int into it?

Nit: usually macros don't include the trailing ; so that in the code
they look like statements. (Though I think the use of macro and
conditional compilation is overkill here.)

---

  test/hotspot/jtreg/runtime/ErrorHandling/ShowRegistersOnAssertTest.java

I think we need to be able to test a guarantee failure in a product
build, else how do we know this works okay in that scenario? Do we have
something in place to allow that?

---

Thanks,
David
-----


> The only difference to before is that I restored the ShouldNotReachHere()
> at the end of VMError::controlled_crash(). Before that assert, I added a
> hopefully clearer message than before to enable my test to scan for this
> message. I also changed the jtreg test to scan for this new message.
>
> Note: in the jtreg test, I test that assert supression works with
> -XX:SupressErrorAt. But I just give it the file
> (-XX:SuppressErrorAt=/vmError.cpp)
> and omit the line number, because I do not want the test to stop working
> when someone adds lines in front of VMError::controlled_crash(). But that
> means that the ShouldNotReachHere() at the end of controlled_crash() does
> not fire either, because now all asserts in vmError.cpp are disabled. So, I
> need this new message.
>
> Thanks, Thomas
>
> On Mon, Nov 27, 2017 at 11:33 AM, Thomas Stüfe <[hidden email]>
> wrote:
>
>> Ping... Could I please have reviews? Thank you!
>> ..Thomas
>>
>> On Wed, Nov 22, 2017 at 7:01 PM, Thomas Stüfe <[hidden email]>
>> wrote:
>>
>>> Hi all,
>>>
>>> may I please have reviews for the following enhancement:
>>>
>>> issue: https://bugs.openjdk.java.net/browse/JDK-8191101
>>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8191101-
>>> show-registers-on-assert/webrev.00/webrev/
>>>
>>> (Patch looks big but lot of it is os_cpu fluff.)
>>>
>>> Prior email discussion at: http://mail.openjdk.java.net/p
>>> ipermail/hotspot-runtime-dev/2017-October/025018.html
>>>
>>> ----
>>>
>>> Basically, this adds the ability to show register values in assert
>>> situations into the error report. This can be useful in certain corner
>>> cases, e.g. if you want to know the value of some local variables or how
>>> deep your stack currently runs.
>>>
>>> This works by triggering a fault right when an assert happens and
>>> squirreling the context away for later error reporting.
>>>
>>> When an assert happens, we touch a poison page. To preserve the context
>>> as best as possible, we want to avoid running too much code after the
>>> assert condition has been evaluated, so this is done in a very simple way:
>>> directly in the assert macro, right after the condition is evaluated, we
>>> dereference the content of a global poison page pointer. In my tests, even
>>> with slowdebug, this only spoils one register, rax on x64.
>>>
>>> In the signal handler, we recognize the assertion poison fault by the
>>> faulting address. We disable the poison page and store the ucontext away.
>>> Then we just return from signal handling. Poison page is now disarmed, the
>>> load from it is retried and now goes through. Normal assertion handling is
>>> then resumed - so, things like -XX:SuppressAt are unaffected and work fine.
>>>
>>> Then, when an error report is generated due to this assert, we now also
>>> use the stored context. So now we get registers and instructions at the
>>> assert point.
>>>
>>> Right now, this is implemented on (non-zero) Linux, though other posix
>>> platforms should be no problem. Have not yet thought deeply about windows.
>>> It is tested and - in debug - switched on by default on linux x86, ppc,
>>> s390.
>>>
>>> If implemented, it can be switched on and off with
>>> -XX:+ShowRegistersOnAssert. This is a failsafe, in case the mechanism does
>>> not work and we want to have clean asserts.
>>>
>>> To test this, do a java -XX:ErrorHandlerTest=1 -XX:+ShowRegistersOnAssert
>>> with a not-product VM. On Linux x64, ppc and s390 we should now see the
>>> register output in the hs-err file:
>>>
>>>    4 #  Internal Error (/shared/projects/openjdk/jdk-
>>> hs/source/src/hotspot/share/utilities/vmError.cpp:1660), pid=4022,
>>> tid=4023
>>>    5 #  assert(str == NULL) failed: expected null
>>>    6 #
>>>    7 #
>>> ...
>>>   59 Registers:
>>>   60 RAX=0x00007f736c8f7000, RBX=0x0000000000000000,
>>> RCX=0x0000000000000000, RDX=0xffffffffffb5ded4
>>>   61 RSP=0x00007f736c8d8ce0, RBP=0x00007f736c8d8d30,
>>> RSI=0x0000000000000001, RDI=0x0000000000000001
>>>   62 R8 =0x0000000000000040, R9 =0x0000000000000001,
>>> R10=0x0000000000efb028, R11=0x00040788a244f42a
>>>   63 R12=0x0000000000000000, R13=0x00007fff3de46dbf,
>>> R14=0x00007f736c8d99c0, R15=0x0000000000000000
>>>   64 RIP=0x00007f736aec5529, EFLAGS=0x0000000000010202,
>>> CSGSFS=0x002b000000000033, ERR=0x0000000000000006
>>>   65   TRAPNO=0x000000000000000e
>>> ...
>>>   72
>>>   73 Instructions: (pc=0x00007f736aec5529)
>>>   74 0x00007f736aec5509:   8d 05 31 21 4a 00 48 01 d0 ff e0 48 83 7d c8 00
>>>   75 0x00007f736aec5519:   0f 84 7f 03 00 00 48 8d 05 82 fc b1 00 48 8b 00
>>>   76 0x00007f736aec5529:   c6 00 58 e8 cb 17 62 ff 84 c0 74 11 48 8d 3d 2f
>>>   77 0x00007f736aec5539:   1f 4a 00 b8 00 00 00 00 e8 ed 17 62 ff 48 8d 0d
>>>   78
>>>
>>> ---
>>>
>>> Implementation notes:
>>>
>>> - when handling the poison fault, we need to copy the context away from
>>> the signal handler stack. For posix, this means copying the ucontext_t.
>>> This is undefined territory. On most platforms, this simply means copying
>>> the ucontex_t as a flat structure. On some platforms more is needed, e.g.
>>> on linux ppc, we need to patch up the context after copying (the context is
>>> not position independent), and on MacOS, the context is not self-contained
>>> but contains pointers to sub structures which need to be copied too and
>>> whose size is unknown at compile time. Because of these platform
>>> dependencies, I factored out the copying of ucontext_t to
>>> os::Posix::copy_ucontext and its implementations are os_cpu specific.
>>>
>>> - As an added precaution, when copying the context, we use a safe version
>>> of memcpy (os::safe_memcpy) which I added to copy from potentially invalid
>>> memory regions. The reason is that we have seen on some Unices - e.g. hpux
>>> - that the size of the ucontext_t structure at runtime may be different
>>> from the build machine, so we tread carefully. os::safe_memcpy() uses
>>> SafeFetch to copy a range of memory.
>>>
>>> Risk:
>>>
>>> If this does not work, asserts will become segfaults, which can be
>>> confusing. But the feature can be disabled with -XX:+ShowRegistersOnAssert
>>> and for now on most platforms is disabled by default.
>>>
>>> ---
>>>
>>> Limitations:
>>> - as it is now implemented, this is a one-shot mechanism and only works
>>> for the first assert.
>>> - -XX:SuppressAt=... is not affected and works fine. However, if the
>>> first assert is suppressed, follow-up asserts will not show register values.
>>> - When multiple threads run into an assert, we may or may not see
>>> register values depending on which thread is the first of finishing the
>>> poison page handler.
>>>
>>> I do not think these limitations are severe. They can be solved, but at
>>> the cost of added complexity, which I preferred not to add.
>>>
>>> Thank you,
>>>
>>> Thomas
>>>
>>>
>>>
>>>
>>