RFR: 8256916: Add JFR event for OutOfMemoryError

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
OOM on Metaspace would be reported in `MetaspaceOOM` JFR event, however other OOM (e.g. Java heap) would not be reported.

It is useful if JFR reports OOMs.

-------------

Commit messages:
 - 8256916: Add JFR event for OutOfMemoryError

Changes: https://git.openjdk.java.net/jdk/pull/1403/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1403&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8256916
  Stats: 94 lines in 5 files changed: 94 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1403/head:pull/1403

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Tue, 24 Nov 2020 04:31:53 GMT, Yasumasa Suenaga <[hidden email]> wrote:

> OOM on Metaspace would be reported in `MetaspaceOOM` JFR event, however other OOM (e.g. Java heap) would not be reported.
>
> It is useful if JFR reports OOMs.

All of failures on GitHub Actions are not caused by this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Erik Gahlin-2
On Tue, 24 Nov 2020 07:01:56 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> OOM on Metaspace would be reported in `MetaspaceOOM` JFR event, however other OOM (e.g. Java heap) would not be reported.
>>
>> It is useful if JFR reports OOMs.
>
> All of failures on GitHub Actions are not caused by this PR.

Will this fix be able to handla all cases of OOM?

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Tue, 24 Nov 2020 17:43:30 GMT, Erik Gahlin <[hidden email]> wrote:

> Will this fix be able to handla all cases of OOM?

This PR can handle all OOMs which are thrown by `Exceptions::_throw()` e.g.

* `Universe::out_of_memory_error_java_heap()`
* `Universe::out_of_memory_error_c_heap()`
* `Universe::out_of_memory_error_metaspace()`
* `Universe::out_of_memory_error_class_metaspace()`
* `Universe::out_of_memory_error_array_size()`
* `Universe::out_of_memory_error_gc_overhead_limit()`
* `Universe::out_of_memory_error_realloc_objects()`
* `Universe::out_of_memory_error_retry()`

However OOMs which are generated in Java ( `new OutOfMemoryError()` ) cannot be handled by this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
In reply to this post by Erik Gahlin-2
On Tue, 24 Nov 2020 17:43:30 GMT, Erik Gahlin <[hidden email]> wrote:

>> All of failures on GitHub Actions are not caused by this PR.
>
> Will this fix be able to handla all cases of OOM?

@egahlin Is this answer sufficient for you about OOM handling? All comments are welcome.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Erik Gahlin-2
On Wed, 2 Dec 2020 04:15:38 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> Will this fix be able to handla all cases of OOM?
>
> @egahlin Is this answer sufficient for you about OOM handling? All comments are welcome.

I think there is a use case for getting an event when an OOM occurs, but the long term plan is to change so that a jdk.JavaErrorThrow is emitted when the exception is thrown, not when the object is allocated. When that is fixed, it will take care of OOME automatically - at all places. I think it is confusing for users to not get an event at every OOME and users/customers will file bugs.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Wed, 2 Dec 2020 07:19:19 GMT, Erik Gahlin <[hidden email]> wrote:

>> @egahlin Is this answer sufficient for you about OOM handling? All comments are welcome.
>
> I think there is a use case for getting an event when an OOM occurs, but the long term plan is to change so that a jdk.JavaErrorThrow is emitted when the exception is thrown, not when the object is allocated. When that is fixed, it will take care of OOME automatically - at all places. I think it is confusing for users to not get an event at every OOME and users/customers will file bugs.

Why OOME is not reported?
I guess it is hard corded at `ThrowableTracer::traceError`. If it is correct, because it is implemented in Java? (JFR code in Java might not be work when OOME happens)

`ErrorThrownEvent` has description that says OutOfMemoryErrors are ignored, and also OOM in metaspace is reported in other event (`MetaspaceOOM`). So I think we can report OOM other than `jdk.JavaErrorThrow` now until "long term plan" is realized.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
In reply to this post by Erik Gahlin-2
On Wed, 2 Dec 2020 07:19:19 GMT, Erik Gahlin <[hidden email]> wrote:

>> @egahlin Is this answer sufficient for you about OOM handling? All comments are welcome.
>
> I think there is a use case for getting an event when an OOM occurs, but the long term plan is to change so that a jdk.JavaErrorThrow is emitted when the exception is thrown, not when the object is allocated. When that is fixed, it will take care of OOME automatically - at all places. I think it is confusing for users to not get an event at every OOME and users/customers will file bugs.

@egahlin OOME from JVM will not be reported because they are instantiated at JVM init stage and are kept in `Universe`, so they cannot be handled in `ThrowableTracer::traceError`.
We can trace OOMEs which are happened in Java code at `ThrowableTracer::traceError`. However I'm not sure where the event should be handled in.

For sake of simply, I want to add `jdk.OutOfMemoryError` event to `jdk.jfr` module as a Java class, and I want to emit it from `ThrowableTracer::traceError`, however it duplicates with a definition in JVM.
OTOH we can implement to emit event from `ThrowableTracer::traceError` via `JVM` class, however need to add API to emit events with arbitrary arguments.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Mon, 21 Dec 2020 01:33:58 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> I think there is a use case for getting an event when an OOM occurs, but the long term plan is to change so that a jdk.JavaErrorThrow is emitted when the exception is thrown, not when the object is allocated. When that is fixed, it will take care of OOME automatically - at all places. I think it is confusing for users to not get an event at every OOME and users/customers will file bugs.
>
> @egahlin OOME from JVM will not be reported because they are instantiated at JVM init stage and are kept in `Universe`, so they cannot be handled in `ThrowableTracer::traceError`.
> We can trace OOMEs which are happened in Java code at `ThrowableTracer::traceError`. However I'm not sure where the event should be handled in.
>
> For sake of simply, I want to add `jdk.OutOfMemoryError` event to `jdk.jfr` module as a Java class, and I want to emit it from `ThrowableTracer::traceError`, however it duplicates with a definition in JVM.
> OTOH we can implement to emit event from `ThrowableTracer::traceError` via `JVM` class, however need to add API to emit events with arbitrary arguments.

`Universe` holds some of throwable objects such as `NullPointerException`, `VirtualMachineError`. They could be lost in flight record. This problem is not only for OOME.

So I think we need to track all `Throwable` objects in `Exceptions::_throw()` in HotSpot instead of bytecode injection.
`ThrowableTracer` has a counter to track num of throwing (`numThrowables`) which is used in `ExceptionStatisticsEvent`, so we also need to track it in HotSpot.

I want to hear opinions about it before making changes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Erik Gahlin-2
On Tue, 12 Jan 2021 06:59:37 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> @egahlin OOME from JVM will not be reported because they are instantiated at JVM init stage and are kept in `Universe`, so they cannot be handled in `ThrowableTracer::traceError`.
>> We can trace OOMEs which are happened in Java code at `ThrowableTracer::traceError`. However I'm not sure where the event should be handled in.
>>
>> For sake of simply, I want to add `jdk.OutOfMemoryError` event to `jdk.jfr` module as a Java class, and I want to emit it from `ThrowableTracer::traceError`, however it duplicates with a definition in JVM.
>> OTOH we can implement to emit event from `ThrowableTracer::traceError` via `JVM` class, however need to add API to emit events with arbitrary arguments.
>
> `Universe` holds some of throwable objects such as `NullPointerException`, `VirtualMachineError`. They could be lost in flight record. This problem is not only for OOME.
>
> So I think we need to track all `Throwable` objects in `Exceptions::_throw()` in HotSpot instead of bytecode injection.
> `ThrowableTracer` has a counter to track num of throwing (`numThrowables`) which is used in `ExceptionStatisticsEvent`, so we also need to track it in HotSpot.
>
> I want to hear opinions about it before making changes.

The right way to do this is to emit the event when the exception is thrown and I think it must happen inside the VM.

One problem is that the C2 compiler can eliminate the throw if it can prove the it can't be seen. That is the reason, we ended up with the current implementation many years ago. It was complicated to implement in the VM.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError [v2]

Yasumasa Suenaga-7
In reply to this post by Yasumasa Suenaga-7
> OOM on Metaspace would be reported in `MetaspaceOOM` JFR event, however other OOM (e.g. Java heap) would not be reported.
>
> It is useful if JFR reports OOMs.

Yasumasa Suenaga has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Move JavaErrorThrow and JavaExceptionThrow into HotSpot
 - Merge remote-tracking branch 'upstream/master' into JDK-8256916
 - 8256916: Add JFR event for OutOfMemoryError

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1403/files
  - new: https://git.openjdk.java.net/jdk/pull/1403/files/8f8799d6..7601b2f3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1403&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1403&range=00-01

  Stats: 122944 lines in 3018 files changed: 71590 ins; 25940 del; 25414 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1403/head:pull/1403

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
In reply to this post by Erik Gahlin-2
On Tue, 12 Jan 2021 21:45:32 GMT, Erik Gahlin <[hidden email]> wrote:

> The right way to do this is to emit the event when the exception is thrown and I think it must happen inside the VM.

I updated PR. Could you review?
Windows build on GitHub Actions failed, but it is not caused by this PR (due to configure error).

I removed bytecode injection to hook c'tor of `Error` and `Exception`. JavaErrorThrown and JavaExceptionThrown would be thrown from HotSpot. And also number of thrown is counted at HotSpot, so I added new JNI function to get it, it is used for ExceptionStatistics event.

> One problem is that the C2 compiler can eliminate the throw if it can prove the it can't be seen. That is the reason, we ended up with the current implementation many years ago. It was complicated to implement in the VM.

I added a function call which emits JavaErrorThrown and/or JavaExceptionThrown around `Exceptions::log_exception()` call. It is not completely solution for your concern, but we can hook most of throwable objects like unified logging.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Wed, 13 Jan 2021 14:13:30 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> The right way to do this is to emit the event when the exception is thrown and I think it must happen inside the VM.
>>
>> One problem is that the C2 compiler can eliminate the throw if it can prove the it can't be seen. That is the reason, we ended up with the current implementation many years ago. It was complicated to implement in the VM.
>
>> The right way to do this is to emit the event when the exception is thrown and I think it must happen inside the VM.
>
> I updated PR. Could you review?
> Windows build on GitHub Actions failed, but it is not caused by this PR (due to configure error).
>
> I removed bytecode injection to hook c'tor of `Error` and `Exception`. JavaErrorThrown and JavaExceptionThrown would be thrown from HotSpot. And also number of thrown is counted at HotSpot, so I added new JNI function to get it, it is used for ExceptionStatistics event.
>
>> One problem is that the C2 compiler can eliminate the throw if it can prove the it can't be seen. That is the reason, we ended up with the current implementation many years ago. It was complicated to implement in the VM.
>
> I added a function call which emits JavaErrorThrown and/or JavaExceptionThrown around `Exceptions::log_exception()` call. It is not completely solution for your concern, but we can hook most of throwable objects like unified logging.

This PR passed all tests in jdk/jfr on Linux x64.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Erik Gahlin-2
On Wed, 13 Jan 2021 14:26:39 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>>> The right way to do this is to emit the event when the exception is thrown and I think it must happen inside the VM.
>>
>> I updated PR. Could you review?
>> Windows build on GitHub Actions failed, but it is not caused by this PR (due to configure error).
>>
>> I removed bytecode injection to hook c'tor of `Error` and `Exception`. JavaErrorThrown and JavaExceptionThrown would be thrown from HotSpot. And also number of thrown is counted at HotSpot, so I added new JNI function to get it, it is used for ExceptionStatistics event.
>>
>>> One problem is that the C2 compiler can eliminate the throw if it can prove the it can't be seen. That is the reason, we ended up with the current implementation many years ago. It was complicated to implement in the VM.
>>
>> I added a function call which emits JavaErrorThrown and/or JavaExceptionThrown around `Exceptions::log_exception()` call. It is not completely solution for your concern, but we can hook most of throwable objects like unified logging.
>
> This PR passed all tests in jdk/jfr on Linux x64.

I don't have time to look at this now, I need to fix some high priority issues for JDK 16, but the problem is with C2. Several people have looked at this before.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Thu, 14 Jan 2021 14:40:34 GMT, Erik Gahlin <[hidden email]> wrote:

> I don't have time to look at this now, I need to fix some high priority issues for JDK 16, but the problem is with C2. Several people have looked at this before.

I agree with you the problem is also with C2, however current JFR implementation has also problem in exception events - they hook throwing exceptions lesser than unified logging. It should be fixed.

As you can see in this PR, to move throwing events into HotSpot makes big change, so I think it is worth to work now without solving the problem with C2.
(We will be able to make lesser change when we want to solve the problem with C2 later.)

So I hope other JFR folks review this PR if you are busy.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Markus Grönlund
On Fri, 15 Jan 2021 03:05:40 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> I don't have time to look at this now, I need to fix some high priority issues for JDK 16, but the problem is with C2. Several people have looked at this before.
>
>> I don't have time to look at this now, I need to fix some high priority issues for JDK 16, but the problem is with C2. Several people have looked at this before.
>
> I agree with you the problem is also with C2, however current JFR implementation has also problem in exception events - they hook throwing exceptions lesser than unified logging. It should be fixed.
>
> As you can see in this PR, to move throwing events into HotSpot makes big change, so I think it is worth to work now without solving the problem with C2.
> (We will be able to make lesser change when we want to solve the problem with C2 later.)
>
> So I hope other JFR folks review this PR if you are busy.

As you say: "move throwing events into HotSpot makes big change" - I agree and Erik has already stated that many have tried already to make this kind of change, but attempts have faltered because the impact / disruptions are unclear. Do you fully understand the consequences of how it will change the way users work with exceptions in JFR? If C2 optimizations now start to remove exception sites that were previously reported?

I understand and appreciate the ambition, and I acknowledge the existing mechanisms has drawbacks (known for a long time) but reworking how exceptions are reported is a big project. Such a project should also involve exception throttling - and throttling might pose additional constraints / opportunities that need to be considered.

Modifying this subsystem is a sensitive undertaking and should solve the fundamental and already known problems before it is to be endorsed and for the disruption to end user motivated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Fri, 15 Jan 2021 09:59:08 GMT, Markus Grönlund <[hidden email]> wrote:

> As you say: "move throwing events into HotSpot makes big change" - I agree and Erik has already stated that many have tried already to make this kind of change, but attempts have faltered because the impact / disruptions are unclear. Do you fully understand the consequences of how it will change the way users work with exceptions in JFR? If C2 optimizations now start to remove exception sites that were previously reported?

I believe the user who hooks JavaErrorThrow event wants to know the occurrence of OOME.
Let's think about the user who want to restart the system automatically if the fatal error happens.
The user defines OOME as an event that should be restarted and monitored by remote recording. If JavaErrorThrow event for OOME cannot be hooked, it is meaningless.

I understand all `Throwable` s cannot be hooked due to C2 optimization, but most of them should be hooked. They should be hooked as well as unified logging at least, and we can do it like this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8256916: Add JFR event for OutOfMemoryError

Yasumasa Suenaga-7
On Tue, 19 Jan 2021 01:35:20 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> As you say: "move throwing events into HotSpot makes big change" - I agree and Erik has already stated that many have tried already to make this kind of change, but attempts have faltered because the impact / disruptions are unclear. Do you fully understand the consequences of how it will change the way users work with exceptions in JFR? If C2 optimizations now start to remove exception sites that were previously reported?
>>
>> I understand and appreciate the ambition, and I acknowledge the existing mechanisms has drawbacks (known for a long time) but reworking how exceptions are reported is a big project. Such a project should also involve exception throttling - and throttling might pose additional constraints / opportunities that need to be considered.
>>
>> Modifying this subsystem is a sensitive undertaking and should solve the fundamental and already known problems before it is to be endorsed and for the disruption to end user motivated.
>
>> As you say: "move throwing events into HotSpot makes big change" - I agree and Erik has already stated that many have tried already to make this kind of change, but attempts have faltered because the impact / disruptions are unclear. Do you fully understand the consequences of how it will change the way users work with exceptions in JFR? If C2 optimizations now start to remove exception sites that were previously reported?
>
> I believe the user who hooks JavaErrorThrow event wants to know the occurrence of OOME.
> Let's think about the user who want to restart the system automatically if the fatal error happens.
> The user defines OOME as an event that should be restarted and monitored by remote recording. If JavaErrorThrow event for OOME cannot be hooked, it is meaningless.
>
> I understand all `Throwable` s cannot be hooked due to C2 optimization, but most of them should be hooked. They should be hooked as well as unified logging at least, and we can do it like this PR.

I said in before, it is big change, but JFR should hook exceptions as possible like unified logging with `exceptions` tag.

I understand some exceptions cannot be hooked due to JIT compilation in this discussion, but even if so, it is a problem that JFR cannot hook various exceptions which are thrown in HotSpot.

Again, I understand to fix this problem is very difficult, but we should work for it as possible, or we cannot analyze the relation between exceptions and various metrics in flight record.
Of course I agree to separate to smaller fix.

Anyway, I want to record exceptions in JFR. It is a problem not to be able to hook critical errors such as OutOfMemoryError and StackOverflowError.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1403