RFR: 8203359: Container level resources events

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8203359: Container level resources events

Jaroslav Bachorik-3
With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.

Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
* CPU related metrics
* Memory related metrics
* I/O related metrics

For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

-------------

Commit messages:
 - Formatting spaces
 - 8203359: Container level resources events

Changes: https://git.openjdk.java.net/jdk/pull/3126/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8203359
  Stats: 389 lines in 8 files changed: 386 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v2]

Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Split off the CPU throttling metrics

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/9ba3c5a6..90de0d8c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=00-01

  Stats: 88 lines in 3 files changed: 73 ins; 15 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v3]

Jaroslav Bachorik-3
In reply to this post by Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Update the JFR control files

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/90de0d8c..82570022

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=01-02

  Stats: 10 lines in 2 files changed: 10 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events

Severin Gehwolf-3
In reply to this post by Jaroslav Bachorik-3
On Mon, 22 Mar 2021 15:57:12 GMT, Jaroslav Bachorik <[hidden email]> wrote:

> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

@jbachorik Would it make sense for `ContainerConfigurationEvent` to include the underlying cgroup version info (v1 or legacy vs. v2 or unified)? `Metrics.getProvider()` should give that info.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events

Erik Gahlin-2
On Wed, 24 Mar 2021 18:39:06 GMT, Severin Gehwolf <[hidden email]> wrote:

>> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>>
>> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
>> * CPU related metrics
>> * Memory related metrics
>> * I/O related metrics
>>
>> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
>> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).
>
> @jbachorik Would it make sense for `ContainerConfigurationEvent` to include the underlying cgroup version info (v1 or legacy vs. v2 or unified)? `Metrics.getProvider()` should give that info.

Does each getter call result in parsing /proc, or do things aggregated over several calls or hooks?

Do you have any data how expensive the invocations are?

You could for example try to measure it by temporary making the events durational, and fetch the values between begin() and end(), and perhaps show a 'jfr print --events Container* recording.jfr' printout.

If possible, it would be interesting to get some idea about the startup cost as well

If not too much overhead, I think it would be nice to skip the "flag" in the .jfcs, and always record the events in a container environment.

I know there is a way to test JFR using Docker, maybe @mseledts could provide information? Some sanity tests would be good to have.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events

Severin Gehwolf-3
On Thu, 25 Mar 2021 23:28:18 GMT, Erik Gahlin <[hidden email]> wrote:

> Does each getter call result in parsing /proc, or do things aggregated over several calls or hooks?

From the looks of it the event emitting code uses `Metrics.java` interface for retrieving the info. Each call to a method exposed by Metrics result in file IO on some cgroup (v1 or v2) interface file(s) in `/sys/fs/...`. I don't see any aggregation being done.

On the hotspot side, we implemented some caching for frequent calls (JDK-8232207, JDK-8227006), but we didn't do that yet for the Java side since there wasn't any need (so far). If calls are becoming frequent with this it should be reconsidered.

So +1 on getting some data on what the perf penalty of this is.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v4]

Jaroslav Bachorik-3
In reply to this post by Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with two additional commits since the last revision:

 - Remove unused test files
 - Initial test support for JFR container events

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/82570022..79c91f57

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=02-03

  Stats: 117 lines in 4 files changed: 99 ins; 6 del; 12 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events

Jaroslav Bachorik-3
In reply to this post by Severin Gehwolf-3
On Fri, 26 Mar 2021 11:05:56 GMT, Severin Gehwolf <[hidden email]> wrote:

>> Does each getter call result in parsing /proc, or do things aggregated over several calls or hooks?
>>
>> Do you have any data how expensive the invocations are?
>>
>> You could for example try to measure it by temporary making the events durational, and fetch the values between begin() and end(), and perhaps show a 'jfr print --events Container* recording.jfr' printout.
>>
>> If possible, it would be interesting to get some idea about the startup cost as well
>>
>> If not too much overhead, I think it would be nice to skip the "flag" in the .jfcs, and always record the events in a container environment.
>>
>> I know there is a way to test JFR using Docker, maybe @mseledts could provide information? Some sanity tests would be good to have.
>
>> Does each getter call result in parsing /proc, or do things aggregated over several calls or hooks?
>
> From the looks of it the event emitting code uses `Metrics.java` interface for retrieving the info. Each call to a method exposed by Metrics result in file IO on some cgroup (v1 or v2) interface file(s) in `/sys/fs/...`. I don't see any aggregation being done.
>
> On the hotspot side, we implemented some caching for frequent calls (JDK-8232207, JDK-8227006), but we didn't do that yet for the Java side since there wasn't any need (so far). If calls are becoming frequent with this it should be reconsidered.
>
> So +1 on getting some data on what the perf penalty of this is.

Thanks to all for chiming in!

I have added the tests to `test/hotspot/jtreg/containers/docker/TestJFREvents.java` where there already were some templates for the container event data.

As for the performance - as expected, extracting the data from `/proc` is not exactly cheap. On my test c5.4xlarge instance I am getting an average wall-clock time to generate the usage/throttling events (one instance of each) of ~15ms.
I would argue that 15ms per 30s (the default emission period for those events) might be acceptable to start with.

Caching of cgroups parsed data would help if the emission period is shorter than the cache TTL. This is exacerbated by the fact that (almost) each container event type requires data from a different cgroups control file - hence the data will not be shared between the event type instances even if cached. Realistically, caching benefits would become visible only for sub-second emission periods.

If the caching is still required I would suggest having a follow up ticket just for that - it will require setting up some benchmarks to justify the changes that would need to be done in the metrics implementation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events

Jaroslav Bachorik-3
On Thu, 1 Apr 2021 15:55:59 GMT, Jaroslav Bachorik <[hidden email]> wrote:

>>> Does each getter call result in parsing /proc, or do things aggregated over several calls or hooks?
>>
>> From the looks of it the event emitting code uses `Metrics.java` interface for retrieving the info. Each call to a method exposed by Metrics result in file IO on some cgroup (v1 or v2) interface file(s) in `/sys/fs/...`. I don't see any aggregation being done.
>>
>> On the hotspot side, we implemented some caching for frequent calls (JDK-8232207, JDK-8227006), but we didn't do that yet for the Java side since there wasn't any need (so far). If calls are becoming frequent with this it should be reconsidered.
>>
>> So +1 on getting some data on what the perf penalty of this is.
>
> Thanks to all for chiming in!
>
> I have added the tests to `test/hotspot/jtreg/containers/docker/TestJFREvents.java` where there already were some templates for the container event data.
>
> As for the performance - as expected, extracting the data from `/proc` is not exactly cheap. On my test c5.4xlarge instance I am getting an average wall-clock time to generate the usage/throttling events (one instance of each) of ~15ms.
> I would argue that 15ms per 30s (the default emission period for those events) might be acceptable to start with.
>
> Caching of cgroups parsed data would help if the emission period is shorter than the cache TTL. This is exacerbated by the fact that (almost) each container event type requires data from a different cgroups control file - hence the data will not be shared between the event type instances even if cached. Realistically, caching benefits would become visible only for sub-second emission periods.
>
> If the caching is still required I would suggest having a follow up ticket just for that - it will require setting up some benchmarks to justify the changes that would need to be done in the metrics implementation.

I tried to measure the startup regression and here are my observations:
* Startup is not affected unless the application is started with JFR
* The extra events and hooks take ~5ms on my work machine
* It is possible not to register those events and hooks in a non-container env - then the overhead is 20-50us which it takes to figure out whether running in container

In order to minimize the effect this change will have on the startup I would suggest using conditional registration unless I hear strong objections to that.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v5]

Jaroslav Bachorik-3
In reply to this post by Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Report container type and register events conditionally

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/79c91f57..2514b1a7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=03-04

  Stats: 41 lines in 2 files changed: 24 ins; 8 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v6]

Jaroslav Bachorik-3
In reply to this post by Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Doh

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/2514b1a7..78dc85d3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=04-05

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8203359: Container level resources events [v7]

Jaroslav Bachorik-3
In reply to this post by Jaroslav Bachorik-3
> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>
> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
> * CPU related metrics
> * Memory related metrics
> * I/O related metrics
>
> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval.
> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Remove trailing spaces

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3126/files
  - new: https://git.openjdk.java.net/jdk/pull/3126/files/78dc85d3..f4372e23

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3126&range=05-06

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3126/head:pull/3126

PR: https://git.openjdk.java.net/jdk/pull/3126