RFR: 8264482: container info misleads on non-container environment

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.

container (cgroup) information:
container_type: cgroupv2
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 164163584
memory_max_usage_in_bytes: not supported

We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.

cgroup information:
cgroup_type: cgroupv2
container runtime: podman
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 256176128
memory_max_usage_in_bytes: not supported

In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.

https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

-------------

Commit messages:
 - 8264482: container info misleads on non-container environment

Changes: https://git.openjdk.java.net/jdk/pull/3280/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3280&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264482
  Stats: 44 lines in 3 files changed: 34 ins; 0 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3280.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3280/head:pull/3280

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Severin Gehwolf-3
On Wed, 31 Mar 2021 06:24:06 GMT, Yasumasa Suenaga <[hidden email]> wrote:

> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>
> container (cgroup) information:
> container_type: cgroupv2
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 164163584
> memory_max_usage_in_bytes: not supported
>
> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>
> cgroup information:
> cgroup_type: cgroupv2
> container runtime: podman
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 256176128
> memory_max_usage_in_bytes: not supported
>
> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>
> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

I'm a bit nervous about the container_runtime addition. Can this be done in a separate bug, please? What testing has been done? Has this been tested with docker and podman? Did you run tests in `test/hotspot/jtreg/containers` ?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Severin Gehwolf-3
In reply to this post by Yasumasa Suenaga-7
On Wed, 31 Mar 2021 06:24:06 GMT, Yasumasa Suenaga <[hidden email]> wrote:

> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>
> container (cgroup) information:
> container_type: cgroupv2
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 164163584
> memory_max_usage_in_bytes: not supported
>
> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>
> cgroup information:
> cgroup_type: cgroupv2
> container runtime: podman
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 256176128
> memory_max_usage_in_bytes: not supported
>
> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>
> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

src/hotspot/os/linux/osContainer_linux.cpp line 75:

> 73:   if (getpid() == 1) {
> 74:     // This process is in container
> 75:     _runtime = os::strdup_check_oom(getenv("container"));

In my testing this shows `oci`:

$ podman run --rm -ti fedora:33
[root@2322a30ef7cd /]# echo $container
oci

So I'm not sure this will be very helpful. Systemd does some fairly involved translation:
https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L677

Those heuristics will involve a partial implementation of https://bugs.openjdk.java.net/browse/JDK-8261242

Also consider that there are multiple container runtimes when podman is in use (I don't know about docker). For example `crun` and `runc`. In a way, container runtime then becomes ambiguous too.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

David Holmes
In reply to this post by Yasumasa Suenaga-7
On 31/03/2021 7:51 pm, Yasumasa Suenaga wrote:

> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>
> container (cgroup) information:
> container_type: cgroupv2
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 164163584
> memory_max_usage_in_bytes: not supported
>
> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>
> cgroup information:
> cgroup_type: cgroupv2
> container runtime: podman
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 256176128
> memory_max_usage_in_bytes: not supported
>
> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>
> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

Our container support is based around cgroups. The actual containers are
still too ad-hoc to reliably interact with. I would not want to see this
additional code added at startup up time, but rather the container
environment should be interrogated when the information is desired. It
really bugs me that there are no (even informally) standardized API's
around containers and we have to provide custom support to deal with
each of them.

David
-----

> -------------
>
> Commit messages:
>   - 8264482: container info misleads on non-container environment
>
> Changes: https://git.openjdk.java.net/jdk/pull/3280/files
>   Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3280&range=00
>    Issue: https://bugs.openjdk.java.net/browse/JDK-8264482
>    Stats: 44 lines in 3 files changed: 34 ins; 0 del; 10 mod
>    Patch: https://git.openjdk.java.net/jdk/pull/3280.diff
>    Fetch: git fetch https://git.openjdk.java.net/jdk pull/3280/head:pull/3280
>
> PR: https://git.openjdk.java.net/jdk/pull/3280
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
In reply to this post by Severin Gehwolf-3
On Wed, 31 Mar 2021 11:57:40 GMT, Severin Gehwolf <[hidden email]> wrote:

>> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>>
>> container (cgroup) information:
>> container_type: cgroupv2
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 164163584
>> memory_max_usage_in_bytes: not supported
>>
>> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
>> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>>
>> cgroup information:
>> cgroup_type: cgroupv2
>> container runtime: podman
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 256176128
>> memory_max_usage_in_bytes: not supported
>>
>> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>>
>> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619
>
> I'm a bit nervous about the container_runtime addition. Can this be done in a separate bug, please? What testing has been done? Has this been tested with docker and podman? Did you run tests in `test/hotspot/jtreg/containers` ?

Thanks @jerboaa and @dholmes-ora for the comment!
I haven't aware JDK-8261242, so I will remove the code to set container runtime name.

The reason why I sent this PR is I found "container_type" in hs_err log even though I did not run JVM on the container. It was confising. I agree to check cgroups configuration, but we should not use "container" for them. cgroups is not only for containers.

> Did you run tests in test/hotspot/jtreg/containers ?

I haven't run them yet. I attempted to fix `OSContainer::is_containerized()` as JDK-8261242 said, but I do not have environment to run test/hotspot/jtreg/containers test. So I fixed UL output only, and confirmed it does not affect their tests.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
In reply to this post by Severin Gehwolf-3
On Wed, 31 Mar 2021 12:22:36 GMT, Severin Gehwolf <[hidden email]> wrote:

>> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>>
>> container (cgroup) information:
>> container_type: cgroupv2
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 164163584
>> memory_max_usage_in_bytes: not supported
>>
>> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
>> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>>
>> cgroup information:
>> cgroup_type: cgroupv2
>> container runtime: podman
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 256176128
>> memory_max_usage_in_bytes: not supported
>>
>> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>>
>> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619
>
> src/hotspot/os/linux/osContainer_linux.cpp line 75:
>
>> 73:   if (getpid() == 1) {
>> 74:     // This process is in container
>> 75:     _runtime = os::strdup_check_oom(getenv("container"));
>
> In my testing this shows `oci`:
>
> $ podman run --rm -ti fedora:33
> [root@2322a30ef7cd /]# echo $container
> oci
>
> So I'm not sure this will be very helpful. Systemd does some fairly involved translation:
> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L677
>
> Those heuristics will involve a partial implementation of https://bugs.openjdk.java.net/browse/JDK-8261242
>
> Also consider that there are multiple container runtimes when podman is in use (I don't know about docker). For example `crun` and `runc`. In a way, container runtime then becomes ambiguous too.

I think it is ok if we can know the process was run on a container. It is better to know what runtime was used if possible - it might not be mandatory. So I think we can refer /run/.containerenv and /.dockerenv now like systemd. It might not be enough, but it would be acceptable.

JDK-8261242 has been assigned to Harold, so I will not start to work for it now.
(I can work for it of course if you agree with the above.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Yasumasa Suenaga-7
In reply to this post by Yasumasa Suenaga-7
> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>
> container (cgroup) information:
> container_type: cgroupv2
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 164163584
> memory_max_usage_in_bytes: not supported
>
> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>
> cgroup information:
> cgroup_type: cgroupv2
> container runtime: podman
> cpu_cpuset_cpus: not supported
> cpu_memory_nodes: not supported
> active_processor_count: 4
> cpu_quota: not supported
> cpu_period: not supported
> cpu_shares: not supported
> memory_limit_in_bytes: unlimited
> memory_and_swap_limit_in_bytes: unlimited
> memory_soft_limit_in_bytes: unlimited
> memory_usage_in_bytes: 256176128
> memory_max_usage_in_bytes: not supported
>
> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>
> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:

  Remove code to detect container runtime

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3280/files
  - new: https://git.openjdk.java.net/jdk/pull/3280/files/42862529..8fd9b3f9

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3280&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3280&range=00-01

  Stats: 38 lines in 3 files changed: 0 ins; 34 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3280.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3280/head:pull/3280

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
In reply to this post by Yasumasa Suenaga-7
On Wed, 31 Mar 2021 14:05:37 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> I'm a bit nervous about the container_runtime addition. Can this be done in a separate bug, please? What testing has been done? Has this been tested with docker and podman? Did you run tests in `test/hotspot/jtreg/containers` ?
>
> Thanks @jerboaa and @dholmes-ora for the comment!
> I haven't aware JDK-8261242, so I will remove the code to set container runtime name.
>
> The reason why I sent this PR is I found "container_type" in hs_err log even though I did not run JVM on the container. It was confising. I agree to check cgroups configuration, but we should not use "container" for them. cgroups is not only for containers.
>
>> Did you run tests in test/hotspot/jtreg/containers ?
>
> I haven't run them yet. I attempted to fix `OSContainer::is_containerized()` as JDK-8261242 said, but I do not have environment to run test/hotspot/jtreg/containers test. So I fixed UL output only, and confirmed it does not affect their tests.

I removed container runtime detection code - this PR changes log / hs_err / VM.info message only at result.
How about this? please review it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
On Thu, 1 Apr 2021 13:00:32 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> Thanks @jerboaa and @dholmes-ora for the comment!
>> I haven't aware JDK-8261242, so I will remove the code to set container runtime name.
>>
>> The reason why I sent this PR is I found "container_type" in hs_err log even though I did not run JVM on the container. It was confising. I agree to check cgroups configuration, but we should not use "container" for them. cgroups is not only for containers.
>>
>>> Did you run tests in test/hotspot/jtreg/containers ?
>>
>> I haven't run them yet. I attempted to fix `OSContainer::is_containerized()` as JDK-8261242 said, but I do not have environment to run test/hotspot/jtreg/containers test. So I fixed UL output only, and confirmed it does not affect their tests.
>
> I removed container runtime detection code - this PR changes log / hs_err / VM.info message only at result.
> How about this? please review it.

Commit 8fd9b3f95d40783ef3b70769ba0381f55df832a5 seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Gerard Ziemski-2
On Fri, 2 Apr 2021 00:05:40 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> I removed container runtime detection code - this PR changes log / hs_err / VM.info message only at result.
>> How about this? please review it.
>
> Commit 8fd9b3f95d40783ef3b70769ba0381f55df832a5 seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

> Commit [8fd9b3f](https://github.com/openjdk/jdk/commit/8fd9b3f95d40783ef3b70769ba0381f55df832a5) seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

I'll check it out on my local macOS machine and report back...

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Gerard Ziemski-2
On Mon, 5 Apr 2021 16:22:13 GMT, Gerard Ziemski <[hidden email]> wrote:

> > Commit [8fd9b3f](https://github.com/openjdk/jdk/commit/8fd9b3f95d40783ef3b70769ba0381f55df832a5) seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.
>
> I'll check it out on my local macOS machine and report back...

On my local mac with the latest jdk+PR3280 I see all tests pass:

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR  
   jtreg:test/hotspot/jtreg:tier1_compiler             783   783     0     0  
==============================
TEST SUCCESS

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment

Yasumasa Suenaga-7
On Mon, 5 Apr 2021 17:56:35 GMT, Gerard Ziemski <[hidden email]> wrote:

>>> Commit [8fd9b3f](https://github.com/openjdk/jdk/commit/8fd9b3f95d40783ef3b70769ba0381f55df832a5) seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.
>>
>> I'll check it out on my local macOS machine and report back...
>
>> > Commit [8fd9b3f](https://github.com/openjdk/jdk/commit/8fd9b3f95d40783ef3b70769ba0381f55df832a5) seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.
>>
>> I'll check it out on my local macOS machine and report back...
>
> On my local mac with the latest jdk+PR3280 I see all tests pass:
>
> ==============================
> Test summary
> ==============================
>    TEST                                              TOTAL  PASS  FAIL ERROR  
>    jtreg:test/hotspot/jtreg:tier1_compiler             783   783     0     0  
> ==============================
> TEST SUCCESS

Thanks @gerard-ziemski !
I'm waiting for reviewers...

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

David Holmes-2
In reply to this post by Yasumasa Suenaga-7
On Thu, 1 Apr 2021 13:03:50 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.
>>
>> container (cgroup) information:
>> container_type: cgroupv2
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 164163584
>> memory_max_usage_in_bytes: not supported
>>
>> We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
>> And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.
>>
>> cgroup information:
>> cgroup_type: cgroupv2
>> container runtime: podman
>> cpu_cpuset_cpus: not supported
>> cpu_memory_nodes: not supported
>> active_processor_count: 4
>> cpu_quota: not supported
>> cpu_period: not supported
>> cpu_shares: not supported
>> memory_limit_in_bytes: unlimited
>> memory_and_swap_limit_in_bytes: unlimited
>> memory_soft_limit_in_bytes: unlimited
>> memory_usage_in_bytes: 256176128
>> memory_max_usage_in_bytes: not supported
>>
>> In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.
>>
>> https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619
>
> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:
>
>   Remove code to detect container runtime

Sorry Yasumasa but I don't agree with this. OSContainer is supposed to be the abstraction representing a containerized environment. Calling code doesn't know (nor care) what the underlying mechanism is - maybe it is cgroups, maybe it isn't. If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.

David

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Yasumasa Suenaga-7
On Tue, 6 Apr 2021 01:38:50 GMT, David Holmes <[hidden email]> wrote:

> If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.

To resolve this problem, I think [JDK-8261242](https://bugs.openjdk.java.net/browse/JDK-8261242) is filed. But the discussion does not seem to active...  it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.

As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
As an option, we can move logging code into `CgroupSubsystem`, but it might not be better...

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

David Holmes
On 6/04/2021 12:31 pm, Yasumasa Suenaga wrote:
> On Tue, 6 Apr 2021 01:38:50 GMT, David Holmes <[hidden email]> wrote:
>
>> If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.
>
> To resolve this problem, I think [JDK-8261242](https://bugs.openjdk.java.net/browse/JDK-8261242) is filed. But the discussion does not seem to active...  it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.
>
> As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
> As an option, we can move logging code into `CgroupSubsystem`, but it might not be better...

If you are actually running under a cgroup and have resource limits
applied then aren't you effectively running in a "container"? its just
not a container that has been productized like Docker et al.

David


> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3280
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Yasumasa Suenaga-7
In reply to this post by Yasumasa Suenaga-7
On Tue, 6 Apr 2021 02:28:46 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> Sorry Yasumasa but I don't agree with this. OSContainer is supposed to be the abstraction representing a containerized environment. Calling code doesn't know (nor care) what the underlying mechanism is - maybe it is cgroups, maybe it isn't. If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.
>>
>> David
>
>> If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.
>
> To resolve this problem, I think [JDK-8261242](https://bugs.openjdk.java.net/browse/JDK-8261242) is filed. But the discussion does not seem to active...  it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.
>
> As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
> As an option, we can move logging code into `CgroupSubsystem`, but it might not be better...

> If you are actually running under a cgroup and have resource limits
> applied then aren't you effectively running in a "container"?

No, we can see container logs even if the process is not limited by cgroups. I (and maybe JDK-8261242) want to say it is a problem.

HotSpot detects whether it is run on container in following code. It checks whether specified directories exist.

https://github.com/openjdk/jdk/blob/54b4070da76e79597a57412a39b85660dc49ce7c/src/hotspot/os/linux/cgroupSubsystem_linux.cpp#L44-L48

According to [cgoups manpage](https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES), `/proc/cgroups` seems to generated by kernel, so we cannot use them to detect.
I think we should fix like systemd, and JDK-8261242 has been discussed on to do so.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Severin Gehwolf-3
On Tue, 6 Apr 2021 03:26:52 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>>> If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.
>>
>> To resolve this problem, I think [JDK-8261242](https://bugs.openjdk.java.net/browse/JDK-8261242) is filed. But the discussion does not seem to active...  it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.
>>
>> As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
>> As an option, we can move logging code into `CgroupSubsystem`, but it might not be better...
>
>> If you are actually running under a cgroup and have resource limits
>> applied then aren't you effectively running in a "container"?
>
> No, we can see container logs even if the process is not limited by cgroups. I (and maybe JDK-8261242) want to say it is a problem.
>
> HotSpot detects whether it is run on container in following code. It checks whether specified directories exist.
>
> https://github.com/openjdk/jdk/blob/54b4070da76e79597a57412a39b85660dc49ce7c/src/hotspot/os/linux/cgroupSubsystem_linux.cpp#L44-L48
>
> According to [cgoups manpage](https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES), `/proc/cgroups` seems to generated by kernel, so we cannot use them to detect.
> I think we should fix like systemd, and JDK-8261242 has been discussed on to do so.

@YaSuenag Unfortunately that's the status quo, yes. On a plain linux system without cgroup limits for the java process, but with the cgroup subsystem mounted, it'll report some cgroup (container) metrics (same as the host values, though). What I'm unsure about is whether or not this is a problem worth fixing. The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Yasumasa Suenaga-7
On Tue, 6 Apr 2021 10:06:18 GMT, Severin Gehwolf <[hidden email]> wrote:

> What I'm unsure about is whether or not this is a problem worth fixing.

I think fundamental solutions will be implemented in JDK-8261242, but It seems to be a difficult task.

> The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.

I know that, but "container_type" label is in hs_err / VM.info log. In addition, UL reports some metrics as "container" in spite of cgroups.
If it takes a long time to fix JDK-8261242, I want to fix log messages for cgroups for next LTS release at least. "container" messages may make the customer do extra work to collect / analyzer their system.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

David Holmes
On 6/04/2021 10:09 pm, Yasumasa Suenaga wrote:

> On Tue, 6 Apr 2021 10:06:18 GMT, Severin Gehwolf <[hidden email]> wrote:
>
>> What I'm unsure about is whether or not this is a problem worth fixing.
>
> I think fundamental solutions will be implemented in JDK-8261242, but It seems to be a difficult task.
>
>> The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.
>
> I know that, but "container_type" label is in hs_err / VM.info log. In addition, UL reports some metrics as "container" in spite of cgroups.
> If it takes a long time to fix JDK-8261242, I want to fix log messages for cgroups for next LTS release at least. "container" messages may make the customer do extra work to collect / analyzer their system.

Then can we compromise and just say "container/cgroup" everywhere?

David

> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3280
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264482: container info misleads on non-container environment [v2]

Yasumasa Suenaga-7
In reply to this post by Yasumasa Suenaga-7
On Tue, 6 Apr 2021 12:06:37 GMT, Yasumasa Suenaga <[hidden email]> wrote:

>> @YaSuenag Unfortunately that's the status quo, yes. On a plain linux system without cgroup limits for the java process, but with the cgroup subsystem mounted, it'll report some cgroup (container) metrics (same as the host values, though). What I'm unsure about is whether or not this is a problem worth fixing. The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.
>
>> What I'm unsure about is whether or not this is a problem worth fixing.
>
> I think fundamental solutions will be implemented in JDK-8261242, but It seems to be a difficult task.
>
>> The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.
>
> I know that, but "container_type" label is in hs_err / VM.info log. In addition, UL reports some metrics as "container" in spite of cgroups.
> If it takes a long time to fix JDK-8261242, I want to fix log messages for cgroups for next LTS release at least. "container" messages may make the customer do extra work to collect / analyzer their system.

> Then can we compromise and just say "container/cgroup" everywhere?

How about following change?
It is easy to add new resource controller other than cgroups in future.

// get_resource_container_name() will return "cgroups" or "cgroupsV2"
const char *resource_controller_name = OSContainer::get_resource_container_name();

log_debug(os, container)("%s memory limit %s: " JLONG_FORMAT ", using host value",
                       resource_controller_name,
                       mem_limit == OSCONTAINER_ERROR ? "failed" : "unlimited", mem_limit);

-------------

PR: https://git.openjdk.java.net/jdk/pull/3280
12