RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

Jie Fu-2
Hi all,

jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
This is because testMemoryFailCount [1] fails due to OOM killed.
This test fails to avoid OOM killed [2] if memory.failcnt is always 0.

The fix will print "Not OOM killed" if OOM killed doesn't happen.
And also fix another bug if the test get returned here [3].

Testing:
  - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
[2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
[3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96

-------------

Commit messages:
 - 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

Changes: https://git.openjdk.java.net/jdk/pull/3286/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3286&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264524
  Stats: 14 lines in 2 files changed: 11 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3286.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3286/head:pull/3286

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

Severin Gehwolf-3
On Wed, 31 Mar 2021 15:53:28 GMT, Jie Fu <[hidden email]> wrote:

> Hi all,
>
> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
> This is because testMemoryFailCount [1] fails due to OOM killed.
> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>
> The fix will print "Not OOM killed" if OOM killed doesn't happen.
> And also fix another bug if the test get returned here [3].
>
> Testing:
>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96

test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java line 120:

> 118:                 oa.shouldHaveExitValue(0).shouldContain("TEST PASSED!!!");
> 119:             }
> 120:         }

Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?

What is the actual test output on those systems where the test fails? There should be a docker log file. Does it enter line 91?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

Jie Fu-2
On Wed, 31 Mar 2021 17:14:32 GMT, Severin Gehwolf <[hidden email]> wrote:

>> Hi all,
>>
>> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
>> This is because testMemoryFailCount [1] fails due to OOM killed.
>> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>>
>> The fix will print "Not OOM killed" if OOM killed doesn't happen.
>> And also fix another bug if the test get returned here [3].
>>
>> Testing:
>>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
>> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96
>
> test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java line 120:
>
>> 118:                 oa.shouldHaveExitValue(0).shouldContain("TEST PASSED!!!");
>> 119:             }
>> 120:         }
>
> Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?
>
> What is the actual test output on those systems where the test fails? There should be a docker log file. Does it enter line 91?

Thanks @jerboaa for your review.

Test output when failing is just:
[failcount]

It doesn't enter line 91 of test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java.


I assume memory.failcnt has been disabled on these platforms since it's always 0.
What about checking the value of memory.failcnt after testMemoryFailCount like:
// check after testMemoryFailCount()
if (memory.failcnt is zero) {
  // memory.failcnt has been disabled
  pass
} else {
  // a broken implementation of Metrics.systemMetrics().getMemoryFailCount()
  fail
}
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed

Jie Fu-2
In reply to this post by Severin Gehwolf-3
On Wed, 31 Mar 2021 17:14:32 GMT, Severin Gehwolf <[hidden email]> wrote:

>> Hi all,
>>
>> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
>> This is because testMemoryFailCount [1] fails due to OOM killed.
>> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>>
>> The fix will print "Not OOM killed" if OOM killed doesn't happen.
>> And also fix another bug if the test get returned here [3].
>>
>> Testing:
>>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
>> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96
>
> test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java line 120:
>
>> 118:                 oa.shouldHaveExitValue(0).shouldContain("TEST PASSED!!!");
>> 119:             }
>> 120:         }
>
> Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?
>
> What is the actual test output on those systems where the test fails? There should be a docker log file. Does it enter line 91?

Hi @jerboaa ,

After more thinking, I think the reason why memory.failcnt is always 0 is that there is no swap space on the host machine.
So the testMemoryFailCount should be skipped in that case.

But is there any API which can be used to get the swap space size of the host machine?

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed [v2]

Jie Fu-2
In reply to this post by Jie Fu-2
> Hi all,
>
> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
> This is because testMemoryFailCount [1] fails due to OOM killed.
> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>
> The fix will print "Not OOM killed" if OOM killed doesn't happen.
> And also fix another bug if the test get returned here [3].
>
> Testing:
>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:

  Check swapping before testing

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3286/files
  - new: https://git.openjdk.java.net/jdk/pull/3286/files/09468a16..3e1c446d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3286&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3286&range=00-01

  Stats: 28 lines in 2 files changed: 16 ins; 6 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3286.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3286/head:pull/3286

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v2]

Jie Fu-2
In reply to this post by Jie Fu-2
On Thu, 1 Apr 2021 04:44:48 GMT, Jie Fu <[hidden email]> wrote:

>> test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java line 120:
>>
>>> 118:                 oa.shouldHaveExitValue(0).shouldContain("TEST PASSED!!!");
>>> 119:             }
>>> 120:         }
>>
>> Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?
>>
>> What is the actual test output on those systems where the test fails? There should be a docker log file. Does it enter line 91?
>
> Hi @jerboaa ,
>
> After more thinking, I think the reason why memory.failcnt is always 0 is that there is no swap space on the host machine.
> So the testMemoryFailCount should be skipped in that case.
>
> But is there any API which can be used to get the swap space size of the host machine?
>
> Thanks.

> Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?

Hi @jerboaa ,

A pre-test run has been added to check whether swapping really works for testMemoryFailCount.
Swapping should be OK for memory.failcnt testing otherwise it will fail due to OOM killed.

What do you think?
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v2]

Severin Gehwolf-3
On Fri, 2 Apr 2021 04:10:47 GMT, Jie Fu <[hidden email]> wrote:

>> Hi @jerboaa ,
>>
>> After more thinking, I think the reason why memory.failcnt is always 0 is that there is no swap space on the host machine.
>> So the testMemoryFailCount should be skipped in that case.
>>
>> But is there any API which can be used to get the swap space size of the host machine?
>>
>> Thanks.
>
>> Consider a broken implementation of `Metrics.systemMetrics().getMemoryFailCount()`. Wouldn't the test now (falsely) pass?
>
> Hi @jerboaa ,
>
> A pre-test run has been added to check whether swapping really works for testMemoryFailCount.
> Swapping should be OK for memory.failcnt testing otherwise it will fail due to OOM killed.
>
> What do you think?
> Thanks.

@DamonFool Hmm, if swap not working is the issue the test shouldn't enter the branch which you say is failing. From `MetricsMemoryTester.java`:

    private static void testMemoryFailCount() {
        long memAndSwapLimit = Metrics.systemMetrics().getMemoryAndSwapLimit();
        long memLimit = Metrics.systemMetrics().getMemoryLimit();

        // We need swap to execute this test or will SEGV
        if (memAndSwapLimit <= memLimit) { // <=============== This is checking whether or not swap works
            System.out.println("No swap memory limits, test case skipped");
        } else {

It has been added with JDK-8250984. So now I'm even more confused what's going on here...

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v2]

Jie Fu-2
On Tue, 6 Apr 2021 09:40:39 GMT, Severin Gehwolf <[hidden email]> wrote:

> if (memAndSwapLimit <= memLimit)

Hi @jerboaa ,

Unfortunately, this check fails to work on a host machine which doesn't support swapping (e.g., total size of Swap is 0 byte).

You can still specify --memory and --memory-swap as you like on a host machine without swapping space.
And Metrics.systemMetrics().getMemoryLimit()/getMemoryAndSwapLimit() do return the values as you specified.
But the swapping will never happen since there is no swapping space on the host machine.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v2]

Severin Gehwolf-3
In reply to this post by Jie Fu-2
On Fri, 2 Apr 2021 04:03:45 GMT, Jie Fu <[hidden email]> wrote:

>> Hi all,
>>
>> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
>> This is because testMemoryFailCount [1] fails due to OOM killed.
>> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>>
>> The fix will print "Not OOM killed" if OOM killed doesn't happen.
>> And also fix another bug if the test get returned here [3].
>>
>> Testing:
>>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
>> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96
>
> Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
>
>   Check swapping before testing

Changes requested by sgehwolf (Reviewer).

test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java line 106:

> 104:         Common.logNewTestCase("testMemoryFailCount" + value);
> 105:
> 106:         // Check whether swapping really works for this test

Please explain what "swapping not working" actually means in this comment. One version of it is already handled via JDK-8250984 so this is sort-of ambiguous. Suggestion: "On some systems there is no swap space enabled. On those systems running `java -version`??? with a memory limit fails due to swap space size being 0". Or something like that.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v3]

Jie Fu-2
In reply to this post by Jie Fu-2
> Hi all,
>
> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
> This is because testMemoryFailCount [1] fails due to OOM killed.
> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>
> The fix will print "Not OOM killed" if OOM killed doesn't happen.
> And also fix another bug if the test get returned here [3].
>
> Testing:
>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:

  Update the comment

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3286/files
  - new: https://git.openjdk.java.net/jdk/pull/3286/files/3e1c446d..101b90d0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3286&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3286&range=01-02

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3286.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3286/head:pull/3286

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v2]

Jie Fu-2
In reply to this post by Severin Gehwolf-3
On Tue, 6 Apr 2021 12:27:56 GMT, Severin Gehwolf <[hidden email]> wrote:

> Please explain what "swapping not working" actually means in this comment.

Updated.
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v3]

Severin Gehwolf-3
In reply to this post by Jie Fu-2
On Tue, 6 Apr 2021 13:11:54 GMT, Jie Fu <[hidden email]> wrote:

>> Hi all,
>>
>> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
>> This is because testMemoryFailCount [1] fails due to OOM killed.
>> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>>
>> The fix will print "Not OOM killed" if OOM killed doesn't happen.
>> And also fix another bug if the test get returned here [3].
>>
>> Testing:
>>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
>> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96
>
> Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update the comment

LGTM.

-------------

Marked as reviewed by sgehwolf (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working [v3]

Jie Fu-2
On Tue, 6 Apr 2021 13:45:27 GMT, Severin Gehwolf <[hidden email]> wrote:

> LGTM.

Thanks @jerboaa for your review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286
Reply | Threaded
Open this post in threaded view
|

Integrated: 8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working

Jie Fu-2
In reply to this post by Jie Fu-2
On Wed, 31 Mar 2021 15:53:28 GMT, Jie Fu <[hidden email]> wrote:

> Hi all,
>
> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
> This is because testMemoryFailCount [1] fails due to OOM killed.
> This test fails to avoid OOM killed [2] if memory.failcnt is always 0.
>
> The fix will print "Not OOM killed" if OOM killed doesn't happen.
> And also fix another bug if the test get returned here [3].
>
> Testing:
>   - jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
> [3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96

This pull request has now been integrated.

Changeset: bfb034ab
Author:    Jie Fu <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/bfb034ab
Stats:     25 lines in 1 file changed: 23 ins; 0 del; 2 mod

8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working

Reviewed-by: sgehwolf

-------------

PR: https://git.openjdk.java.net/jdk/pull/3286