RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

Kim Barrett-2
Please review this change to ParallelGC to avoid unnecessary full GCs when
concurrent threads attempt oldgen allocations during evacuation.

When a GC thread fails an oldgen allocation it expands the heap and retries
the allocation.  If the second allocation attempt fails then allocation
failure is reported to the caller, which can lead to a full GC.  But the
retried allocation could fail because, after expansion, some other thread
allocated enough of the available space that the retry fails.  This can
happen even though there is plenty of space available, if only that retry
were to perform another expansion.

Rather than trying to combine the allocation retry with the expansion (it's
not clear there's a way to do so without breaking invariants), we instead
simply loop on the allocation attempt + expand, until either the allocation
succeeds or the expand fails.  If some other thread "steals" space from the
expanding thread and causes its next allocation attempt to fail and do
another expansion, that's functionally no different from the expanding
thread succeeding and causing the other thread to fail allocation and do the
expand instead.

This change includes modifying PSOldGen::expand_to_reserved to return false
when there is no space available, where it previously returned true.  It's
not clear why it returned true; that seems wrong, but was harmless.  But it
must not do so with the new looping behavior for allocation, else it would
never terminate.

Testing:
mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

-------------

Commit messages:
 - retry failed allocation if expand succeeds

Changes: https://git.openjdk.java.net/jdk/pull/2309/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8260044
  Stats: 15 lines in 2 files changed: 5 ins; 0 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2309.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2309/head:pull/2309

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

Thomas Schatzl-4
On Fri, 29 Jan 2021 08:24:13 GMT, Kim Barrett <[hidden email]> wrote:

> Please review this change to ParallelGC to avoid unnecessary full GCs when
> concurrent threads attempt oldgen allocations during evacuation.
>
> When a GC thread fails an oldgen allocation it expands the heap and retries
> the allocation.  If the second allocation attempt fails then allocation
> failure is reported to the caller, which can lead to a full GC.  But the
> retried allocation could fail because, after expansion, some other thread
> allocated enough of the available space that the retry fails.  This can
> happen even though there is plenty of space available, if only that retry
> were to perform another expansion.
>
> Rather than trying to combine the allocation retry with the expansion (it's
> not clear there's a way to do so without breaking invariants), we instead
> simply loop on the allocation attempt + expand, until either the allocation
> succeeds or the expand fails.  If some other thread "steals" space from the
> expanding thread and causes its next allocation attempt to fail and do
> another expansion, that's functionally no different from the expanding
> thread succeeding and causing the other thread to fail allocation and do the
> expand instead.
>
> This change includes modifying PSOldGen::expand_to_reserved to return false
> when there is no space available, where it previously returned true.  It's
> not clear why it returned true; that seems wrong, but was harmless.  But it
> must not do so with the new looping behavior for allocation, else it would
> never terminate.
>
> Testing:
> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

src/hotspot/share/gc/parallel/psOldGen.hpp line 141:

> 139:     do {
> 140:       res = cas_allocate_noexpand(word_size);
> 141:       // Retry failed allocation if expand succeeds.

"... but allocation did not." would be nice to be added to this comment to be complete.

src/hotspot/share/gc/parallel/psOldGen.cpp line 192:

> 190: bool PSOldGen::expand(size_t bytes) {
> 191:   if (bytes == 0) {
> 192:     return true;

I'd prefer if the code would `guarantee` or at least `assert` that `bytes > 0` because returning `true` here seems scary wrt to the loop.

All code paths seem to cover this situation already, i.e. with `word_size == 0` this should not be called.

But if you think it's not a big issue, we can keep it. This is pre-existing of course.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

Kim Barrett-2
On Fri, 29 Jan 2021 10:08:53 GMT, Thomas Schatzl <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> src/hotspot/share/gc/parallel/psOldGen.hpp line 141:
>
>> 139:     do {
>> 140:       res = cas_allocate_noexpand(word_size);
>> 141:       // Retry failed allocation if expand succeeds.
>
> "... but allocation did not." would be nice to be added to this comment to be complete.

That's a "failed allocation".

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

Kim Barrett-2
In reply to this post by Thomas Schatzl-4
On Fri, 29 Jan 2021 10:15:28 GMT, Thomas Schatzl <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> src/hotspot/share/gc/parallel/psOldGen.cpp line 192:
>
>> 190: bool PSOldGen::expand(size_t bytes) {
>> 191:   if (bytes == 0) {
>> 192:     return true;
>
> I'd prefer if the code would `guarantee` or at least `assert` that `bytes > 0` because returning `true` here seems scary wrt to the loop.
>
> All code paths seem to cover this situation already, i.e. with `word_size == 0` this should not be called.
>
> But if you think it's not a big issue, we can keep it. This is pre-existing of course.

Good point.  I will make sure a 0 size never gets here and assert/guarantee, or otherwise figure out what to do.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Kim Barrett-2
In reply to this post by Kim Barrett-2
> Please review this change to ParallelGC to avoid unnecessary full GCs when
> concurrent threads attempt oldgen allocations during evacuation.
>
> When a GC thread fails an oldgen allocation it expands the heap and retries
> the allocation.  If the second allocation attempt fails then allocation
> failure is reported to the caller, which can lead to a full GC.  But the
> retried allocation could fail because, after expansion, some other thread
> allocated enough of the available space that the retry fails.  This can
> happen even though there is plenty of space available, if only that retry
> were to perform another expansion.
>
> Rather than trying to combine the allocation retry with the expansion (it's
> not clear there's a way to do so without breaking invariants), we instead
> simply loop on the allocation attempt + expand, until either the allocation
> succeeds or the expand fails.  If some other thread "steals" space from the
> expanding thread and causes its next allocation attempt to fail and do
> another expansion, that's functionally no different from the expanding
> thread succeeding and causing the other thread to fail allocation and do the
> expand instead.
>
> This change includes modifying PSOldGen::expand_to_reserved to return false
> when there is no space available, where it previously returned true.  It's
> not clear why it returned true; that seems wrong, but was harmless.  But it
> must not do so with the new looping behavior for allocation, else it would
> never terminate.
>
> Testing:
> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:

  require non-zero expand size

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2309/files
  - new: https://git.openjdk.java.net/jdk/pull/2309/files/12500d49..d67d5e20

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=00-01

  Stats: 14 lines in 1 file changed: 1 ins; 6 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2309.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2309/head:pull/2309

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Kim Barrett-2
In reply to this post by Kim Barrett-2
On Fri, 29 Jan 2021 12:55:53 GMT, Kim Barrett <[hidden email]> wrote:

>> src/hotspot/share/gc/parallel/psOldGen.cpp line 192:
>>
>>> 190: bool PSOldGen::expand(size_t bytes) {
>>> 191:   if (bytes == 0) {
>>> 192:     return true;
>>
>> I'd prefer if the code would `guarantee` or at least `assert` that `bytes > 0` because returning `true` here seems scary wrt to the loop.
>>
>> All code paths seem to cover this situation already, i.e. with `word_size == 0` this should not be called.
>>
>> But if you think it's not a big issue, we can keep it. This is pre-existing of course.
>
> Good point.  I will make sure a 0 size never gets here and assert/guarantee, or otherwise figure out what to do.

I've changed various quick returns on zero size to instead be asserts, since none of them should ever be called with a zero size.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Thomas Schatzl-4
In reply to this post by Kim Barrett-2
On Fri, 29 Jan 2021 12:53:20 GMT, Kim Barrett <[hidden email]> wrote:

>> src/hotspot/share/gc/parallel/psOldGen.hpp line 141:
>>
>>> 139:     do {
>>> 140:       res = cas_allocate_noexpand(word_size);
>>> 141:       // Retry failed allocation if expand succeeds.
>>
>> "... but allocation did not." would be nice to be added to this comment to be complete.
>
> That's a "failed allocation".

Agreed :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Thomas Schatzl-4
In reply to this post by Kim Barrett-2
On Sat, 30 Jan 2021 06:09:01 GMT, Kim Barrett <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
>   require non-zero expand size

Lgtm. Thanks.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Ivan Walulya-2
In reply to this post by Kim Barrett-2
On Sat, 30 Jan 2021 06:09:01 GMT, Kim Barrett <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
>   require non-zero expand size

Lgtm!

-------------

Marked as reviewed by iwalulya (Committer).

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Stefan Johansson
In reply to this post by Kim Barrett-2
On Sat, 30 Jan 2021 06:09:01 GMT, Kim Barrett <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
>   require non-zero expand size

Looks good!

-------------

Marked as reviewed by sjohanss (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v3]

Kim Barrett-2
In reply to this post by Kim Barrett-2
> Please review this change to ParallelGC to avoid unnecessary full GCs when
> concurrent threads attempt oldgen allocations during evacuation.
>
> When a GC thread fails an oldgen allocation it expands the heap and retries
> the allocation.  If the second allocation attempt fails then allocation
> failure is reported to the caller, which can lead to a full GC.  But the
> retried allocation could fail because, after expansion, some other thread
> allocated enough of the available space that the retry fails.  This can
> happen even though there is plenty of space available, if only that retry
> were to perform another expansion.
>
> Rather than trying to combine the allocation retry with the expansion (it's
> not clear there's a way to do so without breaking invariants), we instead
> simply loop on the allocation attempt + expand, until either the allocation
> succeeds or the expand fails.  If some other thread "steals" space from the
> expanding thread and causes its next allocation attempt to fail and do
> another expansion, that's functionally no different from the expanding
> thread succeeding and causing the other thread to fail allocation and do the
> expand instead.
>
> This change includes modifying PSOldGen::expand_to_reserved to return false
> when there is no space available, where it previously returned true.  It's
> not clear why it returned true; that seems wrong, but was harmless.  But it
> must not do so with the new looping behavior for allocation, else it would
> never terminate.
>
> Testing:
> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:

 - Merge branch 'master' into retry_alloc
 - avoid expand storms
 - Merge branch 'master' into retry_alloc
 - require non-zero expand size
 - retry failed allocation if expand succeeds

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2309/files
  - new: https://git.openjdk.java.net/jdk/pull/2309/files/d67d5e20..72431d39

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=01-02

  Stats: 30850 lines in 869 files changed: 15950 ins; 10988 del; 3912 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2309.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2309/head:pull/2309

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Kim Barrett-2
In reply to this post by Stefan Johansson
On Mon, 1 Feb 2021 09:55:09 GMT, Stefan Johansson <[hidden email]> wrote:

>> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   require non-zero expand size
>
> Looks good!

The problem being addressed here is closely related to the "expand storm"
problem from JDK-8260045. I thought this one could be addressed separately
first, but now think not. Consider if we do an expand with excess here that
uses the remainder of the permitted space. If another thread was blocked
waiting to expand, its expand attempt will fail. With the old code, there
would still be another allocation attempt, but now the failing expand won't
do that.

Reversing the order of fixes doesn't work very well either, as avoiding the
expand storm needs the same sort of infrastructure for retrying a failed
allocation after (optional) expansion.

New commit "avoid expand storms" adds that fix.  It's a little bit kludgy
because of JDK-8261284, adding a function to MutableSpace for use only by
PSOldGen.  It's not the only weird function or behavior in MutableSpace.

Testing:
mach5 tier1-3, tier5 (tiers with common tests run with ParallelGC)

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v3]

Stefan Johansson
In reply to this post by Kim Barrett-2
On Mon, 8 Feb 2021 23:46:01 GMT, Kim Barrett <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
>  - Merge branch 'master' into retry_alloc
>  - avoid expand storms
>  - Merge branch 'master' into retry_alloc
>  - require non-zero expand size
>  - retry failed allocation if expand succeeds

Marked as reviewed by sjohanss (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v3]

Ivan Walulya-2
In reply to this post by Kim Barrett-2
On Mon, 8 Feb 2021 23:46:01 GMT, Kim Barrett <[hidden email]> wrote:

>> Please review this change to ParallelGC to avoid unnecessary full GCs when
>> concurrent threads attempt oldgen allocations during evacuation.
>>
>> When a GC thread fails an oldgen allocation it expands the heap and retries
>> the allocation.  If the second allocation attempt fails then allocation
>> failure is reported to the caller, which can lead to a full GC.  But the
>> retried allocation could fail because, after expansion, some other thread
>> allocated enough of the available space that the retry fails.  This can
>> happen even though there is plenty of space available, if only that retry
>> were to perform another expansion.
>>
>> Rather than trying to combine the allocation retry with the expansion (it's
>> not clear there's a way to do so without breaking invariants), we instead
>> simply loop on the allocation attempt + expand, until either the allocation
>> succeeds or the expand fails.  If some other thread "steals" space from the
>> expanding thread and causes its next allocation attempt to fail and do
>> another expansion, that's functionally no different from the expanding
>> thread succeeding and causing the other thread to fail allocation and do the
>> expand instead.
>>
>> This change includes modifying PSOldGen::expand_to_reserved to return false
>> when there is no space available, where it previously returned true.  It's
>> not clear why it returned true; that seems wrong, but was harmless.  But it
>> must not do so with the new looping behavior for allocation, else it would
>> never terminate.
>>
>> Testing:
>> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
>  - Merge branch 'master' into retry_alloc
>  - avoid expand storms
>  - Merge branch 'master' into retry_alloc
>  - require non-zero expand size
>  - retry failed allocation if expand succeeds

Marked as reviewed by iwalulya (Committer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Kim Barrett-2
In reply to this post by Thomas Schatzl-4
On Sun, 31 Jan 2021 17:10:55 GMT, Thomas Schatzl <[hidden email]> wrote:

>> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   require non-zero expand size
>
> Lgtm. Thanks.

Thanks @tschatzl , @kstefanj , and @walulyai for reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v4]

Kim Barrett-2
In reply to this post by Kim Barrett-2
> Please review this change to ParallelGC to avoid unnecessary full GCs when
> concurrent threads attempt oldgen allocations during evacuation.
>
> When a GC thread fails an oldgen allocation it expands the heap and retries
> the allocation.  If the second allocation attempt fails then allocation
> failure is reported to the caller, which can lead to a full GC.  But the
> retried allocation could fail because, after expansion, some other thread
> allocated enough of the available space that the retry fails.  This can
> happen even though there is plenty of space available, if only that retry
> were to perform another expansion.
>
> Rather than trying to combine the allocation retry with the expansion (it's
> not clear there's a way to do so without breaking invariants), we instead
> simply loop on the allocation attempt + expand, until either the allocation
> succeeds or the expand fails.  If some other thread "steals" space from the
> expanding thread and causes its next allocation attempt to fail and do
> another expansion, that's functionally no different from the expanding
> thread succeeding and causing the other thread to fail allocation and do the
> expand instead.
>
> This change includes modifying PSOldGen::expand_to_reserved to return false
> when there is no space available, where it previously returned true.  It's
> not clear why it returned true; that seems wrong, but was harmless.  But it
> must not do so with the new looping behavior for allocation, else it would
> never terminate.
>
> Testing:
> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:

 - Merge branch 'master' into retry_alloc
 - Merge branch 'master' into retry_alloc
 - avoid expand storms
 - Merge branch 'master' into retry_alloc
 - require non-zero expand size
 - retry failed allocation if expand succeeds

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2309/files
  - new: https://git.openjdk.java.net/jdk/pull/2309/files/72431d39..d463925f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2309&range=02-03

  Stats: 8695 lines in 369 files changed: 4618 ins; 2020 del; 2057 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2309.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2309/head:pull/2309

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Integrated: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc

Kim Barrett-2
In reply to this post by Kim Barrett-2
On Fri, 29 Jan 2021 08:24:13 GMT, Kim Barrett <[hidden email]> wrote:

> Please review this change to ParallelGC to avoid unnecessary full GCs when
> concurrent threads attempt oldgen allocations during evacuation.
>
> When a GC thread fails an oldgen allocation it expands the heap and retries
> the allocation.  If the second allocation attempt fails then allocation
> failure is reported to the caller, which can lead to a full GC.  But the
> retried allocation could fail because, after expansion, some other thread
> allocated enough of the available space that the retry fails.  This can
> happen even though there is plenty of space available, if only that retry
> were to perform another expansion.
>
> Rather than trying to combine the allocation retry with the expansion (it's
> not clear there's a way to do so without breaking invariants), we instead
> simply loop on the allocation attempt + expand, until either the allocation
> succeeds or the expand fails.  If some other thread "steals" space from the
> expanding thread and causes its next allocation attempt to fail and do
> another expansion, that's functionally no different from the expanding
> thread succeeding and causing the other thread to fail allocation and do the
> expand instead.
>
> This change includes modifying PSOldGen::expand_to_reserved to return false
> when there is no space available, where it previously returned true.  It's
> not clear why it returned true; that seems wrong, but was harmless.  But it
> must not do so with the new looping behavior for allocation, else it would
> never terminate.
>
> Testing:
> mach5 tier1-3, tier5 (tier2-3, 5 do a lot of ParallelGC testing)

This pull request has now been integrated.

Changeset: 6a84ec68
Author:    Kim Barrett <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/6a84ec68
Stats:     57 lines in 4 files changed: 33 ins; 6 del; 18 mod

8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc
8260045: Parallel GC: Waiting on ExpandHeap_lock may cause "expansion storm"

Loop to retry allocation if expand succeeds.  Treat space available after obtaining expand lock as expand success.

Reviewed-by: tschatzl, iwalulya, sjohanss

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8260044: Parallel GC: Concurrent allocation after heap expansion may cause unnecessary full gc [v2]

Thomas Schatzl-4
In reply to this post by Kim Barrett-2
On Fri, 12 Feb 2021 03:47:36 GMT, Kim Barrett <[hidden email]> wrote:

>> Lgtm. Thanks.
>
> Thanks @tschatzl , @kstefanj , and @walulyai for reviews.

Still looks good. Thanks. (I am aware this change has already been integrated).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2309