RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

Thomas Schatzl
Hi all,

  can I get reviews for this change that removes an assert with a wrong
condition?

The problem is that G1 remset management intentionally does not
properly synchronize state when one thread adds a remembered set entry
to a fine PRT and another thread coarsens that one. In that case, the
PRT's bitmap and separate occupancy counter can get inconsistent in
basically all possible benign ways (i.e. one bit too many in the
bitmap, or one bit too few) every time.

This behavior is documented at the top of heapRegionRemSet.cpp, but the
assert tries to verify that the number of iterated fine PRT remembered
set entries corresponds to what the fine PRT thinks it is.

My solution to that problem is to remove that assert (and some
additional dead code that does the same thing). The reason is that
fixing this assert seems not worth the effort:

- the existing assert is only checked if we have only one worker
thread. This seems unusual nowadays, kind of supported by the fact that
this assert has been introduced on day one, and apparently never before
triggered in this case.

- one could add a check to further tighten the assert, i.e. only check
if there were no coarsenings; however the #coarsenings counter is a
global right now, which means that any remembered set coarsening at any
point during execution disables this check completely.

- one could do extensive analysis of the situation and then update the
PRTs to fix it so that next time we don't need to get into this
situation again. However, that would make the assert even slower
(counting occupied cards is very slow), and I am kind of against an
assertion that updates state...

CR:
https://bugs.openjdk.java.net/browse/JDK-8173229
Webrev:
http://cr.openjdk.java.net/~tschatzl/8173229/webrev/
Testing:
Did 1000+ runs of that test with remembered set verification without
issues. Previous runs showed that the situation occurs every 50-100
runs, so it should have been checked a few times already.

Thanks,
  Thomas
Reply | Threaded
Open this post in threaded view
|

Re: RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

Mikael Gerdin
Hi Thomas,

On 2017-01-26 12:25, Thomas Schatzl wrote:

> Hi all,
>
>   can I get reviews for this change that removes an assert with a wrong
> condition?
>
> The problem is that G1 remset management intentionally does not
> properly synchronize state when one thread adds a remembered set entry
> to a fine PRT and another thread coarsens that one. In that case, the
> PRT's bitmap and separate occupancy counter can get inconsistent in
> basically all possible benign ways (i.e. one bit too many in the
> bitmap, or one bit too few) every time.
>
> This behavior is documented at the top of heapRegionRemSet.cpp, but the
> assert tries to verify that the number of iterated fine PRT remembered
> set entries corresponds to what the fine PRT thinks it is.
>
> My solution to that problem is to remove that assert (and some
> additional dead code that does the same thing). The reason is that
> fixing this assert seems not worth the effort:
>
> - the existing assert is only checked if we have only one worker
> thread. This seems unusual nowadays, kind of supported by the fact that
> this assert has been introduced on day one, and apparently never before
> triggered in this case.
>
> - one could add a check to further tighten the assert, i.e. only check
> if there were no coarsenings; however the #coarsenings counter is a
> global right now, which means that any remembered set coarsening at any
> point during execution disables this check completely.
>
> - one could do extensive analysis of the situation and then update the
> PRTs to fix it so that next time we don't need to get into this
> situation again. However, that would make the assert even slower
> (counting occupied cards is very slow), and I am kind of against an
> assertion that updates state...
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8173229
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8173229/webrev/

Looks good to me.
/Mikael

> Testing:
> Did 1000+ runs of that test with remembered set verification without
> issues. Previous runs showed that the situation occurs every 50-100
> runs, so it should have been checked a few times already.
>
> Thanks,
>   Thomas
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

Erik Helin-2
In reply to this post by Thomas Schatzl
On 01/26/2017 12:25 PM, Thomas Schatzl wrote:

> Hi all,
>
>   can I get reviews for this change that removes an assert with a wrong
> condition?
>
> The problem is that G1 remset management intentionally does not
> properly synchronize state when one thread adds a remembered set entry
> to a fine PRT and another thread coarsens that one. In that case, the
> PRT's bitmap and separate occupancy counter can get inconsistent in
> basically all possible benign ways (i.e. one bit too many in the
> bitmap, or one bit too few) every time.
>
> This behavior is documented at the top of heapRegionRemSet.cpp, but the
> assert tries to verify that the number of iterated fine PRT remembered
> set entries corresponds to what the fine PRT thinks it is.
>
> My solution to that problem is to remove that assert (and some
> additional dead code that does the same thing). The reason is that
> fixing this assert seems not worth the effort:
>
> - the existing assert is only checked if we have only one worker
> thread. This seems unusual nowadays, kind of supported by the fact that
> this assert has been introduced on day one, and apparently never before
> triggered in this case.
>
> - one could add a check to further tighten the assert, i.e. only check
> if there were no coarsenings; however the #coarsenings counter is a
> global right now, which means that any remembered set coarsening at any
> point during execution disables this check completely.
>
> - one could do extensive analysis of the situation and then update the
> PRTs to fix it so that next time we don't need to get into this
> situation again. However, that would make the assert even slower
> (counting occupied cards is very slow), and I am kind of against an
> assertion that updates state...
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8173229
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8173229/webrev/

Looks good, Reviewed. Thanks for fixing this!
Erik

> Testing:
> Did 1000+ runs of that test with remembered set verification without
> issues. Previous runs showed that the situation occurs every 50-100
> runs, so it should have been checked a few times already.
>
> Thanks,
>   Thomas
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

Thomas Schatzl
Hi Erik,

On Thu, 2017-01-26 at 14:13 +0100, Erik Helin wrote:
> On 01/26/2017 12:25 PM, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> >   can I get reviews for this change that removes an assert with a
> > wrong
> > condition?
> >
[...]
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8173229
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8173229/webrev/
> Looks good, Reviewed. Thanks for fixing this!

  thanks for your review.

Thomas

Reply | Threaded
Open this post in threaded view
|

Re: RFR (XS): 8173229: Wrong assert whether all remembered set entries have been iterated over in presence of coarsenings

Thomas Schatzl
In reply to this post by Mikael Gerdin
Hi Mikael,

On Thu, 2017-01-26 at 13:47 +0100, Mikael Gerdin wrote:

> Hi Thomas,
>
> On 2017-01-26 12:25, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> >   can I get reviews for this change that removes an assert with a
> > wrong
> > condition?
> >
[...]
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8173229
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8173229/webrev/
> Looks good to me.
> /Mikael

  thanks for your review.

Thomas