Quantcast

RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

Thomas Schatzl
Hi all,

  I would like you to ask for reviews of a set of related changes that
optimize the way G1 updates/scans/refines cards of the remembered set.

As these CRs are highly dependent on each other I would like to have
them reviewed and pushed together. Another reason is that all of them
are self-contained, which means that they take one or the other detour
I want to avoid questions about.

The basic premise is that refinement and Scan RS (and Update RS as it
is mostly the same) phases are quite slow with G1. There are a lot of
low-hanging fruits to be picked here, ranging from micro-optimizations,
fixing "wrong" code sharing, re-using (slow) CMS code, doing stuff in
non-straightforward ways etc.

These changes try to make it better (tm) ;)

Results are good, around 10%+ faster scan rs time/card, 10%+ faster
update rs/card, refinement also faster (although it is too hard to
measure), 8 LOC less (there is more to delete, but that one needs more
care. Also these changes add some logging :))

Here we go:

8071280: Specialize HeapRegion::oops_on_card_seq_iterate_careful() for
use during concurrent refinement and updating the rset

CR: https://bugs.openjdk.java.net/browse/JDK-8071280
Webrev: http://cr.openjdk.java.net/~tschatzl/8071280/webrev

As the title indicates, this change allows specializations of
HeapRegion::oops_on_card_seq_iterate_careful() according to current GC
phase (concurrent or during gc) to allow to remove some checks.
This is mostly a preparatory change for the next ones.

8162928: Micro-optimizations in scanning the remembered sets

CR: https://bugs.openjdk.java.net/browse/JDK-8162928
Webrev: http://cr.openjdk.java.net/~tschatzl/8162928/webrev/

This change removes the use of HeapRegionDCTOC during scanning
remembered sets, and uses the faster
HeapRegion::oops_on_card_seq_iterate_careful() (measured).
HeapRegionDCTOC also depends CMS code which is just bad.

8177707: Specialize G1RemSet::refine_card for concurrent/during
safepoint refinement

CR: https://bugs.openjdk.java.net/browse/JDK-8177707
Webrev: http://cr.openjdk.java.net/~tschatzl/8177707/webrev/

This change temporarily duplicates the code in G1RemSet::refine_card
for during refinement/gc to allow (initial) specialization of these
methods for their purpose.

8177044: Remove _scan_top from HeapRegion

CR: https://bugs.openjdk.java.net/browse/JDK-8177044
Webrev: http://cr.openjdk.java.net/~tschatzl/8177044/

Removes _scan_top and all its awkward handling by explicitly creating a
snapshot of the top values of the heap regions. This allows significant
specialization of the refine_during_gc() method by subsuming all the
various checks at the beginning into a single one.

This one is responsible for a large part of the improvement of the
update rs phase.

8178148: Log more detailed information about scan rs phase

CR: https://bugs.openjdk.java.net/browse/JDK-8178148
Webrev: http://cr.openjdk.java.net/~tschatzl/8178148/webrev/

Adds some more statistics gathering and logging about what the scan rs
phase is actually doing, allowing better performance analysis. This
information will also be used further in the future.

8175554: Improve G1UpdateRSOrPushRefClosure

CR: https://bugs.openjdk.java.net/browse/JDK-8175554
Webrev: http://cr.openjdk.java.net/~tschatzl/8175554/webrev/

(This mostly deletes code, so it's still "S" ;))

Finally remove the G1ParPushHeapRSClosure and merge it with
G1UpdateOrScanRSClosure. You may now notice that
G1UpdateOrScanRSClosure is pretty similar to G1ParScanClosure - that is
another issue.
Also make sure that G1UpdateOrScanRSClosure properly does the humongous
and "ext" region handling (which is benign so far).

8178151: Clean up G1RemSet files

CR: https://bugs.openjdk.java.net/browse/JDK-8178151
Webrev: http://cr.openjdk.java.net/~tschatzl/8178151/webrev/

Clean up after all these changes a little in the G1Remset* files,
removing obsolete stuff, some renamings, etc. 

Thanks,
  Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

Thomas Schatzl
Hi all,

On Tue, 2017-04-11 at 13:30 +0200, Thomas Schatzl wrote:
> Hi all,



>   I would like you to ask for reviews of a set of related changes
> that optimize the way G1 updates/scans/refines cards of the
> remembered set.
  I forgot to add comments about testing:

- jprt for all changes, stacked in the order given
- rbt common and gc for tier 2-5 for everything
- perf benchmarks for almost all of the phases, and all together
- some additional large heap benchmarks

Thanks,
  Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

Kim Barrett
In reply to this post by Thomas Schatzl
> On Apr 11, 2017, at 7:30 AM, Thomas Schatzl <[hidden email]> wrote:
> 8071280: Specialize HeapRegion::oops_on_card_seq_iterate_careful() for
> use during concurrent refinement and updating the rset
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8071280
> Webrev: http://cr.openjdk.java.net/~tschatzl/8071280/webrev
>
> As the title indicates, this change allows specializations of
> HeapRegion::oops_on_card_seq_iterate_careful() according to current GC
> phase (concurrent or during gc) to allow to remove some checks.
> This is mostly a preparatory change for the next ones.

Only this changeset.  I’ll be sending out comments on others later, separate messages for each.

------------------------------------------------------------------------------
src/share/vm/gc/g1/heapRegion.inline.hpp
 149   assert(ClassUnloadingWithConcurrentMark,

Is this assertion really correct?  There is a similar assertion in
block_size, but it is protected by a call to block_is_obj.  I'm not
seeing similar protection for this one.  Maybe I'm missing someting?

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1RemSet.cpp
 728   }

Close brace is now mis-indented.

------------------------------------------------------------------------------
src/share/vm/gc/g1/heapRegion.hpp
 329   template <class Closure, bool is_gc_active>
 330   inline bool do_oops_on_card_in_humongous(MemRegion mr,

and

 682   template <class Closure, bool is_gc_active>
 683   inline bool oops_on_card_seq_iterate_careful(MemRegion mr, Closure* cl);

It would be better to have the is_gc_active template parameter first.
That would allow the Closure parameter to be deduced in calls, e.g.
instead of

    card_processed = r->oops_on_card_seq_iterate_careful<G1UpdateRSOrPushRefOopClosure, true>(dirty_region, &update_rs_oop_cl);

one would use

    card_processed = r->oops_on_card_seq_iterate_careful<true>(dirty_region, &update_rs_oop_cl);

with Closure being deduced from the type of the argument in the call.

------------------------------------------------------------------------------

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

Kim Barrett
In reply to this post by Thomas Schatzl
> On Apr 11, 2017, at 7:30 AM, Thomas Schatzl <[hidden email]> wrote:
>
> Hi all,
>
>   I would like you to ask for reviews of a set of related changes that
> optimize the way G1 updates/scans/refines cards of the remembered set.
>
> […]
> Here we go:
>
> […]
> 8162928: Micro-optimizations in scanning the remembered sets
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8162928
> Webrev: http://cr.openjdk.java.net/~tschatzl/8162928/webrev/
>
> This change removes the use of HeapRegionDCTOC during scanning
> remembered sets, and uses the faster
> HeapRegion::oops_on_card_seq_iterate_careful() (measured).
> HeapRegionDCTOC also depends CMS code which is just bad.

Looks good.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1RemSet.cpp
 318     r->oops_on_card_seq_iterate_careful<G1ParPushHeapRSClosure, true>(mr, _push_heap_cl);

This will of course need to be updated for my earlier suggested change
to reorder the template parameters.  I don't need a new webrev for that.

------------------------------------------------------------------------------


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (7xS): 8071280, 8162928, 8177707, 8177044, 8178148, 8175554, 8178151: Remembered set update/scan improvements

Kim Barrett
In reply to this post by Thomas Schatzl
> On Apr 11, 2017, at 7:30 AM, Thomas Schatzl <[hidden email]> wrote:
>
> Hi all,
>
>   I would like you to ask for reviews of a set of related changes that
> optimize the way G1 updates/scans/refines cards of the remembered set.
>
> […]
> 8177707: Specialize G1RemSet::refine_card for concurrent/during
> safepoint refinement
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8177707
> Webrev: http://cr.openjdk.java.net/~tschatzl/8177707/webrev/
>
> This change temporarily duplicates the code in G1RemSet::refine_card
> for during refinement/gc to allow (initial) specialization of these
> methods for their purpose.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1RemSet.cpp
 588   // While we are processing RSet buffers during the collection, we
...
 598   assert(!r->in_collection_set(), "Concurrent refinement should not encounter cards in the collection set.");

The comment starting at line 588 seems like it needs updating to
account for line 598 being changed from an "if" to an "assert".

However, I'm not sure this change change to an assert is correct,
because of the possibility of stale cards.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1RemSet.cpp
 636   HeapWord* scan_limit = r->top();

I think the commentary about this initialization is important and
ought not be deleted.

Similarly for the use of scan_top() in the gc_active case.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1RemSet.cpp
 688 bool G1RemSet::refine_card_during_gc(jbyte* card_ptr,

I sure hope a lot of the near duplication with the concurrent version
is going to go away in one of the later changes in this stack.

------------------------------------------------------------------------------  
src/share/vm/gc/g1/g1RemSet.cpp
 777   _conc_refine_cards++;

Since this is not in "concurrent" refinement, is it still correct to
include it in this count?  Maybe the variable name is not right?  If
so, I'd be okay with a variable rename in a later change if that makes
management of this stack of changes easier.

------------------------------------------------------------------------------  
src/share/vm/gc/g1/g1OopClosures.inline.hpp
 114 inline static void check_obj_during_refinement(T* p, oop const obj) {

Just "inline" is sufficient, no "static" needed.

------------------------------------------------------------------------------

Loading...