RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
Hi,

In an effort to remove explicit calls to GC barriers (and other
orthogonal forms of barriers, like encoding/decoding oops for compressed
oops and fencing for memory ordering), I have built an API that I call
"Access". Its purpose is to perform accesses with declarative semantics,
handling the multiple orthogonal concerns that affect how an access is
performed, including memory ordering, compressed oops, GC barriers for
marking, reference strength, etc. This makes GCs more modular, which in
turn allows new concurrently compacting GC schemes that use load barriers
to live in harmony in hotspot without everyone going crazy manually
inserting barriers if UseBlahGC is enabled.

CR:
https://bugs.openjdk.java.net/browse/JDK-8189871

Webrev:
http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/

So there are three views of this I suppose:

1) The frontend: how this is actually used in shared code
2) The backends: how anyone writing a GC sticks their required barriers
in there
3) The internals: how accesses find their way from the frontend to the
corresponding backend

== Frontend ==

Let's start with the frontend. I hope I made this fairly simple! You can
find it in runtime/access.hpp.
Each access annotates its declarative semantics with a set of
"decorators", which is the name for the attributes/properties affecting
how an access is performed.
The Access<decorator> API is what makes the declarative semantics
possible.

For example, if I want to perform a load acquire of an oop in the heap
that has "weak" strength, I would do something like:

oop result = Access<MO_ACQUIRE | IN_HEAP | ON_WEAK_OOP_REF>::oop_load_at(obj, offset);

The Access API would then send the access through some GC backend, which
overrides the whole access: it performs a "raw" load acquire and then,
if necessary, keeps the loaded object alive (G1 SATB enqueue barriers).

To make life easier, there are some helpers for the most common access
patterns that merely add some default decorator for the involved type of
access. For example, RawAccess performs AS_RAW accesses (which bypass
runtime checks and GC barriers), HeapAccess sets the IN_HEAP decorator,
and RootAccess sets the IN_ROOT decorator for accessing root oops. So
for the previous call, I could simply do:

oop result = HeapAccess<MO_ACQUIRE | ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
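
A minimal sketch of how such helpers could compose decorators, as a toy model only: the decorator names are taken from the text above, but the bit values, the unsigned int representation, and the has()/set() query functions are assumptions made for illustration and are not the code in the webrev.

```cpp
#include <cassert>

// Toy model: decorators are bits in a set that is fixed at compile time.
// The bit assignments below are made up for this sketch.
typedef unsigned int DecoratorSet;

const DecoratorSet MO_ACQUIRE      = 1u << 0;
const DecoratorSet IN_HEAP         = 1u << 1;
const DecoratorSet IN_ROOT         = 1u << 2;
const DecoratorSet ON_WEAK_OOP_REF = 1u << 3;
const DecoratorSet AS_RAW          = 1u << 4;

template <DecoratorSet decorators>
struct Access {
  // Hypothetical query helpers, only here so the sketch can be inspected.
  static bool has(DecoratorSet d) { return (decorators & d) != 0; }
  static DecoratorSet set() { return decorators; }
};

// Each helper merely ORs one default decorator into the caller's set.
template <DecoratorSet decorators = 0>
struct HeapAccess : Access<decorators | IN_HEAP> {};

template <DecoratorSet decorators = 0>
struct RootAccess : Access<decorators | IN_ROOT> {};

template <DecoratorSet decorators = 0>
struct RawAccess : Access<decorators | AS_RAW> {};
```

In this model, HeapAccess<MO_ACQUIRE | ON_WEAK_OOP_REF> and Access<MO_ACQUIRE | IN_HEAP | ON_WEAK_OOP_REF> describe the same decorator set, which is the equivalence the two calls above rely on.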

The access.hpp file introduces each decorator (belonging to some
category) with an explanation of what it is for. It also introduces all
operations you can perform with Access (loads, stores, cmpxchg, xchg,
arraycopy and clone).

This changeset mostly introduces the Access API; it annotates existing
code only where leaving it unannotated would get very awkward.

== Backend ==

For a GC maintainer, the BarrierSet::AccessBarrier is the top level
backend that provides basic accesses that may be overridden. By default,
it just performs raw accesses, which handle only things like compressed
oops and memory ordering, without any GC barriers. The ModRef barrier
set introduces the notion of pre/post write barriers, which can be
overridden for each GC. The CardTableModRef barrier set overrides the
post write barrier to mark cards, and G1 overrides it to mark cards
slightly differently and do some SATB enqueueing. G1 also overrides
loads to see if we need to perform SATB enqueue on weak references.

The raw accesses go to the RawAccessBarrier (living in
accessBackend.hpp) that performs the actual accesses. It connects to
Atomic and OrderAccess for accesses that require that.
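
The layering described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in the webrev: every Toy* name is hypothetical, a "reference slot" is just an int*, and instead of marking a real card the card-table layer records the written slot in a list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an oop: a pointer to some int "object".
typedef int* ToyOop;

// Bottom layer: a raw store with no GC barriers at all.
struct ToyRawBarrier {
  static void oop_store(ToyOop* addr, ToyOop value) { *addr = value; }
};

// ModRef-style layer: wraps the raw store in pre/post write barrier hooks.
// The hooks are no-ops here; a concrete barrier (Derived) may replace them.
template <typename Derived>
struct ToyModRefBarrier : ToyRawBarrier {
  static void pre_write_barrier(ToyOop*) {}
  static void post_write_barrier(ToyOop*) {}

  static void oop_store(ToyOop* addr, ToyOop value) {
    Derived::pre_write_barrier(addr);
    ToyRawBarrier::oop_store(addr, value);
    Derived::post_write_barrier(addr);
  }
};

// Card-table-style layer: overrides only the post write barrier.
// A real card table would mark the card covering the slot; this toy
// simply remembers the slot address.
struct ToyCardTableBarrier : ToyModRefBarrier<ToyCardTableBarrier> {
  static std::vector<ToyOop*>& dirty() {
    static std::vector<ToyOop*> slots;
    return slots;
  }
  static void post_write_barrier(ToyOop* addr) { dirty().push_back(addr); }
};
```

Calling ToyCardTableBarrier::oop_store(&slot, &obj) performs the raw store and then records &slot as dirty. The hooks are resolved statically through the Derived template parameter, so no virtual call is involved in this sketch.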

== Internals ==

Internally, the accesses go through a number of stages in
access.inline.hpp as documented at the top.

1) Set default decorators and get rid of CV qualifiers etc. Sanity
checking also happens here: we check that the decorators make sense for
the access being performed, and that the passed-in types are not bogus.
2) Reduce types: if the address and the value have different types, then
either the access is not allowed or it implies that compressed oops are
in use. In the latter case we remember what we know about whether
compressed oops are used before erasing the address type.
3) Pre-runtime dispatch: figure out whether all runtime checks can be
bypassed, turning the access into a raw access.
4) Runtime dispatch: send the access through a function pointer that,
upon the first invocation, resolves the intended GC AccessBarrier
accessor on the BarrierSet that handles this access, figures out
whether we are using compressed oops while we are at it, and then calls
the accessor through the post-runtime dispatch.
5) Post-runtime dispatch: fix up erased types that were not known at
compile time, such as whether the address is a narrowOop* or an oop*
(depending on whether compressed oops were selected at runtime), and
call the resolved BarrierSet::AccessBarrier accessor (load/store/etc)
with all the call-site build-time and run-time resolved decorators and
type information that describes the access.
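
The runtime dispatch stage can be illustrated with a small self-patching function pointer. This is only a sketch of the general technique under stated assumptions: the ToyGC enum, the selected_gc flag, and the two accessors are hypothetical, and the toy ignores thread safety and compressed oops entirely.

```cpp
#include <cassert>

// Hypothetical runtime state standing in for "which GC was selected".
enum ToyGC { ToyNoBarrierGC, ToyCountingGC };
static ToyGC selected_gc = ToyNoBarrierGC;
static int barrier_hits = 0;

template <unsigned decorators>
struct ToyLoadDispatch {
  typedef int (*func_t)(const int* addr);

  // Two candidate accessors. The decorators remain available to both as a
  // template parameter even though the call goes through a plain pointer.
  static int plain_load(const int* addr) { return *addr; }
  static int counting_load(const int* addr) { ++barrier_hits; return *addr; }

  // Initial target of the pointer: resolve once, patch the pointer so that
  // later calls skip resolution, then forward this first call.
  static int resolve_and_load(const int* addr) {
    _func = (selected_gc == ToyCountingGC) ? &counting_load : &plain_load;
    return _func(addr);
  }

  static func_t _func;
  static int load(const int* addr) { return _func(addr); }
};

// One function pointer per template instantiation, initially unresolved.
template <unsigned decorators>
typename ToyLoadDispatch<decorators>::func_t
ToyLoadDispatch<decorators>::_func = &ToyLoadDispatch<decorators>::resolve_and_load;
```

After the first load through a given instantiation, its pointer stays bound to the accessor chosen at that moment, so changing selected_gc afterwards has no effect on that instantiation.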

Testing: mach5 tier1-5

Thanks,
/Erik

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Per Liden
Looks good! Awesome work Erik!

(I pre-reviewed this before Erik sent it out, so all my comments have
already been taken care of)

cheers,
Per

On 2017-11-09 18:00, Erik Österlund wrote:

> [...]

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
Hi Per,

Thank you. :)

/Erik

On 2017-11-10 09:25, Per Liden wrote:

> Looks good! Awesome work Erik!
>
> (I pre-reviewed this before Erik sent it out, so all my comments have
> already been taken care of)
>
> cheers,
> Per

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Roman Kennke-6
In reply to this post by Erik Österlund-2
Hi Erik,

This looks very good to me. It is likely that we'll need to extend it a
little bit for Shenandoah, but I haven't got around to trying that out
yet, and will propose it when this patch has percolated down to the
Shenandoah project.

Questions (I know I've asked some of it before in private discussions):
- A BarrierSet needs to declare an AccessBarrier inner class. How does
this get 'registered' with the Access dispatcher?
- I see you use namespace. I haven't seen them anywhere else in Hotspot,
so this looks quite unusual :-) Not that I am against it (I would
probably advocate for using more of it), but have you considered
alternatives that look more common Hotspot-style (e.g. declaring an
all-static AccessInternal class)?
- The dispatching machinery looks a bit over the top, and from the
outskirts like a manual re-invention of virtual method dispatch.
Couldn't we do the same stuff with the usual public interface / concrete
implementation idioms? I am worried that adding just one method to the
interface turns into a nightmare of wiring up stuff and adding tons of
boilerplate to get it going. Not to mention the learning curve involved
trying to make sense of what goes where.

Other than that, I very much like what I see.

Roman


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
Hi Roman,

On 2017-11-10 16:01, Roman Kennke wrote:
> Hi Erik,
>
> This looks very good to me. It is likely that we'll need to extend it
> a little bit for Shenandoah, but I haven't got around to trying that
> out yet, and will propose it when this patch has percolated down to
> the Shenandoah project.

Yes. The framework should be quite flexible, and of course I will work
with you on anything that needs to be updated.

> Questions (I know I've asked some of it before in private discussions):
> - A BarrierSet needs to declare an AccessBarrier inner class. How does
> this get 'registered' with the Access dispatcher?

Good question. Each new GC is tied together at one single point. The
Access API picks up GCs from the gc/shared/barrierSetConfig.hpp and
.inline.hpp files.

So to register a new GC, such as Shenandoah, you have to:

1) Make sure you have a BarrierSet enum value which is added to the list
of FOR_EACH_BARRIER_SET_DO as well as FOR_EACH_CONCRETE_BARRIER_SET_DO
in barrierSetConfig.hpp.

The first of said lists contains all barrier sets that are known to
exist at build time, and the second of said lists crucially contains
concrete barrier sets that have an AccessBarrier to resolve.

2) You also need to make sure in the barrierSetConfig.inline.hpp file
that you #include your shenandoah BarrierSet inline.hpp file.

3) You have to provide a specialization for the BarrierSet::GetName and
BarrierSet::GetType metafunctions that provide an enum value for a
barrier set type, and vice versa.

4) Since you probably want primitive accesses in the heap to also
resolve into the barrier set in a build that includes Shenandoah, you
should #define SUPPORT_BARRIER_ON_PRIMITIVES in the barrierSetConfig.hpp
file when building with Shenandoah. This will switch on the
INTERNAL_BT_BARRIER_ON_PRIMITIVES decorator for each access so that the
Access framework understands that even primitive accesses must be
resolved at run-time in the barrier set. So this is a build-time switch
for turning on run-time resolution of primitive accesses in the heap.

And now you should be set: your new ShenandoahBarrierSet::AccessBarrier
will be called for each access, including primitives.

It works the following way:

1) The barrier resolver loads the current barrier set, and checks the
"name" of it (the enum value).
2) Each "name" for concrete barriers that you listed in
barrierSetConfig.hpp is asked for...
3) ...the BarrierSet::GetType of that enum "name", and...
4) The AccessBarrier of that resulting BarrierSet (your
ShenandoahBarrierSet) will be called.
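
A rough sketch of this registration scheme, as a toy model only: the TOY_* macro, the Toy* types, and the string-returning barrier() functions are all hypothetical and stand in for the real list macro, barrier sets, and AccessBarrier accessors.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical list of concrete barrier sets, in the spirit of
// FOR_EACH_CONCRETE_BARRIER_SET_DO. The entries are made up.
#define TOY_FOR_EACH_CONCRETE_BARRIER_SET_DO(f) \
  f(ToyCardTable)                               \
  f(ToyShenandoah)

// Generate one enum "name" per listed barrier set.
#define TOY_ENUMIFY(bs) bs,
enum ToyName { TOY_FOR_EACH_CONCRETE_BARRIER_SET_DO(TOY_ENUMIFY) ToyUnknown };
#undef TOY_ENUMIFY

// Stand-ins for barrier sets; barrier() plays the role of the AccessBarrier.
struct ToyCardTableSet  { static const char* barrier() { return "card table barrier"; } };
struct ToyShenandoahSet { static const char* barrier() { return "shenandoah barrier"; } };

// Metafunctions mapping enum value -> type and type -> enum value.
template <ToyName name> struct ToyGetType;
template <typename T>   struct ToyGetName;

template <> struct ToyGetType<ToyCardTable>     { typedef ToyCardTableSet type; };
template <> struct ToyGetName<ToyCardTableSet>  { static const ToyName value = ToyCardTable; };
template <> struct ToyGetType<ToyShenandoah>    { typedef ToyShenandoahSet type; };
template <> struct ToyGetName<ToyShenandoahSet> { static const ToyName value = ToyShenandoah; };

// Resolver: compare the current name against each listed name and call the
// barrier of the type that ToyGetType yields for the matching one.
inline const char* toy_resolve(ToyName current) {
  switch (current) {
#define TOY_RESOLVE_CASE(bs) \
    case bs: return ToyGetType<bs>::type::barrier();
    TOY_FOR_EACH_CONCRETE_BARRIER_SET_DO(TOY_RESOLVE_CASE)
#undef TOY_RESOLVE_CASE
    default: return "unresolved";
  }
}
```

In this sketch, adding a barrier set means adding one entry to the list macro and one pair of specializations; the enum and the resolver's switch are regenerated from the list.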

Hope that makes sense.

> - I see you use namespace. I haven't seen them anywhere else in
> Hotspot, so this looks quite unusual :-) Not that I am against it (I
> would probably advocate for using more of it), but have you considered
> alternatives that look more common Hotspot-style (e.g. declaring an
> all-static AccessInternal class)?

We do not have any platform that has problems with namespaces. So I
would prefer this basic namespace usage to a bunch of AllStatic classes.
I hope nobody minds that.

> - The dispatching machinery looks a bit over the top, and from the
> outskirts like a manual re-invention of virtual method dispatch.
> Couldn't we do the same stuff with the usual public interface /
> concrete implementation idioms? I am worried that adding just one
> method to the interface turns into a nightmare of wiring up stuff and
> adding tons of boilerplate to get it going. Not to mention the
> learning curve involved trying to make sense of what goes where.

It's not quite like a normal virtual call though. It's a virtual
dispatch that carries template parameters through the runtime dispatch.
The template parameters are required for 1) remembering the decorators
(semantics of the access), and 2) the type information of the operands
for the access.

In my experience, when you need to drag template parameters through some
form of virtual dispatch, it is easiest to use function pointers and not
virtual calls.
I understand it might look a bit complicated, but I hope that is okay.

> Other than that, I very much like what I see.

I'm glad to hear it!

Thanks,
/Erik

> Roman
>
>> Hi,
>>
>> In an effort to remove explicit calls to GC barriers (and other
>> orthogonal forms of barriers, like encoding/decoding oops for
>> compressed oops and fencing for memory ordering), I have built an API
>> that I call "Access". Its purpose is to perform accesses with
>> declarative semantics, to handle multiple orthogonal concerns that
>> affect how an access is performed, including memory ordering,
>> compressed oops, GC barriers for marking, reference strength, etc,
>> and as a result making GCs more modular, and as a result allow new
>> concurrently compacting GC schemes utilizing load barriers to live in
>> harmony in hotspot without everyone going crazy manually inserting
>> barriers if UseBlahGC is enabled.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8189871
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/
>>
>> So there are three views of this I suppose:
>>
>> 1) The frontend: how this is actually used in shared code
>> 2) The backends: how anyone writing a GC sticks their required
>> barriers in there
>> 3) The internals: how accesses find their way from the frontend to
>> the corresponding backend
>>
>> == Frontend ==
>>
>> Let's start with the frontend. I hope I made this fairly simple! You
>> can find it in runtime/access.hpp
>> Each access annotates its declarative semantics with a set of
>> "decorators", which is the name of the attributes/properties
>> affecting how an access is performed.
>> There is an Access<decorator> API that makes the declarative
>> semantics possible.
>>
>> For example, if I want to perform a load acquire of an oop in the
>> heap that has "weak" strength, I would do something like:
>> oop result = Access<MO_ACQUIRE | IN_HEAP |
>> ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>
>> The Access API would then send the access through some GC backend,
>> that overrides the whole access and tells it to perform a "raw" load
>> acquire, and then possibly keep it alive if necessary (G1 SATB
>> enqueue barriers).
>>
>> To make life easier, there are some helpers for the most common
>> access patterns that merely add some default decorator for the
>> involved type of access. For example, there is a RawAccess for
>> performing AS_RAW accesses (that bypasses runtime checks and GC
>> barriers), HeapAccess sets the IN_HEAP decorator and RootAccess sets
>> the IN_ROOT decorator for accessing root oops. So for the previous
>> call, I could simply do:
>>
>> oop result = HeapAccess<MO_ACQUIRE |
>> ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>
>> The access.hpp file introduces each decorator (belonging to some
>> category) with an explanation what it is for. It also introduces all
>> operations you can make with access (loads, stores, cmpxchg, xchg,
>> arraycopy and clone).
>>
>> This changeset mostly introduces the Access API but is not complete
>> in annotating the code more than where it gets very awkward if I don't.
>>
>> == Backend ==
>>
>> For a GC maintainer, the BarrierSet::AccessBarrier is the top level
>> backend that provides basic accesses that may be overridden. By
>> default, it just performs raw accesses without any GC barriers, that
>> handle things like compressed oops and memory ordering only. The
>> ModRef barrier set introduces the notion of pre/post write barriers,
>> that can be overridden for each GC. The CardTableModRef barrier set
>> overrides the post write barrier to mark cards, and G1 overrides it
>> to mark cards slightly differently and do some SATB enqueueing. G1
>> also overrides loads to see if we need to perform SATB enqueue on
>> weak references.
>>
>> The raw accesses go to the RawAccessBarrier (living in
>> accessBackend.hpp) that performs the actual accesses. It connects to
>> Atomic and OrderAccess for accesses that require that.
>>
>> == Internals ==
>>
>> Internally, the accesses go through a number of stages in
>> access.inline.hpp as documented at the top.
>>
>> 1) set default decorators and get rid of CV qualifiers etc. Sanity
>> checking also happens here: we check that the decorators make sense
>> for the access being performed, and that the passed in types are not
>> bogus.
>> 2) reduce types so if we have a different type of the address and
>> value, then either it is not allowed or it implies we use compressed
>> oops and remember that we know something about whether compressed
>> oops are used or not, before erasing address type
>> 3) pre-runtime dispatch: figure out if all runtime checks can be
>> bypassed into a raw access
>> 4) runtime dispatch: send the access through a function pointer that
>> upon the first invocation resolves the intended GC AccessBarrier
>> accessor on the BarrierSet that handles this access, as well as
>> figures out whether we are using compressed oops or not while we are
>> at it, and then calls it through the post-runtime dispatch
>> 5) post-runtime dispatch: fix some erased types that were not known
>> at compile time such as whether the address is a narrowOop* or oop*
>> depending on whether compressed oops was selected at runtime or not,
>> and call the resolved BarrierSet::AccessBarrier accessor
>> (load/store/etc) with all the call-site build-time and run-time
>> resolved decorators and type information that describes the access.
>>
>> Testing: mach5 tier1-5
>>
>> Thanks,
>> /Erik
>
>

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Aleksey Shipilev-4
On 11/10/2017 05:29 PM, Erik Österlund wrote:
> 1) The barrier resolver loads the current barrier set, and checks the "name" of it (the enum value).
> 2) Each "name" for concrete barriers that you listed in barrierSetConfig.hpp is asked for...
> 3) ...the BarrierSet::GetType of that enum "name", and...
> 4) The AccessBarrier of that resulting BarrierSet (your ShenandoahBarrierSet) will be called.

So, I tried a simpler exercise with Epsilon, and it seems to work:
 http://cr.openjdk.java.net/~shade/epsilon/gc-barriers-declarative.patch

A few comments from that exercise:

 *) After the recent BS cleanup, the EpsilonBarrierSet has only a few leftovers [1]. With current
patch, write_ref_field_work seems to be gone. But write_ref_array_work and write_region_work are
still used. This removal is deliberately not handled in current patch, right?

 *) What is the meaning of AccessBarrier::Raw typedefs like this one?
  typedef BarrierSet::AccessBarrier<decorators, BarrierSetT> Raw;

  I am asking because it is unclear whether a BS should typedef this. Epsilon seems to work fine
without the declaration. G1SATBCardTableLoggingModRefBS::AccessBarrier has it, but
CardTableModRefBS::AccessBarrier does not.

Thanks,
-Aleksey

[1]
http://hg.openjdk.java.net/jdk/sandbox/file/b2b4df384c83/src/hotspot/share/gc/epsilon/epsilonBarrierSet.hpp

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Per Liden
On 11/10/2017 07:39 PM, Aleksey Shipilev wrote:

> On 11/10/2017 05:29 PM, Erik Österlund wrote:
>> 1) The barrier resolver loads the current barrier set, and checks the "name" of it (the enum value).
>> 2) Each "name" for concrete barriers that you listed in barrierSetConfig.hpp is asked for...
>> 3) ...the BarrierSet::GetType of that enum "name", and...
>> 4) The AccessBarrier of that resulting BarrierSet (your ShenandoahBarrierSet) will be called.
>
> So, I tried a simpler exercise with Epsilon, and it seems to work:
>   http://cr.openjdk.java.net/~shade/epsilon/gc-barriers-declarative.patch
>
> A few comments from that exercise:
>
>   *) After the recent BS cleanup, the EpsilonBarrierSet has only a few leftovers [1]. With current
> patch, write_ref_field_work seems to be gone. But write_ref_array_work and write_region_work are
> still used. This removal is deliberately not handled in current patch, right?

Correct, those will be removed in one of Erik's later patches.

>
>   *) What is the meaning of AccessBarrier::Raw typedefs like this one?
>    typedef BarrierSet::AccessBarrier<decorators, BarrierSetT> Raw;

That's a convenience typedef for use when you want to call the super
class to do raw accesses (i.e. access without GC barriers, other than
potentially encode/decode oops), which you'd typically do inside
AccessBarrier functions for a specific BarrierSet, where those calls are
sandwiched in-between the GC specific barrier logic.
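
As a rough illustration of that sandwiching, here is a minimal sketch. The Toy* names, the int-based "oop" slot and the counter are all made up for the example; this is not the code in the webrev, and the real AccessBarrier is a template over decorators.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for illustration only; none of these
// are the HotSpot declarations.
struct ToyBarrierSet {
  // Base accessor: a plain store with no GC-specific work.
  struct AccessBarrier {
    static void oop_store(int* addr, int value) {
      assert(addr != 0 && "store needs a valid slot");
      *addr = value;
    }
  };
};

struct ToyCardMarkingBarrierSet {
  // Counts how often the post-write barrier ran (stands in for card marking).
  static int& post_barrier_count() {
    static int count = 0;
    return count;
  }

  struct AccessBarrier : public ToyBarrierSet::AccessBarrier {
    // Convenience name for the superclass, used to reach the raw access.
    typedef ToyBarrierSet::AccessBarrier Raw;

    static void oop_store(int* addr, int value) {
      // A pre-write barrier would run here.
      Raw::oop_store(addr, value);   // the raw access itself
      ++post_barrier_count();        // post-write barrier
    }
  };
};

// Performs one barriered store into slot.
inline void toy_barriered_store(int* slot, int value) {
  ToyCardMarkingBarrierSet::AccessBarrier::oop_store(slot, value);
}
```

A barrier set with no barriers of its own, as in the Epsilon case, would simply not override oop_store and so would have no use for the Raw name.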

>
>    I am asking because it is unclear whether a BS should typedef this. Epsilon seems to work fine
> without the declaration. G1SATBCardTableLoggingModRefBS::AccessBarrier has it, but
> CardTableModRefBS::AccessBarrier does not.

In Epsilon you don't need it since you have no barriers. All you are
doing is raw accesses, which is the default behavior offered by the
super class.

cheers,
Per

>
> Thanks,
> -Aleksey
>
> [1]
> http://hg.openjdk.java.net/jdk/sandbox/file/b2b4df384c83/src/hotspot/share/gc/epsilon/epsilonBarrierSet.hpp
>

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Roman Kennke-6
In reply to this post by Erik Österlund-2
Hi Erik,

> Hi Roman,
>
> On 2017-11-10 16:01, Roman Kennke wrote:
>> Hi Erik,
>>
>> This looks very good to me. It is likely that we'll need to extend it
>> a little bit for Shenandoah, but I haven't got around to try that out
>> yet, and will propose it when this patch has percolated down to the
>> Shenandoah project.
>
> Yes. The framework should be quite flexible, and of course I will work
> with you on anything that needs to be updated.
Perfect!

>> Questions (I know I've asked some of it before in private discussions):
>> - A BarrierSet needs to declare an AccessBarrier inner class. How
>> does this get 'registered' with the Access dispatcher?
>
> Good question. Each new GC is tied together at one single point. The
> Access API picks up GCs from the gc/shared/barrierSetConfig.hpp and
> .inline.hpp files.
>
> So to register a new GC, such as Shenandoah, you have to:
>
> 1) Make sure you have a BarrierSet enum value which is added to the
> list of FOR_EACH_BARRIER_SET_DO as well as
> FOR_EACH_CONCRETE_BARRIER_SET_DO in barrierSetConfig.hpp.
>
> The first of said lists contains all barrier sets that are known to
> exist at build time, and the second of said lists crucially contains
> concrete barrier sets that have an AccessBarrier to resolve.
>
> 2) You also need to make sure in the barrierSetConfig.inline.hpp file
> that you #include your shenandoah BarrierSet inline.hpp file.
>
> 3) You have to provide a specialization for the BarrierSet::GetName
> and BarrierSet::GetType metafunctions that provide an enum value for a
> barrier set type, and vice versa.
>
> 4) Since you probably want primitive accesses in the heap to also
> resolve into the barrier set in a build that includes Shenandoah, you
> should #define SUPPORT_BARRIER_ON_PRIMITIVES in the
> barrierSetConfig.hpp file when building with Shenandoah. This will
> flick on the INTERNAL_BT_BARRIER_ON_PRIMITIVES decorator to each
> access so that the Access framework understands that even primitive
> accesses must be resolved at run-time in the barrier set. So this is a
> build-time switch for turning on run-time resolution of primitive
> accesses in the heap.
>
> And now you should be set: your new
> ShenandoahBarrierSet::AccessBarrier will be called for each access,
> including primitives.
>
> It works the following way:
>
> 1) The barrier resolver loads the current barrier set, and checks the
> "name" of it (the enum value).
> 2) Each "name" for concrete barriers that you listed in
> barrierSetConfig.hpp is asked for...
> 3) ...the BarrierSet::GetType of that enum "name", and...
> 4) The AccessBarrier of that resulting BarrierSet (your
> ShenandoahBarrierSet) will be called.
>
> Hope that makes sense.
>

It is ok. But just so. ;-)
The problem is that it is a bit mystical how stuff is set up, and I'd
prefer a more explicit way to do it.
Maybe add the above explanations (how to make new GC barriers) in a
comment somewhere?

>> - I see you use namespace. I haven't seen them anywhere else in
>> Hotspot, so this looks quite unusual :-) Not that I am against it (I
>> would probably advocate for using more of it), but have you
>> considered alternatives that look more common Hotspot-style (e.g.
>> declaring an all-static AccessInternal class)?
>
> We do not have any platform that has problems with namespaces. So I
> would prefer this basic namespace usage to a bunch of AllStatic
> classes. I hope nobody minds that.
>
I am ok with it.

>> - The dispatching machinery looks a bit over the top, and from the
>> outskirts like a manual re-invention of virtual method dispatch.
>> Couldn't we do the same stuff with the usual public interface /
>> concrete implementation idioms? I am worried that adding just one
>> method to the interface turns into a nightmare of wiring up stuff and
>> adding tons of boilerplate to get it going. Not to mention the
>> learning curve involved trying to make sense of what goes where.
>
> It's not quite like a normal virtual call though. It's a virtual
> dispatch that carries template parameters through the runtime
> dispatch. The template parameters are required for 1) remembering the
> decorators (semantics of the access), and 2) the type information of
> the operands for the access.
>
> In my experience, when you need to drag template parameters through
> some form of virtual dispatch, it is easiest to use function pointers
> and not virtual calls.
> I understand it might look a bit complicated, but I hope that is okay.
Right. Templates and virtual don't play well (or at all) together.
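
To make the function-pointer idea concrete, here is a minimal sketch of a dispatch slot that resolves itself on first use while keeping the decorators and operand type as template parameters. All Toy* names and the toy_use_barriers flag are made up for the example, and the sketch ignores concurrency; it is not the code in access.inline.hpp.

```cpp
#include <stdint.h>

// Hypothetical stand-ins; not the HotSpot declarations.
typedef uint64_t DecoratorSet;
const DecoratorSet TOY_MO_ACQUIRE = 1;

// Stands in for whatever run-time state a real resolver would consult.
static bool toy_use_barriers = true;

template <DecoratorSet decorators, typename T>
struct ToyRawLoad {
  static T load(const T* addr) { return *addr; }
};

template <DecoratorSet decorators, typename T>
struct ToyBarrieredLoad {
  static T load(const T* addr) {
    // GC-specific work would surround the load here.
    return *addr;
  }
};

template <DecoratorSet decorators, typename T>
struct ToyRuntimeDispatch {
  typedef T (*load_func_t)(const T*);

  // One slot per <decorators, T> instantiation.
  static load_func_t _load_func;

  // Initial target: pick the accessor once, patch the slot, then forward.
  static T load_init(const T* addr) {
    _load_func = toy_use_barriers ? &ToyBarrieredLoad<decorators, T>::load
                                  : &ToyRawLoad<decorators, T>::load;
    return _load_func(addr);
  }

  // Every call goes through the slot; after the first call it points
  // straight at the resolved accessor.
  static T load(const T* addr) { return _load_func(addr); }
};

template <DecoratorSet decorators, typename T>
typename ToyRuntimeDispatch<decorators, T>::load_func_t
ToyRuntimeDispatch<decorators, T>::_load_func =
    &ToyRuntimeDispatch<decorators, T>::load_init;
```

A virtual member function cannot itself be a template, which is why the slot here is a plain static function pointer living in a class template.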

I've seen that the GC interface JEP just went to 'targeted' so I guess
you can push it (probably needs more review?)

Roman

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

coleen.phillimore
In reply to this post by Erik Österlund-2

http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/classfile/javaClasses.cpp.udiff.html

+ assert(!is_reference ||
InstanceKlass::cast(obj->klass())->is_subclass_of(SystemDictionary::Reference_klass()),
"sanity");


Can you do something like this instead of all the InstanceKlass::cast calls?

+ InstanceKlass* k = InstanceKlass::cast(obj->klass());
+ bool is_reference = k->reference_type() != REF_NONE;
+ assert(!is_reference ||
k->is_subclass_of(SystemDictionary::Reference_klass()), "sanity");
+ return is_reference;
+}

And do you know that this is an instance rather than an array instance?
InstanceKlass::cast() has an assert that is_instance_klass().


http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/oops/klass.cpp.udiff.html

Revert file with only one line removed.


http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/runtime/access.hpp.html

  template <typename T>
  struct OopOrNarrowOopInternal: AllStatic {
    typedef oop type;
  };

  template <>
  struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
    typedef narrowOop type;
  };

Kim and I agree that the primary template should not default to oop.
Declare it without a definition and provide two specializations instead.
Mostly I agree because the default is confusing.

  template <typename T>
  struct OopOrNarrowOopInternal;

  template <>
  struct OopOrNarrowOopInternal<oop>: AllStatic {
    typedef oop type;
  };

  template <>
  struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
    typedef narrowOop type;
  };


We were also trying to figure out how the runtime would know whether to
use the IN_CONCURRENT_ROOT or the IN_ROOT decorator, since that probably
varies between GCs. It also requires runtime code to know whether the root
is scanned concurrently or not, which we don't know.

This is all I have for now but I'm going to download the patch and have
more of a look tomorrow.

Thanks,
Coleen



Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
In reply to this post by Roman Kennke-6
Hi Roman,

Thanks for having a look at this.

> On 13 Nov 2017, at 18:10, Roman Kennke <[hidden email]> wrote:
>
> Hi Erik,
>
>> Hi Roman,
>>
>>> On 2017-11-10 16:01, Roman Kennke wrote:
>>> Hi Erik,
>>>
>>> This looks very good to me. It is likely that we'll need to extend it a little bit for Shenandoah, but I haven't got around to try that out yet, and will propose it when this patch has percolated down to the Shenandoah project.
>>
>> Yes. The framework should be quite flexible, and of course I will work with you on anything that needs to be updated.
> Perfect!
>
>>> Questions (I know I've asked some of it before in private discussions):
>>> - A BarrierSet needs to declare an AccessBarrier inner class. How does this get 'registered' with the Access dispatcher?
>>
>> Good question. Each new GC is tied together at one single point. The Access API picks up GCs from the gc/shared/barrierSetConfig.hpp and .inline.hpp files.
>>
>> So to register a new GC, such as Shenandoah, you have to:
>>
>> 1) Make sure you have a BarrierSet enum value which is added to the list of FOR_EACH_BARRIER_SET_DO as well as FOR_EACH_CONCRETE_BARRIER_SET_DO in barrierSetConfig.hpp.
>>
>> The first of said lists contains all barrier sets that are known to exist at build time, and the second of said lists crucially contains concrete barrier sets that have an AccessBarrier to resolve.
>>
>> 2) You also need to make sure in the barrierSetConfig.inline.hpp file that you #include your shenandoah BarrierSet inline.hpp file.
>>
>> 3) You have to provide a specialization for the BarrierSet::GetName and BarrierSet::GetType metafunctions that provide an enum value for a barrier set type, and vice versa.
>>
>> 4) Since you probably want primitive accesses in the heap to also resolve into the barrier set in a build that includes Shenandoah, you should #define SUPPORT_BARRIER_ON_PRIMITIVES in the barrierSetConfig.hpp file when building with Shenandoah. This will flick on the INTERNAL_BT_BARRIER_ON_PRIMITIVES decorator to each access so that the Access framework understands that even primitive accesses must be resolved at run-time in the barrier set. So this is a build-time switch for turning on run-time resolution of primitive accesses in the heap.
>>
>> And now you should be set: your new ShenandoahBarrierSet::AccessBarrier will be called for each access, including primitives.
>>
>> It works the following way:
>>
>> 1) The barrier resolver loads the current barrier set, and checks the "name" of it (the enum value).
>> 2) Each "name" for concrete barriers that you listed in barrierSetConfig.hpp is asked for...
>> 3) ...the BarrierSet::GetType of that enum "name", and...
>> 4) The AccessBarrier of that resulting BarrierSet (your ShenandoahBarrierSet) will be called.
>>
>> Hope that makes sense.
>>
>
> It is ok. But just so. ;-)
> The problem is that it is a bit mystical how stuff is set up, and I'd prefer a more explicit way to do it.
> Maybe add the above explanations (how to make new GC barriers) in a comment somewhere?

Okay, will do!

>
>>> - I see you use namespace. I haven't seen them anywhere else in Hotspot, so this looks quite unusual :-) Not that I am against it (I would probably advocate for using more of it), but have you considered alternatives that look more common Hotspot-style (e.g. declaring an all-static AccessInternal class)?
>>
>> We do not have any platform that has problems with namespaces. So I would prefer this basic namespace usage to a bunch of AllStatic classes. I hope nobody minds that.
>>
> I am ok with it.

Good, thank you.

>
>>> - The dispatching machinery looks a bit over the top, and from the outskirts like a manual re-invention of virtual method dispatch. Couldn't we do the same stuff with the usual public interface / concrete implementation idioms? I am worried that adding just one method to the interface turns into a nightmare of wiring up stuff and adding tons of boilerplate to get it going. Not to mention the learning curve involved trying to make sense of what goes where.
>>
>> It's not quite like a normal virtual call though. It's a virtual dispatch that carries template parameters through the runtime dispatch. The template parameters are required for 1) remembering the decorators (semantics of the access), and 2) the type information of the operands for the access.
>>
>> In my experience, when you need to drag template parameters through some form of virtual dispatch, it is easiest to use function pointers and not virtual calls.
>> I understand it might look a bit complicated, but I hope that is okay.
> Right. Templates and virtual don't play well (or at all) together.

Indeed.

> I've seen that the GC interface JEP just went to 'targeted' so I guess you can push it (probably needs more review?)

Yes, soon indeed.

Thanks,
/Erik

> Roman

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

coleen.phillimore
In reply to this post by Erik Österlund-2

Hi,  Meta-comment: I think the access API should be in oops rather than
the runtime directory.   This API is for accessing objects in the heap
from other objects in the heap or in runtime code.   So that seems like
it belongs in the oops directory to me (even though metadata is there
for historical reasons).   The memory directory would be my second
choice, and runtime third.

I've been reading through much of the code with help and explanation of
what these templates do from Kim.  They are pretty tricky but I can see
why they are there and what they do.   I'm going to suggest some
comments in places when I've gotten through more of this.

Thanks,
Coleen


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

David Holmes
In reply to this post by Erik Österlund-2
Hi Erik,

I really like the level of abstraction and encapsulation this provides.

Can't comment on the GC specific details or the template mechanics
directly, of course. :)

A couple of comments:

src/hotspot/share/oops/klass.hpp

  // Is an oop/narrowOop null or subtype of this Klass?
  template <typename T>
  bool is_covariant(T element);

I find "is_covariant" a very obscure way to name this. It may be
academically accurate but it's really just asking if the element is of a
type that is a subclass of the current klass. The null handling
complicates it, but it seems to me that:

template <typename T>
bool Klass::is_instanceof_or_null(T element);

would be more consistent with how we normally refer to things in the VM
(though the _or_null can be dropped from the name).
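
To spell out the check being named, here is a minimal sketch of "null, or an instance of a subtype of this klass". ToyKlass and ToyObject are made-up stand-ins, and the free function below is an assumption for illustration; the method under review is a Klass member template over oop/narrowOop.

```cpp
#include <cstddef>

// Hypothetical toy types; not the HotSpot Klass or oop declarations.
struct ToyKlass {
  const ToyKlass* super;   // NULL at the root of the hierarchy

  // True if this klass is k or inherits from k.
  bool is_subtype_of(const ToyKlass* k) const {
    for (const ToyKlass* c = this; c != NULL; c = c->super) {
      if (c == k) return true;
    }
    return false;
  }
};

struct ToyObject {
  const ToyKlass* klass;
};

// A null element passes; otherwise its klass must be a subtype of bound.
inline bool is_instanceof_or_null(const ToyKlass* bound, const ToyObject* element) {
  return element == NULL || element->klass->is_subtype_of(bound);
}
```

This is the same test an array store performs against the array's element klass, which is presumably where the "covariant" wording comes from.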

---

src/hotspot/share/oops/objArrayOop.cpp

Klass* objArrayOopDesc::covariant_bound()

There's that word again. :) If you really think you need to use
covariance within these APIs, you need to add some comments to
the method declarations to explain them. Most of us probably have a
minimal recollection of covariance and contravariance from discussing
type-safety for method parameters and return types. :)

---

src/hotspot/share/prims/unsafe.cpp

The changes from jobjects to oops made me uneasy, but I'm assuming the
places where MemoryAccess and GuardedMemoryAccess are used are
effectively all leaf routines with no chance of hitting anything that
would respond to a safepoint request?

Thanks,
David
-----

On 10/11/2017 3:00 AM, Erik Österlund wrote:

> Hi,
>
> In an effort to remove explicit calls to GC barriers (and other
> orthogonal forms of barriers, like encoding/decoding oops for compressed
> oops and fencing for memory ordering), I have built an API that I call
> "Access". Its purpose is to perform accesses with declarative semantics,
> to handle multiple orthogonal concerns that affect how an access is
> performed, including memory ordering, compressed oops, GC barriers for
> marking, reference strength, etc, and as a result making GCs more
> modular, and as a result allow new concurrently compacting GC schemes
> utilizing load barriers to live in harmony in hotspot without everyone
> going crazy manually inserting barriers if UseBlahGC is enabled.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8189871
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/
>
> So there are three views of this I suppose:
>
> 1) The frontend: how this is actually used in shared code
> 2) The backends: how anyone writing a GC sticks their required barriers
> in there
> 3) The internals: how accesses find their way from the frontend to the
> corresponding backend
>
> == Frontend ==
>
> Let's start with the frontend. I hope I made this fairly simple! You can
> find it in runtime/access.hpp
> Each access annotates its declarative semantics with a set of
> "decorators", which is the name of the attributes/properties affecting
> how an access is performed.
> There is an Access<decorator> API that makes the declarative semantics
> possible.
>
> For example, if I want to perform a load acquire of an oop in the heap
> that has "weak" strength, I would do something like:
> oop result = Access<MO_ACQUIRE | IN_HEAP |
> ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>
> The Access API would then send the access through some GC backend, that
> overrides the whole access and tells it to perform a "raw" load acquire,
> and then possibly keep it alive if necessary (G1 SATB enqueue barriers).
>
> To make life easier, there are some helpers for the most common access
> patterns that merely add some default decorator for the involved type of
> access. For example, there is a RawAccess for performing AS_RAW accesses
> (that bypasses runtime checks and GC barriers), HeapAccess sets the
> IN_HEAP decorator and RootAccess sets the IN_ROOT decorator for
> accessing root oops. So for the previous call, I could simply do:
>
> oop result = HeapAccess<MO_ACQUIRE | ON_WEAK_OOP_REF>::oop_load_at(obj,
> offset);
>
> The access.hpp file introduces each decorator (belonging to some
> category) with an explanation what it is for. It also introduces all
> operations you can make with access (loads, stores, cmpxchg, xchg,
> arraycopy and clone).
>
> This changeset mostly introduces the Access API but is not complete in
> annotating the code more than where it gets very awkward if I don't.
>
> == Backend ==
>
> For a GC maintainer, the BarrierSet::AccessBarrier is the top level
> backend that provides basic accesses that may be overridden. By default,
> it just performs raw accesses without any GC barriers, that handle
> things like compressed oops and memory ordering only. The ModRef barrier
> set introduces the notion of pre/post write barriers, that can be
> overridden for each GC. The CardTableModRef barrier set overrides the
> post write barrier to mark cards, and G1 overrides it to mark cards
> slightly differently and do some SATB enqueueing. G1 also overrides
> loads to see if we need to perform SATB enqueue on weak references.
>
> The raw accesses go to the RawAccessBarrier (living in
> accessBackend.hpp) that performs the actual accesses. It connects to
> Atomic and OrderAccess for accesses that require that.
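The layering described above can be sketched roughly as follows. This is only an illustration under assumptions: the class names `ModRefAccessBarrier`, `CardTableLike` and `G1Like`, the `trace` vector, and the use of static dispatch through a template parameter are all invented here, and placing the SATB enqueue in the pre-write hook is a choice made for the sketch, not something the description states.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records what each layer did, so the ordering is observable.
static std::vector<std::string> trace;

// Bottom layer: performs the actual memory access, no GC barriers.
struct RawAccessBarrier {
  static void store(int* addr, int value) {
    *addr = value;
    trace.push_back("raw store");
  }
};

// Introduces pre/post write barrier hooks that a GC may override.
// GC is the most derived barrier type (static polymorphism).
template <typename GC>
struct ModRefAccessBarrier {
  static void store(int* addr, int value) {
    GC::pre_write_barrier(addr);
    RawAccessBarrier::store(addr, value);
    GC::post_write_barrier(addr);
  }
  // Defaults: no barrier work at all.
  static void pre_write_barrier(int*) {}
  static void post_write_barrier(int*) {}
};

// Overrides only the post write barrier, to mark a card.
struct CardTableLike : ModRefAccessBarrier<CardTableLike> {
  static void post_write_barrier(int*) { trace.push_back("mark card"); }
};

// Marks cards in its own way and also does SATB enqueueing.
struct G1Like : ModRefAccessBarrier<G1Like> {
  static void pre_write_barrier(int*)  { trace.push_back("SATB enqueue"); }
  static void post_write_barrier(int*) { trace.push_back("mark card (G1)"); }
};
```

Calling `G1Like::store(&slot, 7)` runs the pre hook, the raw store, and the post hook in that order, while `CardTableLike::store` skips the pre hook because it inherits the empty default.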
>
> == Internals ==
>
> Internally, the accesses go through a number of stages in
> access.inline.hpp as documented at the top.
>
> 1) set default decorators and get rid of CV qualifiers etc. Sanity
> checking also happens here: we check that the decorators make sense for
> the access being performed, and that the passed in types are not bogus.
> 2) reduce types so if we have a different type of the address and value,
> then either it is not allowed or it implies we use compressed oops and
> remember that we know something about whether compressed oops are used
> or not, before erasing address type
> 3) pre-runtime dispatch: figure out if all runtime checks can be
> bypassed into a raw access
> 4) runtime dispatch: send the access through a function pointer that
> upon the first invocation resolves the intended GC AccessBarrier
> accessor on the BarrierSet that handles this access, as well as figures
> out whether we are using compressed oops or not while we are at it, and
> then calls it through the post-runtime dispatch
> 5) post-runtime dispatch: fix some erased types that were not known at
> compile time such as whether the address is a narrowOop* or oop*
> depending on whether compressed oops was selected at runtime or not, and
> call the resolved BarrierSet::AccessBarrier accessor (load/store/etc)
> with all the call-site build-time and run-time resolved decorators and
> type information that describes the access.
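Step 4 above, a function pointer that resolves its target on the first invocation, can be sketched like this. Everything here is hypothetical: `use_load_barrier` stands in for whatever runtime state selects the GC and compressed oops mode, and the function names are invented for the example rather than taken from access.inline.hpp.

```cpp
#include <cassert>

// Signature shared by the resolver and all resolved accessors.
typedef int (*load_func_t)(const int* addr);

// Stand-in for runtime configuration that is unknown at compile time.
static bool use_load_barrier = false;
static int  resolve_count    = 0;

static int raw_load(const int* addr) { return *addr; }

static int barrier_load(const int* addr) {
  // A GC-specific load barrier would run here.
  return *addr;
}

static int load_init(const int* addr);  // resolver, defined below

// The dispatch slot starts out pointing at the resolver.
static load_func_t load_func = &load_init;

// First invocation: pick the accessor, patch the slot, then forward
// the call so the caller still gets its result.
static int load_init(const int* addr) {
  ++resolve_count;
  load_func = use_load_barrier ? &barrier_load : &raw_load;
  return load_func(addr);
}

// What a call site would invoke; after the first call this goes
// straight to the resolved accessor.
inline int load(const int* addr) { return load_func(addr); }
```

The point of the pattern is that the resolution cost is paid once per slot: the second and later calls never enter `load_init` again.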
>
> Testing: mach5 tier1-5
>
> Thanks,
> /Erik

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

coleen.phillimore


On 11/15/17 2:47 AM, David Holmes wrote:

> Hi Erik,
>
> I really like the level of abstraction and encapsulation this provides.
>
> Can't comment on the GC specific details or the template mechanics
> directly, of course. :)
>
> A couple of comments:
>
> src/hotspot/share/oops/klass.hpp
>
> // Is an oop/narrowOop null or subtype of this Klass?
> template <typename T>
> bool is_covariant(T element);
>
> I find "is_covariant" a very obscure way to name this. It may be
> academically accurate but it's really just asking if the element is of
> a type that is a subclass of the current klass. The null handling
> complicates it, but it seems to me that:
>
> template <typename T>
> bool Klass::is_instanceof_or_null(T element);
>
> would be more consistent with how we normally refer to things in the
> VM (though the _or_null can be dropped from the name).
>

Hi,  I have to agree with David on the covariant name, and his suggested
replacement.

This name made me stop in my tracks while reading this.
thanks,
Coleen

> ---
>
> src/hotspot/share/oops/objArrayOop.cpp
>
> Klass* objArrayOopDesc::covariant_bound()
>
> There's that word again. :) If you really think you need to use
> covariance within these API's you really need to add some comments to
> the method declarations to explain them. Most of us probably have a
> minimal recollection of covariance and contravariance from discussing
> type-safety for method parameters and return types. :)
>
> ---
>
> src/hotspot/share/prims/unsafe.cpp
>
> The changes from jobjects to oops made me uneasy, but I'm assuming the
> places where MemoryAccess and GuardedMemoryAccess are used are
> effectively all leaf routines with no chance of hitting anything that
> would respond to a safepoint request?
>
> Thanks,
> David


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
In reply to this post by David Holmes
Hi David,

Thank you for the review.

On 2017-11-15 08:47, David Holmes wrote:
> Hi Erik,
>
> I really like the level of abstraction and encapsulation this provides.

Glad to hear it!

> Can't comment on the GC specific details or the template mechanics
> directly, of course. :)
>
> A couple of comments:
>
> src/hotspot/share/oops/klass.hpp
>
> // Is an oop/narrowOop null or subtype of this Klass?
> template <typename T>
> bool is_covariant(T element);
>
> I find "is_covariant" a very obscure way to name this. It may be
> academically accurate but it's really just asking if the element is of
> a type that is a subclass of the current klass. The null handling
> complicates it, but it seems to me that:
>
> template <typename T>
> bool Klass::is_instanceof_or_null(T element);
>
> would be more consistent with how we normally refer to things in the
> VM (though the _or_null can be dropped from the name).

Hmm, I see your point. I have renamed covariant/contravariant
accordingly to fit better into our current notions.

The ARRAYCOPY_CONTRAVARIANT decorator has been renamed ARRAYCOPY_CHECKCAST.
The is_covariant check has been renamed is_instanceof_or_null as you
proposed.
The covariant_bound() method has been renamed to element_klass().
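
For readers skimming the thread, here is a toy model of what the renamed check asks. It is not the HotSpot code: `Klass` and `Obj` below are stripped-down stand-ins, `is_subtype_of` is a simple walk up a single-inheritance chain, and the check is not a template here (the real one accepts oop or narrowOop).

```cpp
#include <cassert>

// Minimal stand-in for a class metadata object with one superclass.
struct Klass {
  const Klass* super;
  explicit Klass(const Klass* s = nullptr) : super(s) {}

  // True if this klass is k or inherits from k.
  bool is_subtype_of(const Klass* k) const {
    for (const Klass* c = this; c != nullptr; c = c->super) {
      if (c == k) return true;
    }
    return false;
  }

  // Declared before use below; Obj is defined after Klass.
  bool is_instanceof_or_null(const struct Obj* element) const;
};

// Minimal stand-in for a heap object that knows its klass.
struct Obj {
  const Klass* klass;
};

// Null passes; otherwise the element's klass must be this klass or a
// subclass of it.
inline bool Klass::is_instanceof_or_null(const Obj* element) const {
  return element == nullptr || element->klass->is_subtype_of(this);
}
```

With an `Obj` whose klass derives from some base klass, the check succeeds on the base and on the exact klass, fails on an unrelated sibling, and always succeeds for a null element.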

> ---
>
> src/hotspot/share/oops/objArrayOop.cpp
>
> Klass* objArrayOopDesc::covariant_bound()
>
> There's that word again. :) If you really think you need to use
> covariance within these API's you really need to add some comments to
> the method declarations to explain them. Most of us probably have a
> minimal recollection of covariance and contravariance from discussing
> type-safety for method parameters and return types. :)

Fixed as mentioned above.

>
> ---
>
> src/hotspot/share/prims/unsafe.cpp
>
> The changes from jobjects to oops made me uneasy, but I'm assuming the
> places where MemoryAccess and GuardedMemoryAccess are used are
> effectively all leaf routines with no chance of hitting anything that
> would respond to a safepoint request?

Yes, that is correct. There are no thread transitions in those paths.

Here is a new full webrev:
http://cr.openjdk.java.net/~eosterlund/8189871/webrev.01/

Incremental:
http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00_01/

Thanks,
/Erik

> Thanks,
> David


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
In reply to this post by coleen.phillimore
Hi Coleen,

On 2017-11-14 02:34, [hidden email] wrote:

>
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/classfile/javaClasses.cpp.udiff.html 
>
>
> + assert(!is_reference ||
> InstanceKlass::cast(obj->klass())->is_subclass_of(SystemDictionary::Reference_klass()),
> "sanity");
>
>
> Can you do something like this instead of all the InstanceKlass::cast.
>
> + InstanceKlass* k = InstanceKlass::cast(obj->klass());
> + bool is_reference = k->reference_type() != REF_NONE;
> + assert(!is_reference ||
> k->is_subclass_of(SystemDictionary::Reference_klass()), "sanity");
> + return is_reference;
> +}
>

Yes, sure. Fixed in the latest webrev in this thread
(http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00_01/)

> And do you know that this is an instance rather than array instance?  
> InstanceKlass::cast() has an assert that is_instance_klass().

The code previously assumed it is_instance_klass(), and I did not
question its correctness. But now that you mention it, it does seem
safer to explicitly check. So I added an explicit check for that.

>
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/oops/klass.cpp.udiff.html 
>
>
> Revert file with only one line removed.
>

Fixed.

>
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/runtime/access.hpp.html 
>
>
> template <typename T>
> struct OopOrNarrowOopInternal: AllStatic {
>   typedef oop type;
> };
>
> template <>
> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>   typedef narrowOop type;
> };
>
>
> Kim and I agree that we should not have the default template
> definition for oop and have two specializations instead.  Mostly I
> agree because this is confusing.
>
> template <typename T>
> struct OopOrNarrowOopInternal;
>
> template <>
> struct OopOrNarrowOopInternal<oop>: AllStatic {
>   typedef oop type;
> };
>
> template <>
> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>   typedef narrowOop type;
> };
>
>

This choice is actually deliberate. The reason is that we pass in things
that are oop-like but not exactly oop, for example arrayOop or
oopDesc*, and the whole subclass tree of oop. An earlier incarnation of
the Access API used a new IsDerived<X, Y> metafunction to determine
whether a type is-an oop.
However, in order to keep the solution simpler and introduce fewer type
dependencies, I instead use this OopOrNarrowOop thing to pass oop-like
things through it and narrow them down to oop or narrowOop. If a narrowOop
is passed in, then it becomes narrowOop; if you pass in oopDesc*,
arrayOop, oop, instanceOop, or whatever, then it results in an oop, and
tries to implicitly convert whatever was passed in to oop. That will act
as if the overload was "oop" in the way that you can pass in anything
that is implicitly convertible to exactly oop or narrowOop, and once you
make it past that layer, it will be treated only as oop or narrowOop.

Sorry for any resulting confusion.
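
A self-contained sketch of the idea, in case it helps: the primary template maps every type to oop, and only narrowOop is specialized. The `oop`, `oopDesc`, `arrayOop` and `narrowOop` types below are simplified stand-ins written for this example, and `store_kind` and `kind` are hypothetical helpers, not part of the Access API.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the HotSpot types.
struct oopDesc { int payload; };

struct oop {
  oopDesc* ptr;
  oop(oopDesc* p = nullptr) : ptr(p) {}   // implicit: oopDesc* -> oop
};

struct arrayOop : oop {
  arrayOop(oopDesc* p = nullptr) : oop(p) {}
};

struct narrowOop { unsigned value; };

// Primary template: anything oop-like is narrowed down to oop.
template <typename T>
struct OopOrNarrowOop { typedef oop type; };

// The one exception: narrowOop stays narrowOop.
template <>
struct OopOrNarrowOop<narrowOop> { typedef narrowOop type; };

// Past this layer only two types exist.
inline int kind(oop)       { return 1; }
inline int kind(narrowOop) { return 2; }

template <typename T>
int store_kind(T value) {
  typedef typename OopOrNarrowOop<T>::type OopType;
  // Implicit conversion to exactly oop or narrowOop; a type that is
  // not convertible (an int, say) fails to compile right here.
  OopType reduced = value;
  return kind(reduced);
}
```

Passing `oopDesc*`, `oop` or `arrayOop` to `store_kind` ends up in the oop overload, and passing a `narrowOop` ends up in the narrowOop overload. With two explicit specializations and no primary definition, `arrayOop` and `oopDesc*` would not match anything.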

>
> We were also trying to figure out how the runtime would know whether
> to use IN_CONCURRENT_ROOT vs IN_ROOT decorator, since it probably
> varies for GCs.  And requires runtime code to know whether the root is
> scanned concurrently or not, which we don't know.
>

The Access API needs to know whether this is scanned concurrently or not
(because we *have* to perform different barriers then). So whatever data
structure the access is performed on needs to make sure this is known.
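
One way a data structure could make this known, sketched under assumptions: the container type fixes the decorator at compile time, so code using it never has to choose. `RootTable` and its `scanned_concurrently` parameter are hypothetical names for this example, and the bit values are invented.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t DecoratorSet;
constexpr DecoratorSet IN_ROOT            = DecoratorSet(1) << 0;
constexpr DecoratorSet IN_CONCURRENT_ROOT = DecoratorSet(1) << 1;

// Stand-in for the RootAccess helper: always adds IN_ROOT.
template <DecoratorSet D>
struct RootAccess {
  static constexpr DecoratorSet decorators() { return D | IN_ROOT; }
};

// Hypothetical root container. Whoever defines the container knows
// how its roots are scanned and bakes that into the type.
template <bool scanned_concurrently>
struct RootTable {
  static constexpr DecoratorSet extra =
      scanned_concurrently ? IN_CONCURRENT_ROOT : DecoratorSet(0);

  // Every access to this table's slots goes through this type.
  typedef RootAccess<extra> AccessType;
};
```

Code that touches `RootTable<true>` then gets the concurrent-root decorator through `AccessType` without knowing anything about which GC is in use.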

> This is all I have for now but I'm going to download the patch and
> have more of a look tomorrow.

Thank you for the review and thank you for taking the time to look at this.

Thanks,
/Erik

> Thanks,
> Coleen


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
In reply to this post by coleen.phillimore
Hi Coleen,

> On 15 Nov 2017, at 14:00, [hidden email] wrote:
>
>
>
>> On 11/15/17 2:47 AM, David Holmes wrote:
>> Hi Erik,
>>
>> I really like the level of abstraction and encapsulation this provides.
>>
>> Can't comment on the GC specific details or the template mechanics directly, of course. :)
>>
>> A couple of comments:
>>
>> src/hotspot/share/oops/klass.hpp
>>
>> // Is an oop/narrowOop null or subtype of this Klass?
>> template <typename T>
>> bool is_covariant(T element);
>>
>> I find "is_covariant" a very obscure way to name this. It may be academically accurate but it's really just asking if the element is of a type that is a subclass of the current klass. The null handling complicates it, but it seems to me that:
>>
>> template <typename T>
>> bool Klass::is_instanceof_or_null(T element);
>>
>> would be more consistent with how we normally refer to things in the VM (though the _or_null can be dropped from the name).
>>
>
> Hi,  I have to agree with David on the covariant name, and his suggested replacement.
>
> This name made me stop in my tracks while reading this.

Fixed.

Thanks,
/Erik

> thanks,
> Coleen
>> ---
>>
>> src/hotspot/share/oops/objArrayOop.cpp
>>
>> Klass* objArrayOopDesc::covariant_bound()
>>
>> There's that word again. :) If you really think you need to use covariance within these API's you really need to add some comments to the method declarations to explain them. Most of us probably have a minimal recollection of covariance and contravariance from discussing type-safety for method parameters and return types. :)
>>
>> ---
>>
>> src/hotspot/share/prims/unsafe.cpp
>>
>> The changes from jobjects to oops made me uneasy, but I'm assuming the places where MemoryAccess and GuardedMemoryAccess are used are effectively all leaf routines with no chance of hitting anything that would respond to a safepoint request?
>>
>> Thanks,
>> David
>> -----
>>
>>> On 10/11/2017 3:00 AM, Erik Österlund wrote:
>>> Hi,
>>>
>>> In an effort to remove explicit calls to GC barriers (and other orthogonal forms of barriers, like encoding/decoding oops for compressed oops and fencing for memory ordering), I have built an API that I call "Access". Its purpose is to perform accesses with declarative semantics, to handle multiple orthogonal concerns that affect how an access is performed, including memory ordering, compressed oops, GC barriers for marking, reference strength, etc, and as a result making GCs more modular, and as a result allow new concurrently compacting GC schemes utilizing load barriers to live in harmony in hotspot without everyone going crazy manually inserting barriers if UseBlahGC is enabled.
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8189871
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/
>>>
>>> So there are three views of this I suppose:
>>>
>>> 1) The frontend: how this is actually used in shared code
>>> 2) The backends: how anyone writing a GC sticks their required barriers in there
>>> 3) The internals: how accesses find their way from the frontend to the corresponding backend
>>>
>>> == Frontend ==
>>>
>>> Let's start with the frontend. I hope I made this fairly simple! You can find it in runtime/access.hpp
>>> Each access annotates its declarative semantics with a set of "decorators", which is the name of the attributes/properties affecting how an access is performed.
>>> There is an Access<decorator> API that makes the declarative semantics possible.
>>>
>>> For example, if I want to perform a load acquire of an oop in the heap that has "weak" strength, I would do something like:
>>> oop result = Access<MO_ACQUIRE | IN_HEAP | ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>>
>>> The Access API would then send the access through some GC backend, that overrides the whole access and tells it to perform a "raw" load acquire, and then possibly keep it alive if necessary (G1 SATB enqueue barriers).
>>>
>>> To make life easier, there are some helpers for the most common access patterns that merely add some default decorator for the involved type of access. For example, there is a RawAccess for performing AS_RAW accesses (that bypasses runtime checks and GC barriers), HeapAccess sets the IN_HEAP decorator and RootAccess sets the IN_ROOT decorator for accessing root oops. So for the previous call, I could simply do:
>>>
>>> oop result = HeapAccess<MO_ACQUIRE | ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>>
>>> The access.hpp file introduces each decorator (belonging to some category) with an explanation what it is for. It also introduces all operations you can make with access (loads, stores, cmpxchg, xchg, arraycopy and clone).
>>>
>>> This changeset mostly introduces the Access API but is not complete in annotating the code more than where it gets very awkward if I don't.
>>>
>>> == Backend ==
>>>
>>> For a GC maintainer, the BarrierSet::AccessBarrier is the top level backend that provides basic accesses that may be overridden. By default, it just performs raw accesses without any GC barriers, that handle things like compressed oops and memory ordering only. The ModRef barrier set introduces the notion of pre/post write barriers, that can be overridden for each GC. The CardTableModRef barrier set overrides the post write barrier to mark cards, and G1 overrides it to mark cards slightly differently and do some SATB enqueueing. G1 also overrides loads to see if we need to perform SATB enqueue on weak references.
>>>
>>> The raw accesses go to the RawAccessBarrier (living in accessBackend.hpp) that performs the actual accesses. It connects to Atomic and OrderAccess for accesses that require that.
>>>
>>> == Internals ==
>>>
>>> Internally, the accesses go through a number of stages in access.inline.hpp as documented at the top.
>>>
>>> 1) set default decorators and get rid of CV qualifiers etc. Sanity checking also happens here: we check that the decorators make sense for the access being performed, and that the passed in types are not bogus.
>>> 2) reduce types so if we have a different type of the address and value, then either it is not allowed or it implies we use compressed oops and remember that we know something about whether compressed oops are used or not, before erasing address type
>>> 3) pre-runtime dispatch: figure out if all runtime checks can be bypassed into a raw access
>>> 4) runtime dispatch: send the access through a function pointer that upon the first invocation resolves the intended GC AccessBarrier accessor on the BarrierSet that handles this access, as well as figures out whether we are using compressed oops or not while we are at it, and then calls it through the post-runtime dispatch
>>> 5) post-runtime dispatch: fix some erased types that were not known at compile time such as whether the address is a narrowOop* or oop* depending on whether compressed oops was selected at runtime or not, and call the resolved BarrierSet::AccessBarrier accessor (load/store/etc) with all the call-site build-time and run-time resolved decorators and type information that describes the access.
>>>
>>> Testing: mach5 tier1-5
>>>
>>> Thanks,
>>> /Erik
>


Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

David Holmes
In reply to this post by Erik Österlund-2
On 16/11/2017 2:42 AM, Erik Österlund wrote:

> Hi David,
>
> Thank you for the review.
>
> On 2017-11-15 08:47, David Holmes wrote:
>> Hi Erik,
>>
>> I really like the level of abstraction and encapsulation this provides.
>
> Glad to hear it!
>
>> Can't comment on the GC specific details or the template mechanics
>> directly, of course. :)
>>
>> A couple of comments:
>>
>> src/hotspot/share/oops/klass.hpp
>>
>>   // Is an oop/narrowOop null or subtype of this Klass?
>>   template <typename T>
>>   bool is_covariant(T element);
>>
>> I find "is_covariant" a very obscure way to name this. It may be
>> academically accurate but it's really just asking if the element is of
>> a type that is a subclass of the current klass. The null handling
>> complicates it, but it seems to me that:
>>
>> template <typename T>
>> bool Klass::is_instanceof_or_null(T element);
>>
>> would be more consistent with how we normally refer to things in the
>> VM (though the _or_null can be dropped from the name).
>
> Hmm, I see your point. I have renamed covariant/contravariant
> accordingly to fit better into our current notions.
>
> The ARRAYCOPY_CONTRAVARIANT decorator has been renamed ARRAYCOPY_CHECKCAST.
> The is_covariant check has been renamed is_instanceof_or_null as you
> proposed.
> The covariant_bound() method has been renamed to element_klass().

I completely missed contravariant in there :)

These changes look good.

Thanks.

David
-----

>> ---
>>
>> src/hotspot/share/oops/objArrayOop.cpp
>>
>> Klass* objArrayOopDesc::covariant_bound()
>>
>> There's that word again. :) If you really think you need to use
>> covariance within these API's you really need to add some comments to
>> the method declarations to explain them. Most of us probably have a
>> minimal recollection of covariance and contravariance from discussing
>> type-safety for method parameters and return types. :)
>
> Fixed as mentioned above.
>
>>
>> ---
>>
>> src/hotspot/share/prims/unsafe.cpp
>>
>> The changes from jobjects to oops made me uneasy, but I'm assuming the
>> places where MemoryAccess and GuardedMemoryAccess are used are
>> effectively all leaf routines with no chance of hitting anything that
>> would respond to a safepoint request?
>
> Yes, that is correct. There are no thread transitions in those paths.
>
> Here is a new full webrev:
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.01/
>
> Incremental:
> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00_01/
>
> Thanks,
> /Erik
>
>> Thanks,
>> David
>> -----
>>
>> On 10/11/2017 3:00 AM, Erik Österlund wrote:
>>> Hi,
>>>
>>> In an effort to remove explicit calls to GC barriers (and other
>>> orthogonal forms of barriers, like encoding/decoding oops for
>>> compressed oops and fencing for memory ordering), I have built an API
>>> that I call "Access". Its purpose is to perform accesses with
>>> declarative semantics, to handle multiple orthogonal concerns that
>>> affect how an access is performed, including memory ordering,
>>> compressed oops, GC barriers for marking, reference strength, etc,
>>> and as a result making GCs more modular, and as a result allow new
>>> concurrently compacting GC schemes utilizing load barriers to live in
>>> harmony in hotspot without everyone going crazy manually inserting
>>> barriers if UseBlahGC is enabled.
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8189871
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/
>>>
>>> So there are three views of this I suppose:
>>>
>>> 1) The frontend: how this is actually used in shared code
>>> 2) The backends: how anyone writing a GC sticks their required
>>> barriers in there
>>> 3) The internals: how accesses find their way from the frontend to
>>> the corresponding backend
>>>
>>> == Frontend ==
>>>
>>> Let's start with the frontend. I hope I made this fairly simple! You
>>> can find it in runtime/access.hpp
>>> Each access annotates its declarative semantics with a set of
>>> "decorators", which is the name of the attributes/properties
>>> affecting how an access is performed.
>>> There is an Access<decorator> API that makes the declarative
>>> semantics possible.
>>>
>>> For example, if I want to perform a load acquire of an oop in the
>>> heap that has "weak" strength, I would do something like:
>>> oop result = Access<MO_ACQUIRE | IN_HEAP |
>>> ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>>
>>> The Access API would then send the access through some GC backend,
>>> that overrides the whole access and tells it to perform a "raw" load
>>> acquire, and then possibly keep it alive if necessary (G1 SATB
>>> enqueue barriers).
>>>
>>> To make life easier, there are some helpers for the most common
>>> access patterns that merely add some default decorator for the
>>> involved type of access. For example, there is a RawAccess for
>>> performing AS_RAW accesses (that bypasses runtime checks and GC
>>> barriers), HeapAccess sets the IN_HEAP decorator and RootAccess sets
>>> the IN_ROOT decorator for accessing root oops. So for the previous
>>> call, I could simply do:
>>>
>>> oop result = HeapAccess<MO_ACQUIRE |
>>> ON_WEAK_OOP_REF>::oop_load_at(obj, offset);
>>>
>>> The access.hpp file introduces each decorator (belonging to some
>>> category) with an explanation what it is for. It also introduces all
>>> operations you can make with access (loads, stores, cmpxchg, xchg,
>>> arraycopy and clone).
>>>
>>> This changeset mostly introduces the Access API but is not complete
>>> in annotating the code more than where it gets very awkward if I don't.
>>>
>>> == Backend ==
>>>
>>> For a GC maintainer, the BarrierSet::AccessBarrier is the top level
>>> backend that provides basic accesses that may be overridden. By
>>> default, it just performs raw accesses without any GC barriers, that
>>> handle things like compressed oops and memory ordering only. The
>>> ModRef barrier set introduces the notion of pre/post write barriers,
>>> that can be overridden for each GC. The CardTableModRef barrier set
>>> overrides the post write barrier to mark cards, and G1 overrides it
>>> to mark cards slightly differently and do some SATB enqueueing. G1
>>> also overrides loads to see if we need to perform SATB enqueue on
>>> weak references.
>>>
>>> The raw accesses go to the RawAccessBarrier (living in
>>> accessBackend.hpp) that performs the actual accesses. It connects to
>>> Atomic and OrderAccess for accesses that require that.
>>>
>>> == Internals ==
>>>
>>> Internally, the accesses go through a number of stages in
>>> access.inline.hpp as documented at the top.
>>>
>>> 1) set default decorators and get rid of CV qualifiers etc. Sanity
>>> checking also happens here: we check that the decorators make sense
>>> for the access being performed, and that the passed in types are not
>>> bogus.
>>> 2) reduce types so if we have a different type of the address and
>>> value, then either it is not allowed or it implies we use compressed
>>> oops and remember that we know something about whether compressed
>>> oops are used or not, before erasing address type
>>> 3) pre-runtime dispatch: figure out if all runtime checks can be
>>> bypassed into a raw access
>>> 4) runtime dispatch: send the access through a function pointer that
>>> upon the first invocation resolves the intended GC AccessBarrier
>>> accessor on the BarrierSet that handles this access, as well as
>>> figures out whether we are using compressed oops or not while we are
>>> at it, and then calls it through the post-runtime dispatch
>>> 5) post-runtime dispatch: fix some erased types that were not known
>>> at compile time such as whether the address is a narrowOop* or oop*
>>> depending on whether compressed oops was selected at runtime or not,
>>> and call the resolved BarrierSet::AccessBarrier accessor
>>> (load/store/etc) with all the call-site build-time and run-time
>>> resolved decorators and type information that describes the access.
>>>
>>> Testing: mach5 tier1-5
>>>
>>> Thanks,
>>> /Erik
>

Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Erik Österlund-2
Hi David,

Thank you for the review.

/Erik



Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

Kim Barrett
In reply to this post by Erik Österlund-2
> On Nov 15, 2017, at 11:55 AM, Erik Österlund <[hidden email]> wrote:
>> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/runtime/access.hpp.html 
>>
>> template <typename T>
>> struct OopOrNarrowOopInternal: AllStatic {
>>   typedef oop type;
>> };
>>
>> template <>
>> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>>   typedef narrowOop type;
>> };
>>
>> Kim and I agree that we should not have the default template definition for oop and have two specializations instead.  Mostly I agree because this is confusing.
>>
>> template <typename T>
>> struct OopOrNarrowOopInternal;
>>
>> template <>
>> struct OopOrNarrowOopInternal<oop>: AllStatic {
>>   typedef oop type;
>> };
>>
>> template <>
>> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>>   typedef narrowOop type;
>> };
>>
>
> This choice is actually deliberate. The reason is that we pass in things that are oops but are not exactly oop, for example arrayOop or oopDesc* and the whole subclass tree of oop. An earlier incarnation of the Access API used a new IsDerived<X, Y> metafunction to determine whether a type is-an oop.
> However, in order to keep the solution simpler and introduce fewer type dependencies, I instead use this OopOrNarrowOop thing to pass oop-like things through it and narrow them down to oop or narrowOop. If a narrowOop is passed in, then it becomes narrowOop; if you pass in oopDesc*, arrayOop, oop, instanceOop, or whatever, then it results in an oop, and tries to implicitly convert whatever was passed in to oop. That will act as if the overload was "oop" in the way that you can pass in anything that is implicitly convertible to exactly oop or narrowOop, and once you make it past that layer, it will be treated only as oop or narrowOop.
>
> Sorry for any resulting confusion.

Thanks for the explanation.  That wasn’t at all obvious.  It might be useful to add a comment that the default case handles all variants of oop and treats them all as oop.
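
To make the default case concrete, here is a minimal, self-contained sketch of the reduction described above. It is not the webrev code: the stand-in types (oopDesc, arrayOopDesc, a narrowOop enum) and the helper reduced_is_null() are assumptions made purely for illustration, and the trait is named OopOrNarrowOop after the description rather than copied from access.hpp.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in types (assumptions); the real HotSpot definitions are more involved.
struct oopDesc { int payload; };
typedef oopDesc* oop;                     // oop modelled as a plain pointer
struct arrayOopDesc : oopDesc {};
typedef arrayOopDesc* arrayOop;           // converts implicitly to oop
enum class narrowOop : std::uint32_t {};  // hypothetical distinct 32-bit type

// Default case: every type other than narrowOop is reduced to oop.
// This is what lets arrayOop, oopDesc*, etc. pass through the same path.
template <typename T>
struct OopOrNarrowOop { typedef oop type; };

// The only special case: narrowOop stays narrowOop.
template <>
struct OopOrNarrowOop<narrowOop> { typedef narrowOop type; };

inline bool is_null(oop value) { return value == nullptr; }
inline bool is_null(narrowOop value) { return static_cast<std::uint32_t>(value) == 0; }

// Hypothetical entry point: the argument is implicitly converted to the
// reduced type, so a type that is not convertible to oop fails to compile.
template <typename T>
inline bool reduced_is_null(T value) {
  typename OopOrNarrowOop<T>::type reduced = value;
  return is_null(reduced);
}

static_assert(std::is_same<OopOrNarrowOop<arrayOop>::type, oop>::value, "arrayOop reduces to oop");
static_assert(std::is_same<OopOrNarrowOop<oopDesc*>::type, oop>::value, "oopDesc* reduces to oop");
static_assert(std::is_same<OopOrNarrowOop<narrowOop>::type, narrowOop>::value, "narrowOop is kept");

inline void demo() {
  arrayOopDesc array;
  array.payload = 1;
  assert(!reduced_is_null(&array));                    // arrayOop treated as oop
  assert(reduced_is_null(static_cast<narrowOop>(0)));  // narrowOop kept as narrowOop
}
```

With two explicit specializations and no default, the arrayOop call would not compile, which is presumably why the default is there.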

>> We were also trying to figure out how the runtime would know whether to use the IN_CONCURRENT_ROOT or the IN_ROOT decorator, since it probably varies between GCs.  It also requires runtime code to know whether the root is scanned concurrently or not, which we don't know.
>>
>
> The Access API needs to know whether this is scanned concurrently or not (because we *have* to perform different barriers then). So whatever data structure the access is performed on, needs to make sure this is known.

The person writing the access call cannot be expected to accurately
know whether or not the associated location will be scanned
concurrently.  Even ignoring some of those people not being GC
experts, there is the problem that the answer might not be knowable
until runtime, because it may (will) depend on which GC is selected,
and possibly other considerations as well.

Does IN_CONCURRENT_ROOT really mean the location *might* be
concurrently processed, and using that tag is a conservative choice
that won't actually be wrong, but may be suboptimal?  (E.g. it should
really be called IS_POSSIBLY_CONCURRENT_ROOT?)  Or are the choices
between these really exclusive?  I'm guessing the former, but I'm not
certain of that, and the descriptions are no help.  Nor do I know how
bad such suboptimal behavior might be.  (If they really are exclusive,
then I don't see how to use these at all.)

When we discussed this problem previously, we talked about having
specific names associated with categories of off-heap references that
might be handled differently by different collectors.  Some specific
examples that came up in that discussion were JNI global (and weak)
handles, and interned strings.  Even if we really expect all of our
concurrent collectors to eventually process all of these concurrently,
such features might be added on different schedules for different
collectors.

Without such a naming scheme, e.g. with only a generic
IN_CONCURRENT_ROOT, different collectors may pay the cost for being
conservative for different categories.  That might be acceptable if
being conservative is cheap.  But then I would expect the possibly
concurrent case to be pretty much the default, to be used nearly
everywhere, since it is unreasonable to expect non-GC experts to stay
up to date on which roots are never ever scanned concurrently by any
collector.
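
As a rough illustration of the naming scheme suggested above, the sketch below tags each access with a root category and lets the selected collector decide whether that category needs the concurrent-root treatment. Everything in it is hypothetical: the RootCategory names, the RootPolicy struct and the two example policies are invented for this sketch and do not appear in the webrev.

```cpp
#include <cassert>

// Hypothetical root categories, named after the examples in the discussion.
enum RootCategory {
  JNI_GLOBAL_HANDLE,
  JNI_WEAK_HANDLE,
  INTERNED_STRING,
  ROOT_CATEGORY_COUNT
};

// Hypothetical per-collector policy: which categories this collector
// scans concurrently, and therefore which accesses need a barrier.
struct RootPolicy {
  bool scanned_concurrently[ROOT_CATEGORY_COUNT];

  bool needs_barrier(RootCategory category) const {
    return scanned_concurrently[category];
  }
};

// A collector that scans every root at a safepoint.
inline RootPolicy stop_the_world_policy() {
  RootPolicy policy = {{false, false, false}};
  return policy;
}

// A collector that has moved only some categories to concurrent scanning.
inline RootPolicy partially_concurrent_policy() {
  RootPolicy policy = {{true, false, true}};
  return policy;
}

inline void demo() {
  // The call site names only the category; the answer depends on the collector.
  assert(!stop_the_world_policy().needs_barrier(JNI_GLOBAL_HANDLE));
  assert(partially_concurrent_policy().needs_barrier(JNI_GLOBAL_HANDLE));
}
```

The point of the sketch is only that the person writing the access names what the location is, not how any particular collector treats it.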




Re: RFR: 8189871: Refactor GC barriers to use declarative semantics

coleen.phillimore


On 11/16/17 2:43 PM, Kim Barrett wrote:

>> On Nov 15, 2017, at 11:55 AM, Erik Österlund <[hidden email]> wrote:
>>> http://cr.openjdk.java.net/~eosterlund/8189871/webrev.00/src/hotspot/share/runtime/access.hpp.html
>>>
>>> template <typename T>
>>> struct OopOrNarrowOopInternal: AllStatic {
>>>   typedef oop type;
>>> };
>>>
>>> template <>
>>> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>>>   typedef narrowOop type;
>>> };
>>>
>>> Kim and I agree that we should not have the default template definition for oop and have two specializations instead.  Mostly I agree because this is confusing.
>>>
>>> template <typename T>
>>> struct OopOrNarrowOopInternal;
>>>
>>> template <>
>>> struct OopOrNarrowOopInternal<oop>: AllStatic {
>>>   typedef oop type;
>>> };
>>>
>>> template <>
>>> struct OopOrNarrowOopInternal<narrowOop>: AllStatic {
>>>   typedef narrowOop type;
>>> };
>>>
>> This choice is actually deliberate. The reason is that we pass in things that are oops but are not exactly oop, for example arrayOop or oopDesc* and the whole subclass tree of oop. An earlier incarnation of the Access API used a new IsDerived<X, Y> metafunction to determine whether a type is-an oop.
>> However, in order to keep the solution simpler and introduce fewer type dependencies, I instead use this OopOrNarrowOop thing to pass oop-like things through it and narrow them down to oop or narrowOop. If a narrowOop is passed in, then it becomes narrowOop; if you pass in oopDesc*, arrayOop, oop, instanceOop, or whatever, then it results in an oop, and tries to implicitly convert whatever was passed in to oop. That will act as if the overload was "oop" in the way that you can pass in anything that is implicitly convertible to exactly oop or narrowOop, and once you make it past that layer, it will be treated only as oop or narrowOop.
>>
>> Sorry for any resulting confusion.
> Thanks for the explanation.  That wasn’t at all obvious.  It might be useful to add a comment that the default case handles all variants of oop and treats them all as oop.

Please add this comment!

>
>>> We were also trying to figure out how the runtime would know whether to use the IN_CONCURRENT_ROOT or the IN_ROOT decorator, since it probably varies between GCs.  It also requires runtime code to know whether the root is scanned concurrently or not, which we don't know.
>>>
>> The Access API needs to know whether this is scanned concurrently or not (because we *have* to perform different barriers then). So whatever data structure the access is performed on, needs to make sure this is known.
> The person writing the access call cannot be expected to accurately
> know whether or not the associated location will be scanned
> concurrently.  Even ignoring some of those people not being GC
> experts, there is the problem that the answer might not be knowable
> until runtime, because it may (will) depend on which GC is selected,
> and possibly other considerations as well.

So, yeah, this is pretty confusing.  RootAccess assumes that an access
through it needs a barrier because the location is *possibly* scanned
concurrently by the GC.  For the roots that are scanned at a safepoint,
like the thread stacks, can't one assume that they do not need an access
decoration?  I can't imagine that you are going to add the Access API to
the interpreter code for the bytecodes that push objects to the stack,
for example.

So I have this in OopHandle::resolve() which is eventually going to be
scanned concurrently by some GCs but probably not others:

oop OopHandle::resolve() const {
   // Load through the Access API instead of dereferencing _obj directly.
   return _obj == NULL ? (oop)NULL : RootAccess<>::oop_load(_obj);
}

And a WeakHandle, from the VM's perspective, will have an access like:

oop WeakHandle::resolve() const {
   oop obj = RootAccess<ON_PHANTOM_OOP_REF>::oop_load(_obj);
   // does this:  ensure_obj_alive(obj);
   return obj;
}

Nowhere do I know or care whether these are accessed concurrently, or
rather I assume they could be accessed concurrently, so I am not taking
any chances.  I don't think we want a different decorator.
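
One way to read the "not taking any chances" position is that the helper itself folds the conservative decorator into every root access, so call sites never have to name it. The toy sketch below shows only that composition of decorator bits. The bit values, the has() query and the choice to make RootAccess add IN_CONCURRENT_ROOT by default are assumptions for illustration, not what the webrev does.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t DecoratorSet;

// Hypothetical bit assignments; the real values live in access.hpp.
const DecoratorSet IN_HEAP            = 1u << 0;
const DecoratorSet IN_ROOT            = 1u << 1;
const DecoratorSet IN_CONCURRENT_ROOT = 1u << 2;
const DecoratorSet ON_PHANTOM_OOP_REF = 1u << 3;

// Toy stand-in for Access<decorators>: it only records the decorator set.
template <DecoratorSet decorators>
struct Access {
  static constexpr bool has(DecoratorSet decorator) {
    return (decorators & decorator) != 0;
  }
};

// Assumption for this sketch: the root helper conservatively adds
// IN_CONCURRENT_ROOT on top of IN_ROOT and whatever the caller passes.
template <DecoratorSet decorators = 0>
struct RootAccess : Access<IN_ROOT | IN_CONCURRENT_ROOT | decorators> {};

inline void demo() {
  // The caller writes RootAccess<> and still gets the conservative bit.
  assert(RootAccess<>::has(IN_CONCURRENT_ROOT));
  // Caller-supplied decorators are kept alongside the defaults.
  assert(RootAccess<ON_PHANTOM_OOP_REF>::has(ON_PHANTOM_OOP_REF));
}
```

Whether that default is cheap enough for every collector is exactly the open question in this thread.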

Thanks,
Coleen

>
> Does IN_CONCURRENT_ROOT really mean the location *might* be
> concurrently processed, and using that tag is a conservative choice
> that won't actually be wrong, but may be suboptimal?  (E.g. it should
> really be called IS_POSSIBLY_CONCURRENT_ROOT?)  Or are the choices
> between these really exclusive?  I'm guessing the former, but I'm not
> certain of that, and the descriptions are no help.  Nor do I know how
> bad such suboptimal behavior might be.  (If they really are exclusive,
> then I don't see how to use these at all.)
>
> When we discussed this problem previously, we talked about having
> specific names associated with categories of off-heap references that
> might be handled differently by different collectors.  Some specific
> examples that came up in that discussion were JNI global (and weak)
> handles, and interned strings.  Even if we really expect all of our
> concurrent collectors to eventually process all of these concurrently,
> such features might be added on different schedules for different
> collectors.
>
> Without such a naming scheme, e.g. with only a generic
> IN_CONCURRENT_ROOT, different collectors may pay the cost for being
> conservative for different categories.  That might be acceptable if
> being conservative is cheap.  But then I would expect the possibly
> concurrent case to be pretty much the default, to be used nearly
> everywhere, since it is unreasonable to expect non-GC experts to stay
> up to date on which roots are never ever scanned concurrently by any
> collector.
>
>
>
