[9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Roland Westrelin

http://cr.openjdk.java.net/~roland/8176506/webrev.00/

In test1:

1) the call to test_helper is inlined and the checkcast first checks
whether o is null or not with both branches possibly taken.

2) the compiler can't tell whether the unsafe access is off heap or
not. It assumes it's on heap and compiles it as a load from an oop
(address type is a TypeOopPtr).

3) the loop is unswitched using the null check from the checkcast. One
of the loop copies is for the case where o is null. The memory access
for that copy of the loop is a load from a null object and that causes a
crash in C2.
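
For readers without the webrev handy, here is a minimal sketch of the kind of code described in 1)-3). The class, helper, offsets and loop shape are hypothetical, not the actual test from the patch:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    public class UnswitchSketch {
        static final Unsafe UNSAFE;
        static {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                UNSAFE = (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new AssertionError(e);
            }
        }

        // Sometimes returns null, so the checkcast's null check in test1
        // has both branches taken.
        static Object test_helper(int i, long[] arr) {
            return (i % 10 == 0) ? null : arr;
        }

        // offHeap is an address obtained from UNSAFE.allocateMemory(): when
        // the base is null the access is off heap, otherwise it is an
        // on-heap array element access. C2 cannot tell which case it is
        // compiling.
        static long test1(long[] arr, long offHeap) {
            long sum = 0;
            for (int i = 0; i < 1_000; i++) {
                long[] o = (long[]) test_helper(i, arr);   // inlined call + checkcast
                long off = (o == null) ? offHeap
                                       : (long) Unsafe.ARRAY_LONG_BASE_OFFSET;
                sum += UNSAFE.getLong(o, off);             // ambiguous unsafe access
            }
            return sum;
        }
    }

Loop unswitching then uses the o == null check from the checkcast, and in the loop copy where o is known to be null the oop-typed load has a null base.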

The fix I'm proposing is that for every unsafe access for which we can't
tell whether the access is on heap or off heap, we default to using raw
memory instead of assuming it is on heap. This way we stop abusing the
c2 type system by pretending something is an oop when we don't know, and
then crashing when the assumption that it is an oop turns out to be
false. In the patch, I CheckCastPP the Object base to raw memory and
make sure we have one CheckCastPP per unsafe access so an address
possibly computed from an oop is not live at a safepoint.

This has drawbacks: we have an address computation per unsafe access;
if the object becomes a constant or a stable array during optimizations,
we can't optimize the memory access anymore; etc. But we already restrict
optimizations a lot in the case where we do not know whether the access
is off heap or not, by putting membars around the memory access.
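
As a hypothetical illustration of that drawback (reusing the UNSAFE handle from the sketch above, and assuming the array is one C2 would treat as constant, e.g. a stable array in JDK internals):

    // If CONSTANT_ARRAY were known constant/stable to C2, an oop-typed
    // address would let the load below be folded to 42; once the address is
    // compiled as raw memory that folding is no longer possible.
    static final long[] CONSTANT_ARRAY = { 42L };

    static long readConstant() {
        return UNSAFE.getLong(CONSTANT_ARRAY, (long) Unsafe.ARRAY_LONG_BASE_OFFSET);
    }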

There are cases where we actually implicitly know that the base is not
null. test2 is such a case: the offset is small, so the object input can
only be non null. Another example is an object field access, which
doesn't make sense off heap. In those cases, I suggest c2 assume the
base is non null and use a CastPP to cast it to non null.
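
A hypothetical sketch of a test2-like shape (again not the actual test from the webrev; it reuses the UNSAFE handle from the earlier sketch):

    static class Box { long value; }

    static final long VALUE_OFFSET;
    static {
        try {
            VALUE_OFFSET = UNSAFE.objectFieldOffset(Box.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }

    // The offset is a small constant (a field offset), so a null base would
    // mean an access to a low, unmapped address: C2 can assume the base is
    // non null.
    static long test2(Object o) {
        long sum = 0;
        for (int i = 0; i < 1_000; i++) {
            Box b = (Box) o;                        // checkcast + null check in the loop
            sum += UNSAFE.getLong(b, VALUE_OFFSET); // implicitly non-null base
        }
        return sum;
    }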

The problem is that the CastPP alone is not sufficient. In test2, if we
cast the base to non null after loop unswitching we have a cast to non
null with a null input in one branch. That data path dies but that
branch of the if doesn't: that causes c2 to crash. So we actually need a
proper null check that the CastPP depends on but, ideally, we wouldn't
want to pay the price of the check at runtime. The trick I suggest is to
have c2 add a null check with a null branch that goes to a Halt
node. The If node input is a new opaque node with 2 inputs: input 1 is
the null comparison, input 2 is the actual value of the null check (we
know it never fails). After loop optimizations, the opaque node is
replaced by its second input, the If goes away and there's no overhead
at runtime. That new opaque node allows constant folding: if during
optimizations input 1 of the opaque node becomes true or false, the
opaque node is replaced by true or false. In the case of test2, the
final code is the null check from the checkcast with 2 branches: an
impossible branch that ends in a Halt node and the "good" branch where
the unswitched loop is.

Note that there are other intrinsics where we know from the context that
a null object is impossible and so skip the null check. For those
intrinsics, we might need to cast to non null as well, and have a
null check that the cast depends on, for consistency.

Roland.

Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Vladimir Ivanov
Roland,

I find it problematic to decide whether an unsafe access is always
on-heap or not during parsing. Considering we plan to remove membars
around suspicious accesses (as a fix for JDK-8176513), I would like to
avoid conservatively treating them as raw.

Can we just disable loop unswitching when it produces raw accesses? How
risky would it be?

There are cases when C2 already skips some optimizations when the base
is NULL (e.g., LoadNode::split_through_phi or MemNode::Ideal_common).

FTR I faced a similar problem before (JDK-8155635 [1]) and initially
experimented with a different approach: convert null-based oop pointers
to raw pointers [2], but after additional discussions decided to abandon it.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8155635
[2]
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022743.html


Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Vladimir Ivanov
> There are cases when C2 already skips some optimizations when the base
> is NULL (e.g., LoadNode::split_through_phi or MemNode::Ideal_common).

s/is NULL/can be NULL/

Best regards,
Vladimir Ivanov


Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Roland Westrelin
In reply to this post by Vladimir Ivanov

Hi Vladimir,

Thanks for looking at this.

> I find it problematic to decide whether an unsafe access is always
> on-heap or not during parsing. Considering we plan to remove membars
> around suspicious accesses (as a fix for JDK-8176513), I would like to
> avoid conservatively treating them as raw.

Isn't the plan to remove the membars for 9 and put them back in 10?
I think it's a problem that we build a broken graph and try to plug the
holes as we find them, because there will always be holes that we are not
yet aware of. That's why my suggestion is to be conservative. I also
think we should use argument profiling and a null check to
speculatively cast the base to non null when it makes sense. We have the
machinery to do that and it's a very simple change.

> FTR I faced a similar problem before (JDK-8155635 [1]) and initially
> experimented with a different approach: convert null-based oop pointers
> to raw pointers [2], but after additional discussions decided to abandon it.

I noticed that discussion. When we hit a similar problem with Shenandoah,
I used a fix similar to the one you suggested. Vladimir K's answer was:

http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022745.html

"I think we should track which pointers are really RAW when creating
them."

which is what the fix I'm proposing now does.

Roland.

Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Vladimir Ivanov
>> I find it problematic to decide whether an unsafe access is always
>> on-heap or not during parsing. Considering we plan to remove membars
>> around suspicious accesses (as a fix for JDK-8176513), I would like to
>> avoid conservatively treating them as raw.
>
> Isn't the plan to remove the membars for 9 and put them back in 10?

JDK-8176513 removed membars around (possibly) mixed accesses and the fix
we are discussing makes them obsolete.

> I think it's a problem that we build a broken graph and try to plug the
> holes as we find them, because there will always be holes that we are not
> yet aware of. That's why my suggestion is to be conservative. I also
> think we should use argument profiling and a null check to
> speculatively cast the base to non null when it makes sense. We have the
> machinery to do that and it's a very simple change.

I agree with your concerns and fully support using argument profiling
and speculative casts to improve the code w/ unsafe accesses.

>> FTR I faced a similar problem before (JDK-8155635 [1]) and initially
>> experimented with a different approach: convert null-based oop pointers
>> to raw pointers [2], but after additional discussions decided to abandon it.
>
> I noticed that discussion. When we hit a similar problem with Shenandoah,
> I used a fix similar to the one you suggested. Vladimir K's answer was:
>
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022745.html
>
> "I think we should track which pointers are really RAW when creating
> them."
>
> which is what the fix I'm proposing now does.

Ok, after thinking about it more, I agree with your arguments.

My last concern is that by marking possibly mixed accesses as RAW during
parsing, we limit what optimizations can be applied later.

I thought about allowing rawptr => oopptr conversions for mixed accesses
once they become on-heap (+ some cleanups; full patch [1]):

http://cr.openjdk.java.net/~vlivanov/roland/8176506/webrev.01_00/

What do you think about it?

Also, I think the checks in LoadNode::split_through_phi &
MemNode::Ideal_common I mentioned before affect normal accesses as well,
so probably the problem with loop unswitching you encountered isn't
specific to unsafe accesses. But let's keep it separate.

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/roland/8176506/webrev.01/



Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Roland Westrelin

Hi Vladimir,

>> Isn't the plan to remove the membars for 9 and put them back in 10?
>
> JDK-8176513 removed membars around (possibly) mixed accesses and the fix
> we are discussing makes them obsolete.

If we use unsafe to access a heap allocated object but it's impossible
for the compiler to tell it's heap allocated, then we would compile it
as a raw memory access. If we access the same object's field with a
regular access in the same method, then there's a chance that access
would be reordered with the unsafe access. So we still need the membars?
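
For example (hypothetical code, reusing the UNSAFE handle and the Box class from the earlier sketches):

    // The compiler cannot prove that unknownBase is heap allocated, so under
    // the proposed fix the unsafe store would be compiled as a raw-memory
    // access, while the plain field store stays on the oop memory slice. If
    // unknownBase == b the two stores alias, and without membars C2 could
    // reorder them.
    static void mixed(Box b, Object unknownBase, long off) {
        b.value = 1L;                          // regular on-heap field store
        UNSAFE.putLong(unknownBase, off, 2L);  // possibly the same field, compiled as raw
    }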

> My last concern is that by marking possibly mixed accesses as RAW during
> parsing, we limit what optimizations can be applied later.
>
> I thought about allowing rawptr => oopptr conversions for mixed accesses
> once they become on-heap (+ some cleanups; full patch [1]):
>
> http://cr.openjdk.java.net/~vlivanov/roland/8176506/webrev.01_00/
>
> What do you think about it?

Thanks for the cleanup and fixes.
Isn't converting rawptr => oopptr a variation of:

http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022745.html

that Vladimir K didn't like?

Uses of the AddP record the memory slice they operate on:
MemNode::_adr_type, LoadStoreNode::_adr_type. That would need to be
fixed if the AddP type changes, to avoid inconsistencies.

It seems unlikely, but the raw memory type could be captured by a node: a
Phi that merges 2 AddPNodes, for instance. If the AddPNode types
change to oop ptr, then the Phi type wouldn't be oop ptr. I wonder if
that could cause problems.

If we think the compiler would find more non null bases after some
optimizations than it does at parse time, then one solution could be to
delay inlining of the unsafe accesses.

> Also, I think the checks in LoadNode::split_through_phi &
> MemNode::Ideal_common I mentioned before affect normal accesses as well,
> so probably the problem with loop unswitching you encountered isn't
> specific to unsafe accesses. But let's keep it separate.

Can you point to the exact code you're referring to?

Roland.

Re: [9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Vladimir Ivanov

>>> Isn't the plan to remove the membars for 9 and put them back in 10?
>>
>> JDK-8176513 removed membars around (possibly) mixed accesses and the fix
>> we are discussing makes them obsolete.
>
> If we use unsafe to access a heap allocated object but it's impossible
> for the compiler to tell it's heap allocated, then we would compile it
> as a raw memory access. If we access the same object's field with a
> regular access in the same method, then there's a chance that access
> would be reordered with the unsafe access. So we still need the membars?

Yes, my bad.

>> My last concern is that by marking possibly mixed accesses as RAW during
>> parsing, we limit what optimizations can be applied later.
>>
>> I thought about allowing rawptr => oopptr conversions for mixed accesses
>> once they become on-heap (+ some cleanups; full patch [1]):
>>
>> http://cr.openjdk.java.net/~vlivanov/roland/8176506/webrev.01_00/
>>
>> What do you think about it?
>
> Thanks for the cleanup and fixes.
> Isn't converting rawptr => oopptr a variation of:
>
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022745.html
>
> that Vladimir K didn't like?
>
> Uses of the AddP record the memory slice they operate on:
> MemNode::_adr_type, LoadStoreNode::_adr_type. That would need to be
> fixed if the AddP type changes, to avoid inconsistencies.
>
> It seems unlikely, but the raw memory type could be captured by a node: a
> Phi that merges 2 AddPNodes, for instance. If the AddPNode types
> change to oop ptr, then the Phi type wouldn't be oop ptr. I wonder if
> that could cause problems.

Good point. I erroneously assumed that the difference between (rawptr =>
oopptr) and (oopptr => rawptr) conversions is widening vs narrowing
types, but it's not the case since rawptr & oopptr types are disjoint.

So, the proper type for mixed accesses would be TypePtr::BOTTOM, but it
won't make the compiler happy :-)

> If we think the compiler would find more non null bases after some
> optimizations than it does at parse time, then one solution could be to
> delay inlining of the unsafe accesses.

Ok, let's leave mixed accesses aside. If it turns out to be an important
case, then we can revisit it later. Argument profiling should minimize
the chances of an on-heap unsafe access being treated as mixed.

>> Also, I think the checks in LoadNode::split_through_phi &
>> MemNode::Ideal_common I mentioned before affect normal accesses as well,
>> so probably the problem with loop unswitching you encountered isn't
>> specific to unsafe accesses. But let's keep it separate.
>
> Can you point to the exact code you're referring to?

http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/78c27c5148a7/src/share/vm/opto/memnode.cpp#l1312

http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/78c27c5148a7/src/share/vm/opto/memnode.cpp#l331

Best regards,
Vladimir Ivanov