[jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

[jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Dong Bo
This is a typo introduced by JDK-8255949.
Compiler will generate `ushr` for shifting right and accumulating four short integers.
It produces wrong results for specific case. The instruction should be `usra`.

-------------

Commit messages:
 - 8260585: AArch64: Wrong code generated by shifting right and accumulating four unsigned short integers

Changes: https://git.openjdk.java.net/jdk16/pull/136/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8260585
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Vladimir Kozlov-2
On Thu, 28 Jan 2021 13:01:07 GMT, Dong Bo <[hidden email]> wrote:

> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Someone familiar with Aarch64 assembler have to review this before it is approved for JDK 16.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Igor Veresov-3
In reply to this post by Dong Bo
On Thu, 28 Jan 2021 13:01:07 GMT, Dong Bo <[hidden email]> wrote:

> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Marked as reviewed by iveresov (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Dean Long-2
In reply to this post by Dong Bo
On Thu, 28 Jan 2021 13:01:07 GMT, Dong Bo <[hidden email]> wrote:

> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Marked as reviewed by dlong (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Dean Long-2
On Fri, 29 Jan 2021 20:08:36 GMT, Dean Long <[hidden email]> wrote:

>> This is a typo introduced by JDK-8255949.
>> Compiler will generate `ushr` for shifting right and accumulating four short integers.
>> It produces wrong results for specific case. The instruction should be `usra`.
>
> Marked as reviewed by dlong (Reviewer).

Why didn't the testing for JDK-8255949 catch this?  Do you need to fix the regression test too?

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Vladimir Kozlov-2
On Fri, 29 Jan 2021 20:24:05 GMT, Dean Long <[hidden email]> wrote:

>> Marked as reviewed by dlong (Reviewer).
>
> Why didn't the testing for JDK-8255949 catch this?  Do you need to fix the regression test too?

Yes, we need regression test for this fix. Or modify existing one to catch it.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v2]

Dong Bo
In reply to this post by Dong Bo
> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Dong Bo has updated the pull request incrementally with two additional commits since the last revision:

 - fix trailing whitespace
 - add tests for shift right and accumulate

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk16/pull/136/files
  - new: https://git.openjdk.java.net/jdk16/pull/136/files/d97281ab..f212c5e6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=00-01

  Stats: 196 lines in 1 file changed: 196 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v2]

Dong Bo
In reply to this post by Vladimir Kozlov-2
On Fri, 29 Jan 2021 20:31:16 GMT, Vladimir Kozlov <[hidden email]> wrote:

>> Why didn't the testing for JDK-8255949 catch this?  Do you need to fix the regression test too?
>
> Yes, we need regression test for this fix. Or modify existing one to catch it.

Did not run local tests for small loops in JDK-8255949.
Updated a test for all shift and accumulating operations which can catch this.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v2]

Andrew Haley
On 1/30/21 5:07 AM, Dong Bo wrote:

> On Fri, 29 Jan 2021 20:31:16 GMT, Vladimir Kozlov <[hidden email]> wrote:
>
>>> Why didn't the testing for JDK-8255949 catch this?  Do you need to fix the regression test too?
>>
>> Yes, we need regression test for this fix. Or modify existing one to catch it.
>
> Did not run local tests for small loops in JDK-8255949.
> Updated a test for all shift and accumulating operations which can catch this.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk16/pull/136

I don't understand. Looking at this:

instruct vsrla4S_imm(vecD dst, vecD src, immI shift) %{
  predicate(n->as_Vector()->length() == 4);
  match(Set dst (AddVS dst (URShiftVS src (RShiftCntV shift))));
  ins_cost(INSN_COST);
  format %{ "usra    $dst, $src, $shift\t# vector (4H)" %}
  ins_encode %{
    int sh = (int)$shift$$constant;
    if (sh >= 16) {
      __ eor(as_FloatRegister($src$$reg), __ T8B,
             as_FloatRegister($src$$reg),
             as_FloatRegister($src$$reg));
    } else {
      __ usra(as_FloatRegister($dst$$reg), __ T4H,
             as_FloatRegister($src$$reg), sh);
    }
  %}
  ins_pipe(vshift64_imm);
%}

What happens when the shift is >= 16? What happens to src and dst?

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]

Dong Bo
In reply to this post by Dong Bo
> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Dong Bo has updated the pull request incrementally with one additional commit since the last revision:

  make empty ins_encode when shift >= 16 (chars)

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk16/pull/136/files
  - new: https://git.openjdk.java.net/jdk16/pull/136/files/f212c5e6..b7ef8fb8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=01-02

  Stats: 20 lines in 1 file changed: 0 ins; 16 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]

Dong Bo
In reply to this post by Dong Bo
On Sat, 30 Jan 2021 05:01:09 GMT, Dong Bo <[hidden email]> wrote:

>> Yes, we need regression test for this fix. Or modify existing one to catch it.
>
> Did not run local tests for small loops in JDK-8255949.
> Updated a test for all shift and accumulating operations which can catch this.

> _Mailing list message from [Andrew Haley](mailto:[hidden email]) on [hotspot-dev](mailto:[hidden email]):_
>
> On 1/30/21 5:07 AM, Dong Bo wrote:
>
> I don't understand. Looking at this:
>
> instruct vsrla4S_imm(vecD dst, vecD src, immI shift) %{
> predicate(n->as_Vector()->length() == 4);
> match(Set dst (AddVS dst (URShiftVS src (RShiftCntV shift))));
> ins_cost(INSN_COST);
> format %{ "usra $dst, $src, $shift\t# vector (4H)" %}
> ins_encode %{
> int sh = (int)$shift$$constant;
> if (sh >= 16) {
> __ eor(as_FloatRegister($src$$reg), __ T8B,
> as_FloatRegister($src$$reg),
> as_FloatRegister($src$$reg));
> } else {
> __ usra(as_FloatRegister($dst$$reg), __ T4H,
> as_FloatRegister($src$$reg), sh);
> }
> %}
> ins_pipe(vshift64_imm);
> %}
>
> What happens when the shift is >= 16? What happens to src and dst?
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

This was wrong, both src and dst should have the same value as before.
Actually, when the shift is `>= 16`, the URShift is optimized to zero by the compiler.
So we don't have a `vsrla4S_imm` match if `shift >= 16`, the wrong `eor` is not generated.
Check the assembly code of the following test:
# test
public void shiftURightAccumulateChar() {
      for (int i = 0; i < count; i++) {
           charsD[i] = (char) (charsA[i] + (charsB[i] >>> 16));
      }
}
# assembly code, the `shift` is gone, only `move` left
1.17%  │   0x0000ffff88075348:   ldr  q16, [x14,#16]
         │   0x0000ffff8807534c:   add  x12, x19, x12
         │   0x0000ffff88075350:   str  q16, [x12,#16]
  1.66%  │   0x0000ffff88075354:   ldr  q16, [x14,#32]
         │   0x0000ffff88075358:   str  q16, [x12,#32]
  2.03%  │   0x0000ffff8807535c:   ldr  q16, [x14,#48]
         │   0x0000ffff88075360:   str  q16, [x12,#48]
  1.39%  │   0x0000ffff88075364:   ldr  q16, [x14,#64]
         │   0x0000ffff88075368:   str  q16, [x12,#64]

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v4]

Dong Bo
In reply to this post by Dong Bo
> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Dong Bo has updated the pull request incrementally with two additional commits since the last revision:

 - fix trailing whitespace
 - add tests for shifting counts

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk16/pull/136/files
  - new: https://git.openjdk.java.net/jdk16/pull/136/files/b7ef8fb8..f2e490a3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=02-03

  Stats: 286 lines in 1 file changed: 245 ins; 6 del; 35 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v5]

Dong Bo
In reply to this post by Dong Bo
> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Dong Bo has updated the pull request incrementally with one additional commit since the last revision:

  update tests for bytes and shorts

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk16/pull/136/files
  - new: https://git.openjdk.java.net/jdk16/pull/136/files/f2e490a3..ca3d2192

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=03-04

  Stats: 74 lines in 1 file changed: 19 ins; 0 del; 55 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v6]

Dong Bo
In reply to this post by Dong Bo
> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

Dong Bo has updated the pull request incrementally with one additional commit since the last revision:

  Update tests to match .2I. Still cannot match ssra for .8B, sshr+add are not combined.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk16/pull/136/files
  - new: https://git.openjdk.java.net/jdk16/pull/136/files/ca3d2192..693f8cbd

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=136&range=04-05

  Stats: 88 lines in 1 file changed: 39 ins; 41 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/136.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/136/head:pull/136

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]

Dong Bo
In reply to this post by Dong Bo
On Wed, 3 Feb 2021 01:59:38 GMT, Dong Bo <[hidden email]> wrote:

>>> Because we only have predicate(n->as_Vector()->length() == 8) in vsraa8B_imm, so they are not matched.
>>> We should fix this with the following code:
>>
>> I think this is an enhancement, and should be done in a separate patch in jdk mainline.
>
>> > Because we only have predicate(n->as_Vector()->length() == 8) in vsraa8B_imm, so they are not matched.
>> > We should fix this with the following code:
>>
>> I think this is an enhancement, and should be done in a separate patch in jdk mainline.
>
> OK, I update a test with loop size 80 for bytes so that `ssra` for 8B can be matched now.

Ping... Can I get a review for the newest changes? Please let me know if we are ready to go.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v7]

Andrew Haley-2
In reply to this post by Dong Bo
On Wed, 3 Feb 2021 02:04:03 GMT, Dong Bo <[hidden email]> wrote:

>> This is a typo introduced by JDK-8255949.
>> Compiler will generate `ushr` for shifting right and accumulating four short integers.
>> It produces wrong results for specific case. The instruction should be `usra`.
>
> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>
>   match ssra with 8B

OK, thanks.

-------------

Marked as reviewed by aph (Reviewer).

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

Re: [jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v7]

Dong Bo
On Wed, 3 Feb 2021 09:07:36 GMT, Andrew Haley <[hidden email]> wrote:

>> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   match ssra with 8B
>
> OK, thanks.

Thank you all for the review.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136
Reply | Threaded
Open this post in threaded view
|

[jdk16] Integrated: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Dong Bo
In reply to this post by Dong Bo
On Thu, 28 Jan 2021 13:01:07 GMT, Dong Bo <[hidden email]> wrote:

> This is a typo introduced by JDK-8255949.
> Compiler will generate `ushr` for shifting right and accumulating four short integers.
> It produces wrong results for specific case. The instruction should be `usra`.

This pull request has now been integrated.

Changeset: 5307afa9
Author:    Dong Bo <[hidden email]>
Committer: Dean Long <[hidden email]>
URL:       https://git.openjdk.java.net/jdk16/commit/5307afa9
Stats:     479 lines in 2 files changed: 458 ins; 16 del; 5 mod

8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers

Reviewed-by: iveresov, dlong, njian, aph

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136