On Mon, 4 Jan 2021 01:58:38 GMT, Eric Liu <[hidden email]> wrote:
>> This patch transforms '(x >>> rshift) + (x << lshift)' into >> 'RotateRight(x, rshift)' during GVN phase when both the shift exponents >> are constants and their sum equals to the number of bits for the type >> of shift base. >> >> This patch implements some new match rules for AArch64 instructions >> which can take ROR as the optional shift. Such instructions are 'and', >> 'or', 'eor', 'eon', 'bic' and 'orn'. >> >> ror w11, w2, #5 >> eor w0, w1, w11 >> >> With this patch, above code could be optimized to below: >> >> eor w0, w1, w2, ror #5 >> >> Finally, the patch refactors TestRotate.java[1][2]. >> >> Tested jtreg TestRotate.java, hotspot::hotspot_all_no_apps, >> jdk::jdk_core, langtools::tier1. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8252776 >> [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039911.html > > Could anyone help to take a look at this PR? Ping ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
On Thu, 28 Jan 2021 01:47:32 GMT, Eric Liu <[hidden email]> wrote:
>> Could anyone help to take a look at this PR? > > Ping What impact does this have on runtime? There are also Tier1 failures in x86, which you'll need to investigate ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
On Thu, 28 Jan 2021 01:58:33 GMT, Azeem Jiva <[hidden email]> wrote:
>> Ping > > What impact does this have on runtime? There are also Tier1 failures in x86, which you'll need to investigate I suppose this failure was not caused by my patch, and it has been fixed by https://github.com/openjdk/jdk/commit/1594372c. ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
In reply to this post by Eric Liu-2
On 1/28/21 1:50 AM, Eric Liu wrote:
> > Ping > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1858 I'm reviewing it now. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 |
On 1/28/21 9:30 AM, Andrew Haley wrote:
> On 1/28/21 1:50 AM, Eric Liu wrote: >> >> Ping >> >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/1858 > > I'm reviewing it now. Looks basically OK, but was tough to review because of the flags and comments cleanup that made the diff hard to read. An oddity. For rotations we have: match(Set dst (AndI src1 (XorI(RotateRight src2 src3) src4))); but for all other shifts we have: match(Set dst (AndL src1 (XorL(RShiftL src2 src3) src4))); That is to say, the other shifts have a type suffix. This doesn't look right. Do you know what is going on? Otherwise, this looks OK. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 |
In reply to this post by Eric Liu-2
On Thu, 28 Jan 2021 02:17:47 GMT, Eric Liu <[hidden email]> wrote:
>> What impact does this have on runtime? There are also Tier1 failures in x86, which you'll need to investigate > > I suppose this failure was not caused by my patch, and it has been fixed by https://github.com/openjdk/jdk/commit/1594372c. Thanks for your kindly review. > An oddity. For rotations we have: > > match(Set dst (AndI src1 (XorI(RotateRight src2 src3) src4))); > > but for all other shifts we have: > > match(Set dst (AndL src1 (XorL(RShiftL src2 src3) src4))); > > That is to say, the other shifts have a type suffix. This doesn't look > right. Do you know what is going on? For rotation node, they have an extra field to identify the type, maybe this could save some code size... And I didn't use 'predicate' to check that because middle-end could guarantee Xor's input type was correct. ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
In reply to this post by Eric Liu-2
On Thu, 28 Jan 2021 02:17:47 GMT, Eric Liu <[hidden email]> wrote:
> I suppose this failure was not caused by my patch, and it has been fixed by https://github.com/openjdk/jdk/commit/1594372c. Benchmark results? ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
In reply to this post by Eric Liu-2
On Mon, 21 Dec 2020 10:10:10 GMT, Eric Liu <[hidden email]> wrote:
> This patch transforms '(x >>> rshift) + (x << lshift)' into > 'RotateRight(x, rshift)' during GVN phase when both the shift exponents > are constants and their sum equals to the number of bits for the type > of shift base. > > This patch implements some new match rules for AArch64 instructions > which can take ROR as the optional shift. Such instructions are 'and', > 'or', 'eor', 'eon', 'bic' and 'orn'. > > ror w11, w2, #5 > eor w0, w1, w11 > > With this patch, above code could be optimized to below: > > eor w0, w1, w2, ror #5 > > Finally, the patch refactors TestRotate.java[1][2]. > > Tested jtreg TestRotate.java, hotspot::hotspot_all_no_apps, > jdk::jdk_core, langtools::tier1. > > [1] https://bugs.openjdk.java.net/browse/JDK-8252776 > [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039911.html All that remains to do is the benchmarks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
In reply to this post by Eric Liu-2
On Thu, 28 Jan 2021 13:29:59 GMT, Eric Liu <[hidden email]> wrote:
> > An oddity. For rotations we have: > > ``` match(Set dst (AndI src1 (XorI(RotateRight src2 src3) src4))); ``` > > but for all other shifts we have: > > ``` match(Set dst (AndL src1 (XorL(RShiftL src2 src3) src4))); ``` > > That is to say, the other shifts have a type suffix. This doesn't look > > right. Do you know what is going on? > > For rotation node, they have an extra field to identify the type, maybe this could save some code size... And I didn't use 'predicate' to check that because middle-end could guarantee Xor's input type was correct. Ok, I see. That's very ugly, but it's not your fault. ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
In reply to this post by Andrew Haley-2
On Thu, 28 Jan 2021 14:26:32 GMT, Andrew Haley <[hidden email]> wrote:
>> This patch transforms '(x >>> rshift) + (x << lshift)' into >> 'RotateRight(x, rshift)' during GVN phase when both the shift exponents >> are constants and their sum equals to the number of bits for the type >> of shift base. >> >> This patch implements some new match rules for AArch64 instructions >> which can take ROR as the optional shift. Such instructions are 'and', >> 'or', 'eor', 'eon', 'bic' and 'orn'. >> >> ror w11, w2, #5 >> eor w0, w1, w11 >> >> With this patch, above code could be optimized to below: >> >> eor w0, w1, w2, ror #5 >> >> Finally, the patch refactors TestRotate.java[1][2]. >> >> Tested jtreg TestRotate.java, hotspot::hotspot_all_no_apps, >> jdk::jdk_core, langtools::tier1. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8252776 >> [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039911.html > > All that remains to do is the benchmarks. I benchmarked on 7 platforms with jmh test[1], most of them were hard to see any performance changes. On one platform the performance become bad, and seems a little unstable. Before: Benchmark Mode Cnt Score Error Units Rotation.addRotateRight avgt 15 270.080 ± 0.630 ns/op Rotation.andRotateRight avgt 15 269.000 ± 0.226 ns/op Rotation.bicRotateRight avgt 15 269.269 ± 0.769 ns/op Rotation.eonRotateRight avgt 15 269.456 ± 0.056 ns/op Rotation.ornRotateRight avgt 15 269.681 ± 1.028 ns/op Rotation.xorRotateRight avgt 15 270.953 ± 2.764 ns/op After: Benchmark Mode Cnt Score Error Units Rotation.addRotateRight avgt 15 399.995 ± 10.702 ns/op Rotation.andRotateRight avgt 15 399.517 ± 7.933 ns/op Rotation.bicRotateRight avgt 15 396.602 ± 2.038 ns/op Rotation.eonRotateRight avgt 15 401.978 ± 17.371 ns/op Rotation.ornRotateRight avgt 15 431.052 ± 73.711 ns/op Rotation.xorRotateRight avgt 15 410.447 ± 42.929 ns/op ------------------ [1] http://cr.openjdk.java.net/~xgong/rfr/8256438/Rotation.java ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
On 1/29/21 8:35 AM, Eric Liu wrote:
> I benchmarked on 7 platforms with jmh test[1], most of them were hard to see any performance changes. On one platform the performance become bad, and seems a little unstable. Wow. I'll have to look at this. In theory we prefer combined instructions because it should reduce code size. However, we can't escape the possibility that on some implementations combined instructions might cause pipeline stalls. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 |
On 29/01/2021 10:11, Andrew Haley wrote:
> On 1/29/21 8:35 AM, Eric Liu wrote: >> I benchmarked on 7 platforms with jmh test[1], most of them were hard to see any performance changes. On one platform the performance become bad, and seems a little unstable. Your benchmark did not work for me. It did not generate the correct instructions. Please try with this or similar: @Benchmark public void xorRotateRight(MyState s, Blackhole blackhole) { int x = s.xi; int y = s.yi; for (int i = 0; i < COUNT; i++) { y = x ^ ((y >>> 5) | (y << -5)); x = y ^ ((x >>> 5) | (x << -5)); } blackhole.consume(x); } I get: Benchmark Mode Cnt Score Error Units Rotation.xorRotateRight (before) avgt 3 6142.575 ± 15.940 ns/op Rotation.xorRotateRight (after) avgt 3 4081.587 ± 33.904 ns/op Please integrate the corrected benchmark into your patch. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 |
Hi Andrew,
Thanks for your feedback. I refined those benchmarks and on two platforms could have a better performance. One can get about 100% gain, and 30% for another one. Other platforms were hard to see any performance changes. Before: Benchmark Mode Cnt Score Error Units Rotation.andRotateRight avgt 15 3860.994 ± 3.409 ns/op Rotation.bicRotateRight avgt 15 3861.247 ± 3.321 ns/op Rotation.eonRotateRight avgt 15 3860.865 ± 3.003 ns/op Rotation.ornRotateRight avgt 15 3860.884 ± 3.260 ns/op Rotation.xorRotateRight avgt 15 3860.886 ± 2.728 ns/op After: Benchmark Mode Cnt Score Error Units Rotation.andRotateRight avgt 15 1933.495 ± 0.263 ns/op Rotation.bicRotateRight avgt 15 1933.436 ± 0.244 ns/op Rotation.eonRotateRight avgt 15 1933.459 ± 0.255 ns/op Rotation.ornRotateRight avgt 15 1933.559 ± 0.316 ns/op Rotation.xorRotateRight avgt 15 1933.467 ± 0.245 ns/op I would update my patch with the benchmark tests after finishing the whole tests. --Eric -----Original Message----- From: hotspot-compiler-dev <[hidden email]> On Behalf Of Andrew Haley Sent: Saturday, January 30, 2021 12:30 AM To: [hidden email] Subject: Re: RFR: 8256438: AArch64: Implement match rules with ROR shift register value On 29/01/2021 10:11, Andrew Haley wrote: > On 1/29/21 8:35 AM, Eric Liu wrote: >> I benchmarked on 7 platforms with jmh test[1], most of them were hard to see any performance changes. On one platform the performance become bad, and seems a little unstable. Your benchmark did not work for me. It did not generate the correct instructions. Please try with this or similar: @Benchmark public void xorRotateRight(MyState s, Blackhole blackhole) { int x = s.xi; int y = s.yi; for (int i = 0; i < COUNT; i++) { y = x ^ ((y >>> 5) | (y << -5)); x = y ^ ((x >>> 5) | (x << -5)); } blackhole.consume(x); } I get: Benchmark Mode Cnt Score Error Units Rotation.xorRotateRight (before) avgt 3 6142.575 ± 15.940 ns/op Rotation.xorRotateRight (after) avgt 3 4081.587 ± 33.904 ns/op Please integrate the corrected benchmark into your patch. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 |
In reply to this post by Andrew Haley-2
On Thu, 28 Jan 2021 14:26:32 GMT, Andrew Haley <[hidden email]> wrote:
>> Eric Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add benchmark test >> >> Change-Id: I63ca51d06070a07e5c20daf4b42d2c8d7237a1da > > All that remains to do is the benchmarks. @theRealAph Could you help to take a look ? ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
On Tue, 9 Feb 2021 03:07:38 GMT, Eric Liu <[hidden email]> wrote:
>> All that remains to do is the benchmarks. > > @theRealAph Could you help to take a look ? Ping ------------- PR: https://git.openjdk.java.net/jdk/pull/1858 |
Free forum by Nabble | Edit this page |