8179527: Implement intrinsic code for reverseBytes with load/store

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

8179527: Implement intrinsic code for reverseBytes with load/store

Michihiro Horie

Dear all,

Would you please review following change?

Bug: https://bugs.openjdk.java.net/browse/JDK-8179527
Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/

I added new intrinsic code for reverseBytes() in ppc.ad with
* match(Set dst (ReverseBytesI/L/US/S (LoadI src)));
* match(Set dst (StoreI dst (ReverseBytesI/L/US/S src)));


Best regards,
--
Michihiro,
IBM Research - Tokyo
Reply | Threaded
Open this post in threaded view
|

RE: 8179527: Implement intrinsic code for reverseBytes with load/store

Gustavo Serra Scalet
Hi Michihiro,

I wonder if there is no vectorized approach for implementing your "bytes_reverse_long_Ex" instruct on ppc.ad. Or did you avoid doing it so intentionally?

> -----Original Message-----
> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-
> [hidden email]] On Behalf Of Michihiro Horie
> Sent: terça-feira, 2 de maio de 2017 11:47
> To: [hidden email]; [hidden email];
> [hidden email]; [hidden email]
> Subject: 8179527: Implement intrinsic code for reverseBytes with
> load/store
>
> Dear all,
>
> Would you please review following change?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8179527
> Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/
>
> I added new intrinsic code for reverseBytes() in ppc.ad with
> * match(Set dst (ReverseBytesI/L/US/S (LoadI src)));
> * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src)));
>
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo

Reply | Threaded
Open this post in threaded view
|

RE: 8179527: Implement intrinsic code for reverseBytes with load/store

Doerr, Martin
Hi Michihiro and Gustavo,

thank you very much for implementing this change.

@Gustavo: Thanks for taking a look.
I think that the direct match rules are just there to satisfy match_rule_supported. They don't need to be fast, they are just a fall back solution.
The goal is to exploit the byte reverse load and store instructions which should match in more performance critical cases.

Now my review:

assembler_ppc.hpp:
Looks good except a minor formatting request:
LDBRX_OPCODE  = (31u << OPCODE_SHIFT |  532 << 1),
should be
LDBRX_OPCODE  = (31u << OPCODE_SHIFT | 532u << 1),
to be consistent.
The comments // X-FORM should be aligned with the other ones.

assembler_ppc.inline.hpp:
Good.

ppc.ad:
I'm concerned about the additional match rules which are only used for the expand step. They could match directly leading to incorrect code. What they match is not what they do.
I suggest to implement the code directly in the ins_encode. This would make the new code significantly shorter and less error prone.
I think we don't need to optimize for Power6 anymore and newer processors shouldn't really suffer under a little less optimized instruction scheduling. Would you agree?

Displacements may be too large for "li" so I suggest to use the "indirect" memory operand and let the compiler handle it. I know that it may increase latency because the compiler will need to insert an addition which could better be matched into the memory operand of the load which is harder to implement (it is possible to match an addition in an operand).


Best regards,
Martin


-----Original Message-----
From: Gustavo Serra Scalet [mailto:[hidden email]]
Sent: Dienstag, 2. Mai 2017 17:05
To: Michihiro Horie <[hidden email]>
Cc: [hidden email]; [hidden email]; Simonis, Volker <[hidden email]>; Doerr, Martin <[hidden email]>
Subject: RE: 8179527: Implement intrinsic code for reverseBytes with load/store

Hi Michihiro,

I wonder if there is no vectorized approach for implementing your "bytes_reverse_long_Ex" instruct on ppc.ad. Or did you avoid doing it so intentionally?

> -----Original Message-----
> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-
> [hidden email]] On Behalf Of Michihiro Horie
> Sent: terça-feira, 2 de maio de 2017 11:47
> To: [hidden email]; [hidden email];
> [hidden email]; [hidden email]
> Subject: 8179527: Implement intrinsic code for reverseBytes with
> load/store
>
> Dear all,
>
> Would you please review following change?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8179527
> Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/
>
> I added new intrinsic code for reverseBytes() in ppc.ad with
> * match(Set dst (ReverseBytesI/L/US/S (LoadI src)));
> * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src)));
>
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo