Quantcast

RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz

Hi all, 

 

may I kindly request reviews for my medium enhancement? Further down the road, I would need a sponsor, too.

 

Bug: https://bugs.openjdk.java.net/browse/JDK-8175368

Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/

 

Description: 

This intrinsic implementation provides some performance benefit over the standard Java implementation. It uses only well-known “standard” instructions, available on all supported IBM System z hardware. Being very similar to the CRC32 intrinsics, it was tried to share as much code as possible between these two.

 

Performance may be expected to increase up to 2.0x, compared to a run with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of the byte array fed into CRC32C and, to some extent, on the H/W generation the test runs on.

 

Thanks,

Lutz

 

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz

Hi all,

 

the webrev URL below was incorrect. It displayed the correct location, but pointed elsewhere. The issue is fixed now and the URLs are correct.

 

Thank you!

Lutz

 

From: Lutz Schmidt <[hidden email]>
Date: Montag, 27. Februar 2017 um 15:55
To: "[hidden email]" <[hidden email]>
Subject: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

 

Hi all, 

 

may I kindly request reviews for my medium enhancement? Further down the road, I would need a sponsor, too.

 

Bug: https://bugs.openjdk.java.net/browse/JDK-8175368

Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/

 

Description: 

This intrinsic implementation provides some performance benefit over the standard Java implementation. It uses only well-known “standard” instructions, available on all supported IBM System z hardware. Being very similar to the CRC32 intrinsics, it was tried to share as much code as possible between these two.

 

Performance may be expected to increase up to 2.0x, compared to a run with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of the byte array fed into CRC32C and, to some extent, on the H/W generation the test runs on.

 

Thanks,

Lutz

 

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Vladimir Kozlov
In reply to this post by Schmidt, Lutz
Hi, Lutz

The link to webrev is wrong. It points to 8175370 changes.

Vladimir

On 2/27/17 6:55 AM, Schmidt, Lutz wrote:

> Hi all,
>
>
>
> may I kindly request reviews for my medium enhancement? Further down the
> road, I would need a sponsor, too.
>
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
>
> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
> <http://cr.openjdk.java.net/~lucy/webrevs/8175370/>
>
>
>
> Description:
>
> This intrinsic implementation provides some performance benefit over the
> standard Java implementation. It uses only well-known “standard”
> instructions, available on all supported IBM System z hardware. Being
> very similar to the CRC32 intrinsics, it was tried to share as much code
> as possible between these two.
>
>
>
> Performance may be expected to increase up to 2.0x, compared to a run
> with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
> the byte array fed into CRC32C and, to some extent, on the H/W
> generation the test runs on.
>
>
>
> Thanks,
>
> Lutz
>
>
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Hi all,
thanks, Vladimir, for pointing me to this. I already noticed it, corrected the URL and sent an update to the DL (approx. 3h ago).
Regards,
Lutz


On 27.02.17, 21:31, "Vladimir Kozlov" <[hidden email]> wrote:

    Hi, Lutz
   
    The link to webrev is wrong. It points to 8175370 changes.
   
    Vladimir
   
    On 2/27/17 6:55 AM, Schmidt, Lutz wrote:
    > Hi all,
    >
    >
    >
    > may I kindly request reviews for my medium enhancement? Further down the
    > road, I would need a sponsor, too.
    >
    >
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
    >
    > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
    > <http://cr.openjdk.java.net/~lucy/webrevs/8175370/>
    >
    >
    >
    > Description:
    >
    > This intrinsic implementation provides some performance benefit over the
    > standard Java implementation. It uses only well-known “standard”
    > instructions, available on all supported IBM System z hardware. Being
    > very similar to the CRC32 intrinsics, it was tried to share as much code
    > as possible between these two.
    >
    >
    >
    > Performance may be expected to increase up to 2.0x, compared to a run
    > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
    > the byte array fed into CRC32C and, to some extent, on the H/W
    > generation the test runs on.
    >
    >
    >
    > Thanks,
    >
    > Lutz
    >
    >
    >
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Vladimir Kozlov
In reply to this post by Schmidt, Lutz
Shared changes are good. I assume s390 code will be reviewed by experts.

Thanks,
Vladimir

On 2/27/17 9:34 AM, Schmidt, Lutz wrote:

> Hi all,
>
>
>
> the webrev URL below was incorrect. It displayed the correct location,
> but pointed elsewhere. The issue is fixed now and the URLs are correct.
>
>
>
> Thank you!
>
> Lutz
>
>
>
> *From: *Lutz Schmidt <[hidden email]>
> *Date: *Montag, 27. Februar 2017 um 15:55
> *To: *"[hidden email]"
> <[hidden email]>
> *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
> CRC32C
>
>
>
> Hi all,
>
>
>
> may I kindly request reviews for my medium enhancement? Further down the
> road, I would need a sponsor, too.
>
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
>
> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
>
>
>
> Description:
>
> This intrinsic implementation provides some performance benefit over the
> standard Java implementation. It uses only well-known “standard”
> instructions, available on all supported IBM System z hardware. Being
> very similar to the CRC32 intrinsics, it was tried to share as much code
> as possible between these two.
>
>
>
> Performance may be expected to increase up to 2.0x, compared to a run
> with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
> the byte array fed into CRC32C and, to some extent, on the H/W
> generation the test runs on.
>
>
>
> Thanks,
>
> Lutz
>
>
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Thank you, Vladimir, for looking at this.

Regards, Lutz


On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:

    Shared changes are good. I assume s390 code will be reviewed by experts.
   
    Thanks,
    Vladimir
   
    On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
    > Hi all,
    >
    >
    >
    > the webrev URL below was incorrect. It displayed the correct location,
    > but pointed elsewhere. The issue is fixed now and the URLs are correct.
    >
    >
    >
    > Thank you!
    >
    > Lutz
    >
    >
    >
    > *From: *Lutz Schmidt <[hidden email]>
    > *Date: *Montag, 27. Februar 2017 um 15:55
    > *To: *"[hidden email]"
    > <[hidden email]>
    > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
    > CRC32C
    >
    >
    >
    > Hi all,
    >
    >
    >
    > may I kindly request reviews for my medium enhancement? Further down the
    > road, I would need a sponsor, too.
    >
    >
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
    >
    > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
    >
    >
    >
    > Description:
    >
    > This intrinsic implementation provides some performance benefit over the
    > standard Java implementation. It uses only well-known “standard”
    > instructions, available on all supported IBM System z hardware. Being
    > very similar to the CRC32 intrinsics, it was tried to share as much code
    > as possible between these two.
    >
    >
    >
    > Performance may be expected to increase up to 2.0x, compared to a run
    > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
    > the byte array fed into CRC32C and, to some extent, on the H/W
    > generation the test runs on.
    >
    >
    >
    > Thanks,
    >
    > Lutz
    >
    >
    >
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Doerr, Martin
Hi Lutz,

thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
I only have some minor suggestions:

c1_LIRGenerator_s390.cpp
load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.

macroAssembler_s390.cpp
The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.

stubGenerator_s390.cpp
The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."

stubRoutines_s390.cpp
I didn't review all numbers :-), but I guess you have tested well enough?

It'd be nice if you could share some performance numbers.

Thanks and best regards,
Martin



-----Original Message-----
From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
Sent: Montag, 27. Februar 2017 22:33
To: Vladimir Kozlov <[hidden email]>; [hidden email]
Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Thank you, Vladimir, for looking at this.

Regards, Lutz


On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:

    Shared changes are good. I assume s390 code will be reviewed by experts.
   
    Thanks,
    Vladimir
   
    On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
    > Hi all,
    >
    >
    >
    > the webrev URL below was incorrect. It displayed the correct location,
    > but pointed elsewhere. The issue is fixed now and the URLs are correct.
    >
    >
    >
    > Thank you!
    >
    > Lutz
    >
    >
    >
    > *From: *Lutz Schmidt <[hidden email]>
    > *Date: *Montag, 27. Februar 2017 um 15:55
    > *To: *"[hidden email]"
    > <[hidden email]>
    > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
    > CRC32C
    >
    >
    >
    > Hi all,
    >
    >
    >
    > may I kindly request reviews for my medium enhancement? Further down the
    > road, I would need a sponsor, too.
    >
    >
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
    >
    > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
    >
    >
    >
    > Description:
    >
    > This intrinsic implementation provides some performance benefit over the
    > standard Java implementation. It uses only well-known “standard”
    > instructions, available on all supported IBM System z hardware. Being
    > very similar to the CRC32 intrinsics, it was tried to share as much code
    > as possible between these two.
    >
    >
    >
    > Performance may be expected to increase up to 2.0x, compared to a run
    > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
    > the byte array fed into CRC32C and, to some extent, on the H/W
    > generation the test runs on.
    >
    >
    >
    > Thanks,
    >
    > Lutz
    >
    >
    >
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Hi Martin,

Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/

The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.

Best regards,
Lutz



On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:

    Hi Lutz,
   
    thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
    I only have some minor suggestions:
   
    c1_LIRGenerator_s390.cpp
    load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
    As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
    The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
   
    macroAssembler_s390.cpp
    The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
   
    stubGenerator_s390.cpp
    The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
   
    stubRoutines_s390.cpp
    I didn't review all numbers :-), but I guess you have tested well enough?
   
    It'd be nice if you could share some performance numbers.
   
    Thanks and best regards,
    Martin
   
   
   
    -----Original Message-----
    From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
    Sent: Montag, 27. Februar 2017 22:33
    To: Vladimir Kozlov <[hidden email]>; [hidden email]
    Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
   
    Thank you, Vladimir, for looking at this.
   
    Regards, Lutz
   
   
    On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
   
        Shared changes are good. I assume s390 code will be reviewed by experts.
       
        Thanks,
        Vladimir
       
        On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
        > Hi all,
        >
        >
        >
        > the webrev URL below was incorrect. It displayed the correct location,
        > but pointed elsewhere. The issue is fixed now and the URLs are correct.
        >
        >
        >
        > Thank you!
        >
        > Lutz
        >
        >
        >
        > *From: *Lutz Schmidt <[hidden email]>
        > *Date: *Montag, 27. Februar 2017 um 15:55
        > *To: *"[hidden email]"
        > <[hidden email]>
        > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
        > CRC32C
        >
        >
        >
        > Hi all,
        >
        >
        >
        > may I kindly request reviews for my medium enhancement? Further down the
        > road, I would need a sponsor, too.
        >
        >
        >
        > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
        >
        > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
        >
        >
        >
        > Description:
        >
        > This intrinsic implementation provides some performance benefit over the
        > standard Java implementation. It uses only well-known “standard”
        > instructions, available on all supported IBM System z hardware. Being
        > very similar to the CRC32 intrinsics, it was tried to share as much code
        > as possible between these two.
        >
        >
        >
        > Performance may be expected to increase up to 2.0x, compared to a run
        > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
        > the byte array fed into CRC32C and, to some extent, on the H/W
        > generation the test runs on.
        >
        >
        >
        > Thanks,
        >
        > Lutz
        >
        >
        >
       
   
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Doerr, Martin
Hi Lutz,

thanks for implementing my suggestions and for the explanations. The change looks good to me.

Best regards,
Martin


-----Original Message-----
From: Schmidt, Lutz
Sent: Dienstag, 28. Februar 2017 15:28
To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Hi Martin,

Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/

The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.

Best regards,
Lutz



On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:

    Hi Lutz,
   
    thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
    I only have some minor suggestions:
   
    c1_LIRGenerator_s390.cpp
    load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
    As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
    The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
   
    macroAssembler_s390.cpp
    The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
   
    stubGenerator_s390.cpp
    The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
   
    stubRoutines_s390.cpp
    I didn't review all numbers :-), but I guess you have tested well enough?
   
    It'd be nice if you could share some performance numbers.
   
    Thanks and best regards,
    Martin
   
   
   
    -----Original Message-----
    From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
    Sent: Montag, 27. Februar 2017 22:33
    To: Vladimir Kozlov <[hidden email]>; [hidden email]
    Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
   
    Thank you, Vladimir, for looking at this.
   
    Regards, Lutz
   
   
    On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
   
        Shared changes are good. I assume s390 code will be reviewed by experts.
       
        Thanks,
        Vladimir
       
        On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
        > Hi all,
        >
        >
        >
        > the webrev URL below was incorrect. It displayed the correct location,
        > but pointed elsewhere. The issue is fixed now and the URLs are correct.
        >
        >
        >
        > Thank you!
        >
        > Lutz
        >
        >
        >
        > *From: *Lutz Schmidt <[hidden email]>
        > *Date: *Montag, 27. Februar 2017 um 15:55
        > *To: *"[hidden email]"
        > <[hidden email]>
        > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
        > CRC32C
        >
        >
        >
        > Hi all,
        >
        >
        >
        > may I kindly request reviews for my medium enhancement? Further down the
        > road, I would need a sponsor, too.
        >
        >
        >
        > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
        >
        > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
        >
        >
        >
        > Description:
        >
        > This intrinsic implementation provides some performance benefit over the
        > standard Java implementation. It uses only well-known “standard”
        > instructions, available on all supported IBM System z hardware. Being
        > very similar to the CRC32 intrinsics, it was tried to share as much code
        > as possible between these two.
        >
        >
        >
        > Performance may be expected to increase up to 2.0x, compared to a run
        > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
        > the byte array fed into CRC32C and, to some extent, on the H/W
        > generation the test runs on.
        >
        >
        >
        > Thanks,
        >
        > Lutz
        >
        >
        >
       
   
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Hi all,

I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
  http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/

I believe I still need one more review. Furthermore: any volunteer to sponsor my change?

Thanks,
Lutz



On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:

    Hi Lutz,
   
    thanks for implementing my suggestions and for the explanations. The change looks good to me.
   
    Best regards,
    Martin
   
   
    -----Original Message-----
    From: Schmidt, Lutz
    Sent: Dienstag, 28. Februar 2017 15:28
    To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
    Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
   
    Hi Martin,
   
    Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
   
    The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
   
    Best regards,
    Lutz
   
   
   
    On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
   
        Hi Lutz,
       
        thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
        I only have some minor suggestions:
       
        c1_LIRGenerator_s390.cpp
        load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
        As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
        The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
       
        macroAssembler_s390.cpp
        The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
       
        stubGenerator_s390.cpp
        The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
       
        stubRoutines_s390.cpp
        I didn't review all numbers :-), but I guess you have tested well enough?
       
        It'd be nice if you could share some performance numbers.
       
        Thanks and best regards,
        Martin
       
       
       
        -----Original Message-----
        From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
        Sent: Montag, 27. Februar 2017 22:33
        To: Vladimir Kozlov <[hidden email]>; [hidden email]
        Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
       
        Thank you, Vladimir, for looking at this.
       
        Regards, Lutz
       
       
        On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
       
            Shared changes are good. I assume s390 code will be reviewed by experts.
           
            Thanks,
            Vladimir
           
            On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
            > Hi all,
            >
            >
            >
            > the webrev URL below was incorrect. It displayed the correct location,
            > but pointed elsewhere. The issue is fixed now and the URLs are correct.
            >
            >
            >
            > Thank you!
            >
            > Lutz
            >
            >
            >
            > *From: *Lutz Schmidt <[hidden email]>
            > *Date: *Montag, 27. Februar 2017 um 15:55
            > *To: *"[hidden email]"
            > <[hidden email]>
            > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
            > CRC32C
            >
            >
            >
            > Hi all,
            >
            >
            >
            > may I kindly request reviews for my medium enhancement? Further down the
            > road, I would need a sponsor, too.
            >
            >
            >
            > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
            >
            > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
            >
            >
            >
            > Description:
            >
            > This intrinsic implementation provides some performance benefit over the
            > standard Java implementation. It uses only well-known “standard”
            > instructions, available on all supported IBM System z hardware. Being
            > very similar to the CRC32 intrinsics, it was tried to share as much code
            > as possible between these two.
            >
            >
            >
            > Performance may be expected to increase up to 2.0x, compared to a run
            > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
            > the byte array fed into CRC32C and, to some extent, on the H/W
            > generation the test runs on.
            >
            >
            >
            > Thanks,
            >
            > Lutz
            >
            >
            >
           
       
       
   
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Hi all,

this is yet another update for my RFR. I rebased my changes on the current tip to resolve a clash in templateInterpreterGenerator_s390.cpp – detected by Martin (mdoerr). Thank you!

Furthermore, I removed src/share/vm/c1/c1_compiler.cpp from the changeset. As a result, the CRC32C intrinsics on s390 will not be available in c1. Only later, after 8175369 has been pushed, they will become available. This “trick” avoids conflicting changes to the same source line by 8175368 and 8175369.

Please find the most recent webrev at: http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/

Regards,
Lutz



On 02.03.17, 17:22, "hotspot-compiler-dev on behalf of Schmidt, Lutz" <[hidden email] on behalf of [hidden email]> wrote:

    Hi all,
   
    I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
      http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/
   
    I believe I still need one more review. Furthermore: any volunteer to sponsor my change?
   
    Thanks,
    Lutz
   
   
   
    On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:
   
        Hi Lutz,
       
        thanks for implementing my suggestions and for the explanations. The change looks good to me.
       
        Best regards,
        Martin
       
       
        -----Original Message-----
        From: Schmidt, Lutz
        Sent: Dienstag, 28. Februar 2017 15:28
        To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
        Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
       
        Hi Martin,
       
        Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
       
        The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
       
        Best regards,
        Lutz
       
       
       
        On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
       
            Hi Lutz,
           
            thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
            I only have some minor suggestions:
           
            c1_LIRGenerator_s390.cpp
            load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
            As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
            The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
           
            macroAssembler_s390.cpp
            The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
           
            stubGenerator_s390.cpp
            The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
           
            stubRoutines_s390.cpp
            I didn't review all numbers :-), but I guess you have tested well enough?
           
            It'd be nice if you could share some performance numbers.
           
            Thanks and best regards,
            Martin
           
           
           
            -----Original Message-----
            From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
            Sent: Montag, 27. Februar 2017 22:33
            To: Vladimir Kozlov <[hidden email]>; [hidden email]
            Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
           
            Thank you, Vladimir, for looking at this.
           
            Regards, Lutz
           
           
            On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
           
                Shared changes are good. I assume s390 code will be reviewed by experts.
               
                Thanks,
                Vladimir
               
                On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
                > Hi all,
                >
                >
                >
                > the webrev URL below was incorrect. It displayed the correct location,
                > but pointed elsewhere. The issue is fixed now and the URLs are correct.
                >
                >
                >
                > Thank you!
                >
                > Lutz
                >
                >
                >
                > *From: *Lutz Schmidt <[hidden email]>
                > *Date: *Montag, 27. Februar 2017 um 15:55
                > *To: *"[hidden email]"
                > <[hidden email]>
                > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
                > CRC32C
                >
                >
                >
                > Hi all,
                >
                >
                >
                > may I kindly request reviews for my medium enhancement? Further down the
                > road, I would need a sponsor, too.
                >
                >
                >
                > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
                >
                > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
                >
                >
                >
                > Description:
                >
                > This intrinsic implementation provides some performance benefit over the
                > standard Java implementation. It uses only well-known “standard”
                > instructions, available on all supported IBM System z hardware. Being
                > very similar to the CRC32 intrinsics, it was tried to share as much code
                > as possible between these two.
                >
                >
                >
                > Performance may be expected to increase up to 2.0x, compared to a run
                > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
                > the byte array fed into CRC32C and, to some extent, on the H/W
                > generation the test runs on.
                >
                >
                >
                > Thanks,
                >
                > Lutz
                >
                >
                >
               
           
           
       
       
   
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Volker Simonis
Hi Lutz,

the change looks good!

As this is now s390-only, I've pushed it.

Regards,
Volker


On Wed, Mar 8, 2017 at 3:37 PM, Schmidt, Lutz <[hidden email]> wrote:

> Hi all,
>
> this is yet another update for my RFR. I rebased my changes on the current tip to resolve a clash in templateInterpreterGenerator_s390.cpp – detected by Martin (mdoerr). Thank you!
>
> Furthermore, I removed src/share/vm/c1/c1_compiler.cpp from the changeset. As a result, the CRC32C intrinsics on s390 will not be available in c1. Only later, after 8175369 has been pushed, they will become available. This “trick” avoids conflicting changes to the same source line by 8175368 and 8175369.
>
> Please find the most recent webrev at: http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/
>
> Regards,
> Lutz
>
>
>
> On 02.03.17, 17:22, "hotspot-compiler-dev on behalf of Schmidt, Lutz" <[hidden email] on behalf of [hidden email]> wrote:
>
>     Hi all,
>
>     I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
>       http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/
>
>     I believe I still need one more review. Furthermore: any volunteer to sponsor my change?
>
>     Thanks,
>     Lutz
>
>
>
>     On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:
>
>         Hi Lutz,
>
>         thanks for implementing my suggestions and for the explanations. The change looks good to me.
>
>         Best regards,
>         Martin
>
>
>         -----Original Message-----
>         From: Schmidt, Lutz
>         Sent: Dienstag, 28. Februar 2017 15:28
>         To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
>         Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>
>         Hi Martin,
>
>         Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
>
>         The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
>
>         Best regards,
>         Lutz
>
>
>
>         On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
>
>             Hi Lutz,
>
>             thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
>             I only have some minor suggestions:
>
>             c1_LIRGenerator_s390.cpp
>             load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
>             As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
>             The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
>
>             macroAssembler_s390.cpp
>             The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
>
>             stubGenerator_s390.cpp
>             The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
>
>             stubRoutines_s390.cpp
>             I didn't review all numbers :-), but I guess you have tested well enough?
>
>             It'd be nice if you could share some performance numbers.
>
>             Thanks and best regards,
>             Martin
>
>
>
>             -----Original Message-----
>             From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
>             Sent: Montag, 27. Februar 2017 22:33
>             To: Vladimir Kozlov <[hidden email]>; [hidden email]
>             Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>
>             Thank you, Vladimir, for looking at this.
>
>             Regards, Lutz
>
>
>             On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
>
>                 Shared changes are good. I assume s390 code will be reviewed by experts.
>
>                 Thanks,
>                 Vladimir
>
>                 On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
>                 > Hi all,
>                 >
>                 >
>                 >
>                 > the webrev URL below was incorrect. It displayed the correct location,
>                 > but pointed elsewhere. The issue is fixed now and the URLs are correct.
>                 >
>                 >
>                 >
>                 > Thank you!
>                 >
>                 > Lutz
>                 >
>                 >
>                 >
>                 > *From: *Lutz Schmidt <[hidden email]>
>                 > *Date: *Montag, 27. Februar 2017 um 15:55
>                 > *To: *"[hidden email]"
>                 > <[hidden email]>
>                 > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
>                 > CRC32C
>                 >
>                 >
>                 >
>                 > Hi all,
>                 >
>                 >
>                 >
>                 > may I kindly request reviews for my medium enhancement? Further down the
>                 > road, I would need a sponsor, too.
>                 >
>                 >
>                 >
>                 > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
>                 >
>                 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
>                 >
>                 >
>                 >
>                 > Description:
>                 >
>                 > This intrinsic implementation provides some performance benefit over the
>                 > standard Java implementation. It uses only well-known “standard”
>                 > instructions, available on all supported IBM System z hardware. Being
>                 > very similar to the CRC32 intrinsics, it was tried to share as much code
>                 > as possible between these two.
>                 >
>                 >
>                 >
>                 > Performance may be expected to increase up to 2.0x, compared to a run
>                 > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
>                 > the byte array fed into CRC32C and, to some extent, on the H/W
>                 > generation the test runs on.
>                 >
>                 >
>                 >
>                 > Thanks,
>                 >
>                 > Lutz
>                 >
>                 >
>                 >
>
>
>
>
>
>
>
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Schmidt, Lutz
Hi Volker,
thank you for reviewing and pushing my change.
Lutz


On 08.03.17, 16:44, "Volker Simonis" <[hidden email]> wrote:

    Hi Lutz,
   
    the change looks good!
   
    As this is now s390-only, I've pushed it.
   
    Regards,
    Volker
   
   
    On Wed, Mar 8, 2017 at 3:37 PM, Schmidt, Lutz <[hidden email]> wrote:
    > Hi all,
    >
    > this is yet another update for my RFR. I rebased my changes on the current tip to resolve a clash in templateInterpreterGenerator_s390.cpp – detected by Martin (mdoerr). Thank you!
    >
    > Furthermore, I removed src/share/vm/c1/c1_compiler.cpp from the changeset. As a result, the CRC32C intrinsics on s390 will not be available in c1. Only later, after 8175369 has been pushed, they will become available. This “trick” avoids conflicting changes to the same source line by 8175368 and 8175369.
    >
    > Please find the most recent webrev at: http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/
    >
    > Regards,
    > Lutz
    >
    >
    >
    > On 02.03.17, 17:22, "hotspot-compiler-dev on behalf of Schmidt, Lutz" <[hidden email] on behalf of [hidden email]> wrote:
    >
    >     Hi all,
    >
    >     I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
    >       http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/
    >
    >     I believe I still need one more review. Furthermore: any volunteer to sponsor my change?
    >
    >     Thanks,
    >     Lutz
    >
    >
    >
    >     On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:
    >
    >         Hi Lutz,
    >
    >         thanks for implementing my suggestions and for the explanations. The change looks good to me.
    >
    >         Best regards,
    >         Martin
    >
    >
    >         -----Original Message-----
    >         From: Schmidt, Lutz
    >         Sent: Dienstag, 28. Februar 2017 15:28
    >         To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
    >         Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
    >
    >         Hi Martin,
    >
    >         Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
    >
    >         The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
    >
    >         Best regards,
    >         Lutz
    >
    >
    >
    >         On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
    >
    >             Hi Lutz,
    >
    >             thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
    >             I only have some minor suggestions:
    >
    >             c1_LIRGenerator_s390.cpp
    >             load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
    >             As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
    >             The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
    >
    >             macroAssembler_s390.cpp
    >             The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
    >
    >             stubGenerator_s390.cpp
    >             The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
    >
    >             stubRoutines_s390.cpp
    >             I didn't review all numbers :-), but I guess you have tested well enough?
    >
    >             It'd be nice if you could share some performance numbers.
    >
    >             Thanks and best regards,
    >             Martin
    >
    >
    >
    >             -----Original Message-----
    >             From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
    >             Sent: Montag, 27. Februar 2017 22:33
    >             To: Vladimir Kozlov <[hidden email]>; [hidden email]
    >             Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
    >
    >             Thank you, Vladimir, for looking at this.
    >
    >             Regards, Lutz
    >
    >
    >             On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
    >
    >                 Shared changes are good. I assume s390 code will be reviewed by experts.
    >
    >                 Thanks,
    >                 Vladimir
    >
    >                 On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
    >                 > Hi all,
    >                 >
    >                 >
    >                 >
    >                 > the webrev URL below was incorrect. It displayed the correct location,
    >                 > but pointed elsewhere. The issue is fixed now and the URLs are correct.
    >                 >
    >                 >
    >                 >
    >                 > Thank you!
    >                 >
    >                 > Lutz
    >                 >
    >                 >
    >                 >
    >                 > *From: *Lutz Schmidt <[hidden email]>
    >                 > *Date: *Montag, 27. Februar 2017 um 15:55
    >                 > *To: *"[hidden email]"
    >                 > <[hidden email]>
    >                 > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
    >                 > CRC32C
    >                 >
    >                 >
    >                 >
    >                 > Hi all,
    >                 >
    >                 >
    >                 >
    >                 > may I kindly request reviews for my medium enhancement? Further down the
    >                 > road, I would need a sponsor, too.
    >                 >
    >                 >
    >                 >
    >                 > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
    >                 >
    >                 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
    >                 >
    >                 >
    >                 >
    >                 > Description:
    >                 >
    >                 > This intrinsic implementation provides some performance benefit over the
    >                 > standard Java implementation. It uses only well-known “standard”
    >                 > instructions, available on all supported IBM System z hardware. Being
    >                 > very similar to the CRC32 intrinsics, it was tried to share as much code
    >                 > as possible between these two.
    >                 >
    >                 >
    >                 >
    >                 > Performance may be expected to increase up to 2.0x, compared to a run
    >                 > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
    >                 > the byte array fed into CRC32C and, to some extent, on the H/W
    >                 > generation the test runs on.
    >                 >
    >                 >
    >                 >
    >                 > Thanks,
    >                 >
    >                 > Lutz
    >                 >
    >                 >
    >                 >
    >
    >
    >
    >
    >
    >
    >
    >
   

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Volker Simonis
In reply to this post by Volker Simonis
Sorry, but I've just realized that I've unintentionally updated the
shared file c1_Compiler.cpp with this push which was intended to only
touch s390 files.

The problem is that the latest webrev
(http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/) doesn't display
any shared changes, and that's actually what Lutz apparently wanted to
change in the last version (i.e. move both, this and the ppc64 changes
to c1_Compiler.cpp into the ppc64 version of the change).
Unfortunately, the patch file from that webrev still contained the
shared changes for s390.

I'm not sure how that could happen? It may be a problem with
webrev.ksh in combination with the Mercurial Queues extension and the
fact that Lutz had a local change in the repository from which he
created the webrev.

So to cut a long story short, I've just pushed Lutz's s390 change
(http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/18ace35ee108) in good
faith that it only contains s390 specific changes and unintentionally
updated c1_Compiler.cpp.

I'm afraid there's nothing I can do now.
Sorry,
Volker


On Wed, Mar 8, 2017 at 4:44 PM, Volker Simonis <[hidden email]> wrote:

> Hi Lutz,
>
> the change looks good!
>
> As this is now s390-only, I've pushed it.
>
> Regards,
> Volker
>
>
> On Wed, Mar 8, 2017 at 3:37 PM, Schmidt, Lutz <[hidden email]> wrote:
>> Hi all,
>>
>> this is yet another update for my RFR. I rebased my changes on the current tip to resolve a clash in templateInterpreterGenerator_s390.cpp – detected by Martin (mdoerr). Thank you!
>>
>> Furthermore, I removed src/share/vm/c1/c1_compiler.cpp from the changeset. As a result, the CRC32C intrinsics on s390 will not be available in c1. Only later, after 8175369 has been pushed, they will become available. This “trick” avoids conflicting changes to the same source line by 8175368 and 8175369.
>>
>> Please find the most recent webrev at: http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/
>>
>> Regards,
>> Lutz
>>
>>
>>
>> On 02.03.17, 17:22, "hotspot-compiler-dev on behalf of Schmidt, Lutz" <[hidden email] on behalf of [hidden email]> wrote:
>>
>>     Hi all,
>>
>>     I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
>>       http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/
>>
>>     I believe I still need one more review. Furthermore: any volunteer to sponsor my change?
>>
>>     Thanks,
>>     Lutz
>>
>>
>>
>>     On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:
>>
>>         Hi Lutz,
>>
>>         thanks for implementing my suggestions and for the explanations. The change looks good to me.
>>
>>         Best regards,
>>         Martin
>>
>>
>>         -----Original Message-----
>>         From: Schmidt, Lutz
>>         Sent: Dienstag, 28. Februar 2017 15:28
>>         To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
>>         Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>>
>>         Hi Martin,
>>
>>         Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
>>
>>         The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
>>
>>         Best regards,
>>         Lutz
>>
>>
>>
>>         On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
>>
>>             Hi Lutz,
>>
>>             thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
>>             I only have some minor suggestions:
>>
>>             c1_LIRGenerator_s390.cpp
>>             load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
>>             As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
>>             The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
>>
>>             macroAssembler_s390.cpp
>>             The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
>>
>>             stubGenerator_s390.cpp
>>             The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
>>
>>             stubRoutines_s390.cpp
>>             I didn't review all numbers :-), but I guess you have tested well enough?
>>
>>             It'd be nice if you could share some performance numbers.
>>
>>             Thanks and best regards,
>>             Martin
>>
>>
>>
>>             -----Original Message-----
>>             From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
>>             Sent: Montag, 27. Februar 2017 22:33
>>             To: Vladimir Kozlov <[hidden email]>; [hidden email]
>>             Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>>
>>             Thank you, Vladimir, for looking at this.
>>
>>             Regards, Lutz
>>
>>
>>             On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
>>
>>                 Shared changes are good. I assume s390 code will be reviewed by experts.
>>
>>                 Thanks,
>>                 Vladimir
>>
>>                 On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
>>                 > Hi all,
>>                 >
>>                 >
>>                 >
>>                 > the webrev URL below was incorrect. It displayed the correct location,
>>                 > but pointed elsewhere. The issue is fixed now and the URLs are correct.
>>                 >
>>                 >
>>                 >
>>                 > Thank you!
>>                 >
>>                 > Lutz
>>                 >
>>                 >
>>                 >
>>                 > *From: *Lutz Schmidt <[hidden email]>
>>                 > *Date: *Montag, 27. Februar 2017 um 15:55
>>                 > *To: *"[hidden email]"
>>                 > <[hidden email]>
>>                 > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
>>                 > CRC32C
>>                 >
>>                 >
>>                 >
>>                 > Hi all,
>>                 >
>>                 >
>>                 >
>>                 > may I kindly request reviews for my medium enhancement? Further down the
>>                 > road, I would need a sponsor, too.
>>                 >
>>                 >
>>                 >
>>                 > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
>>                 >
>>                 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
>>                 >
>>                 >
>>                 >
>>                 > Description:
>>                 >
>>                 > This intrinsic implementation provides some performance benefit over the
>>                 > standard Java implementation. It uses only well-known “standard”
>>                 > instructions, available on all supported IBM System z hardware. Being
>>                 > very similar to the CRC32 intrinsics, it was tried to share as much code
>>                 > as possible between these two.
>>                 >
>>                 >
>>                 >
>>                 > Performance may be expected to increase up to 2.0x, compared to a run
>>                 > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
>>                 > the byte array fed into CRC32C and, to some extent, on the H/W
>>                 > generation the test runs on.
>>                 >
>>                 >
>>                 >
>>                 > Thanks,
>>                 >
>>                 > Lutz
>>                 >
>>                 >
>>                 >
>>
>>
>>
>>
>>
>>
>>
>>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C

Vladimir Kozlov
No problem.

File a bug. Undo changes you don't want and I will review and sponsor it (running through JPRT to make sure everything works).

Thanks,
Vladimir

On 3/8/17 8:24 AM, Volker Simonis wrote:

> Sorry, but I've just realized that I've unintentionally updated the
> shared file c1_Compiler.cpp with this push which was intended to only
> touch s390 files.
>
> The problem is that the latest webrev
> (http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/) doesn't display
> any shared changes, and that's actually what Lutz apparently wanted to
> change in the last version (i.e. move both, this and the ppc64 changes
> to c1_Compiler.cpp into the ppc64 version of the change).
> Unfortunately, the patch file from that webrev still contained the
> shared changes for s390.
>
> I'm not sure how that could happen? It may be a problem with
> webrev.ksh in combination with the Mercurial Queues extension and the
> fact that Lutz had a local change in the repository from which he
> created the webrev.
>
> So to cut a long story short, I've just pushed Lutz's s390 change
> (http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/18ace35ee108) in good
> faith that it only contains s390 specific changes and unintentionally
> updated c1_Compiler.cpp.
>
> I'm afraid there's nothing I can do now.
> Sorry,
> Volker
>
>
> On Wed, Mar 8, 2017 at 4:44 PM, Volker Simonis <[hidden email]> wrote:
>> Hi Lutz,
>>
>> the change looks good!
>>
>> As this is now s390-only, I've pushed it.
>>
>> Regards,
>> Volker
>>
>>
>> On Wed, Mar 8, 2017 at 3:37 PM, Schmidt, Lutz <[hidden email]> wrote:
>>> Hi all,
>>>
>>> this is yet another update for my RFR. I rebased my changes on the current tip to resolve a clash in templateInterpreterGenerator_s390.cpp – detected by Martin (mdoerr). Thank you!
>>>
>>> Furthermore, I removed src/share/vm/c1/c1_compiler.cpp from the changeset. As a result, the CRC32C intrinsics on s390 will not be available in c1. Only later, after 8175369 has been pushed, they will become available. This “trick” avoids conflicting changes to the same source line by 8175368 and 8175369.
>>>
>>> Please find the most recent webrev at: http://cr.openjdk.java.net/~lucy/webrevs/8175368.03/
>>>
>>> Regards,
>>> Lutz
>>>
>>>
>>>
>>> On 02.03.17, 17:22, "hotspot-compiler-dev on behalf of Schmidt, Lutz" <[hidden email] on behalf of [hidden email]> wrote:
>>>
>>>     Hi all,
>>>
>>>     I have fixed a tiny little bug (copy result to result register) in c1_LIRAssembler_s390.cpp and created a new webrev:
>>>       http://cr.openjdk.java.net/~lucy/webrevs/8175368.02/
>>>
>>>     I believe I still need one more review. Furthermore: any volunteer to sponsor my change?
>>>
>>>     Thanks,
>>>     Lutz
>>>
>>>
>>>
>>>     On 28.02.17, 17:07, "Doerr, Martin" <[hidden email]> wrote:
>>>
>>>         Hi Lutz,
>>>
>>>         thanks for implementing my suggestions and for the explanations. The change looks good to me.
>>>
>>>         Best regards,
>>>         Martin
>>>
>>>
>>>         -----Original Message-----
>>>         From: Schmidt, Lutz
>>>         Sent: Dienstag, 28. Februar 2017 15:28
>>>         To: Doerr, Martin <[hidden email]>; Vladimir Kozlov <[hidden email]>; [hidden email]
>>>         Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>>>
>>>         Hi Martin,
>>>
>>>         Thank you for reviewing and for your suggestions. I have reworked my changes to reflect your improvements. A new webrev can be found here: http://cr.openjdk.java.net/~lucy/webrevs/8175368.01/
>>>
>>>         The lookup tables in stubRoutines_s390.cpp have been verified by calculating CRC32C values for various byte streams with and without using the tables. Because the tables are auto-generated, it is very unlikely that just a few individual table elements are incorrect.
>>>
>>>         Best regards,
>>>         Lutz
>>>
>>>
>>>
>>>         On 28.02.17, 11:57, "Doerr, Martin" <[hidden email]> wrote:
>>>
>>>             Hi Lutz,
>>>
>>>             thank you very much for implementing it. It looks good. I especially like that you share a lot of the code between CRC32 and CRC32C.
>>>             I only have some minor suggestions:
>>>
>>>             c1_LIRGenerator_s390.cpp
>>>             load_int_as_long was implemented for stubs which expect sign (or zero) extended integer parameters.
>>>             As C2 was changed to no longer perform int to long conversions for stub calls, I think this should better be removed from C1, too.
>>>             The stubs are shared between C1 and C2 and they don't need conversions anymore, so I'd appreciate if the load_int_as_long got removed completely.
>>>
>>>             macroAssembler_s390.cpp
>>>             The function update_1word_crc32 uses separate load and xor instructions. I think some of them should get combined by using xor register,mem.
>>>
>>>             stubGenerator_s390.cpp
>>>             The comment in generate_CRC32_updateBytes should be: "must be same register as in generate..."
>>>
>>>             stubRoutines_s390.cpp
>>>             I didn't review all numbers :-), but I guess you have tested well enough?
>>>
>>>             It'd be nice if you could share some performance numbers.
>>>
>>>             Thanks and best regards,
>>>             Martin
>>>
>>>
>>>
>>>             -----Original Message-----
>>>             From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Schmidt, Lutz
>>>             Sent: Montag, 27. Februar 2017 22:33
>>>             To: Vladimir Kozlov <[hidden email]>; [hidden email]
>>>             Subject: Re: RFR (M) 8175368: [s390] Provide intrinsic implementation for CRC32C
>>>
>>>             Thank you, Vladimir, for looking at this.
>>>
>>>             Regards, Lutz
>>>
>>>
>>>             On 27.02.17, 22:30, "Vladimir Kozlov" <[hidden email]> wrote:
>>>
>>>                 Shared changes are good. I assume s390 code will be reviewed by experts.
>>>
>>>                 Thanks,
>>>                 Vladimir
>>>
>>>                 On 2/27/17 9:34 AM, Schmidt, Lutz wrote:
>>>                 > Hi all,
>>>                 >
>>>                 >
>>>                 >
>>>                 > the webrev URL below was incorrect. It displayed the correct location,
>>>                 > but pointed elsewhere. The issue is fixed now and the URLs are correct.
>>>                 >
>>>                 >
>>>                 >
>>>                 > Thank you!
>>>                 >
>>>                 > Lutz
>>>                 >
>>>                 >
>>>                 >
>>>                 > *From: *Lutz Schmidt <[hidden email]>
>>>                 > *Date: *Montag, 27. Februar 2017 um 15:55
>>>                 > *To: *"[hidden email]"
>>>                 > <[hidden email]>
>>>                 > *Subject: *RFR (M) 8175368: [s390] Provide intrinsic implementation for
>>>                 > CRC32C
>>>                 >
>>>                 >
>>>                 >
>>>                 > Hi all,
>>>                 >
>>>                 >
>>>                 >
>>>                 > may I kindly request reviews for my medium enhancement? Further down the
>>>                 > road, I would need a sponsor, too.
>>>                 >
>>>                 >
>>>                 >
>>>                 > Bug: https://bugs.openjdk.java.net/browse/JDK-8175368
>>>                 >
>>>                 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175368/
>>>                 >
>>>                 >
>>>                 >
>>>                 > Description:
>>>                 >
>>>                 > This intrinsic implementation provides some performance benefit over the
>>>                 > standard Java implementation. It uses only well-known “standard”
>>>                 > instructions, available on all supported IBM System z hardware. Being
>>>                 > very similar to the CRC32 intrinsics, it was tried to share as much code
>>>                 > as possible between these two.
>>>                 >
>>>                 >
>>>                 >
>>>                 > Performance may be expected to increase up to 2.0x, compared to a run
>>>                 > with –XX:-UseCRC32CIntrinsics. Actual speedup depends on the length of
>>>                 > the byte array fed into CRC32C and, to some extent, on the H/W
>>>                 > generation the test runs on.
>>>                 >
>>>                 >
>>>                 >
>>>                 > Thanks,
>>>                 >
>>>                 > Lutz
>>>                 >
>>>                 >
>>>                 >
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
Loading...