RE: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

RE: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Deshpande, Vivek R

Hi Vladimir

 

I have a fix for the slowdown observed on haswell due to jdk-8178811.

The solution selectively generates vzeroupper when AVX2/ AVX512 instructions have been executed and before the transition to SSE code.

I tested this fix with the test case given with bug and SPECjbb2015.

Could you please review and sponsor the patch?

Webrev:

http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/

I have also updated the bug: https://bugs.openjdk.java.net/browse/JDK-8190934

 

Regards,

Vivek

Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Nils Eliasson

Hi Vivek,

I have started regular testing of your patch. Hopefully the testing will be complete when Vladimir comes online.

Regards,

Nils


On 2017-12-14 08:22, Deshpande, Vivek R wrote:

Hi Vladimir

 

I have a fix for the slowdown observed on haswell due to jdk-8178811.

The solution selectively generates vzeroupper when AVX2/ AVX512 instructions have been executed and before the transition to SSE code.

I tested this fix with the test case given with bug and SPECjbb2015.

Could you please review and sponsor the patch?

Webrev:

http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/

I have also updated the bug: https://bugs.openjdk.java.net/browse/JDK-8190934

 

Regards,

Vivek


Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Claes Redestad

Hi,

not a review, but I've verified that this resolves a number of regressions we've seen in various microbenchmarks.

Thanks!

/Claes


On 2017-12-14 13:00, Nils Eliasson wrote:

Hi Vivek,

I have started regular testing of your patch. Hopefully the testing will be complete when Vladimir comes online.

Regards,

Nils


On 2017-12-14 08:22, Deshpande, Vivek R wrote:

Hi Vladimir

 

I have a fix for the slowdown observed on haswell due to jdk-8178811.

The solution selectively generates vzeroupper when AVX2/ AVX512 instructions have been executed and before the transition to SSE code.

I tested this fix with the test case given with bug and SPECjbb2015.

Could you please review and sponsor the patch?

Webrev:

http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/

I have also updated the bug: https://bugs.openjdk.java.net/browse/JDK-8190934

 

Regards,

Vivek



Reply | Threaded
Open this post in threaded view
|

RE: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Deshpande, Vivek R

Thanks Claes for conforming!

Also thanks to Nils for starting the testing right away.

 

Regards,

Vivek

From: hotspot-compiler-dev [mailto:[hidden email]] On Behalf Of Claes Redestad
Sent: Thursday, December 14, 2017 5:50 AM
To: [hidden email]
Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

 

Hi,

not a review, but I've verified that this resolves a number of regressions we've seen in various microbenchmarks.

Thanks!

/Claes

 

On 2017-12-14 13:00, Nils Eliasson wrote:

Hi Vivek,

I have started regular testing of your patch. Hopefully the testing will be complete when Vladimir comes online.

Regards,

Nils

 

On 2017-12-14 08:22, Deshpande, Vivek R wrote:

Hi Vladimir

 

I have a fix for the slowdown observed on haswell due to jdk-8178811.

The solution selectively generates vzeroupper when AVX2/ AVX512 instructions have been executed and before the transition to SSE code.

I tested this fix with the test case given with bug and SPECjbb2015.

Could you please review and sponsor the patch?

Webrev:

http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/

I have also updated the bug: https://bugs.openjdk.java.net/browse/JDK-8190934

 

Regards,

Vivek

 

 

Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Vladimir Kozlov
In reply to this post by Deshpande, Vivek R
Thank you, Vivek

1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is UseAVX
flag is defined only on x86. You can use #ifdef X86 for that.

2. You don't need next guard

+#if INCLUDE_CLEAR_UPPER_AVX
+  clear_upper_avx();
+#endif

if you define empty body in else case (using #ifdef X86):

+  void clear_upper_avx() {
+#ifdef X86
+    if (UseAVX >= 2) {
+      C->set_clear_upper_avx(true);
+    }
+#endif
+  }

3. Factor next checks into one static method (similar to
clear_avx_size()) to avoid next repetitive checks in .ad file:

(C->max_vector_size() > 16 || C->clear_upper_avx() == true)

4. remove unrelated change (empty line removed) in macroAssembler_x86.cpp

Thanks,
Vladimir

On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:

> Hi Vladimir
>
> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>
> The solution selectively generates vzeroupper when AVX2/ AVX512
> instructions have been executed and before the transition to SSE code.
>
> I tested this fix with the test case given with bug and SPECjbb2015.
>
> Could you please review and sponsor the patch?
>
> Webrev:
>
> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>
> I have also updated the bug:
> https://bugs.openjdk.java.net/browse/JDK-8190934
>
> Regards,
>
> Vivek
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Vladimir Kozlov
Note, my suggesting changes don't affect performance and other testing
results. They are still valid because these changes are only refactoring.

Thanks,
Vladimir

On 12/14/17 10:32 AM, Vladimir Kozlov wrote:

> Thank you, Vivek
>
> 1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is UseAVX
> flag is defined only on x86. You can use #ifdef X86 for that.
>
> 2. You don't need next guard
>
> +#if INCLUDE_CLEAR_UPPER_AVX
> +  clear_upper_avx();
> +#endif
>
> if you define empty body in else case (using #ifdef X86):
>
> +  void clear_upper_avx() {
> +#ifdef X86
> +    if (UseAVX >= 2) {
> +      C->set_clear_upper_avx(true);
> +    }
> +#endif
> +  }
>
> 3. Factor next checks into one static method (similar to
> clear_avx_size()) to avoid next repetitive checks in .ad file:
>
> (C->max_vector_size() > 16 || C->clear_upper_avx() == true)
>
> 4. remove unrelated change (empty line removed) in macroAssembler_x86.cpp
>
> Thanks,
> Vladimir
>
> On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:
>> Hi Vladimir
>>
>> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>>
>> The solution selectively generates vzeroupper when AVX2/ AVX512
>> instructions have been executed and before the transition to SSE code.
>>
>> I tested this fix with the test case given with bug and SPECjbb2015.
>>
>> Could you please review and sponsor the patch?
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>>
>> I have also updated the bug:
>> https://bugs.openjdk.java.net/browse/JDK-8190934
>>
>> Regards,
>>
>> Vivek
>>
Reply | Threaded
Open this post in threaded view
|

RE: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Deshpande, Vivek R
Hi Vladimir

I have updated webrev with your suggested changes.
Would you please review it and sponsor the changes.
The webrev is here:
http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.01/
I have also updated the bug.

Regards,
Vivek
-----Original Message-----
From: Vladimir Kozlov [mailto:[hidden email]]
Sent: Thursday, December 14, 2017 10:44 AM
To: Deshpande, Vivek R <[hidden email]>; [hidden email]
Cc: Viswanathan, Sandhya <[hidden email]>
Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Note, my suggesting changes don't affect performance and other testing results. They are still valid because these changes are only refactoring.

Thanks,
Vladimir

On 12/14/17 10:32 AM, Vladimir Kozlov wrote:

> Thank you, Vivek
>
> 1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is
> UseAVX flag is defined only on x86. You can use #ifdef X86 for that.
>
> 2. You don't need next guard
>
> +#if INCLUDE_CLEAR_UPPER_AVX
> +  clear_upper_avx();
> +#endif
>
> if you define empty body in else case (using #ifdef X86):
>
> +  void clear_upper_avx() {
> +#ifdef X86
> +    if (UseAVX >= 2) {
> +      C->set_clear_upper_avx(true);
> +    }
> +#endif
> +  }
>
> 3. Factor next checks into one static method (similar to
> clear_avx_size()) to avoid next repetitive checks in .ad file:
>
> (C->max_vector_size() > 16 || C->clear_upper_avx() == true)
>
> 4. remove unrelated change (empty line removed) in
> macroAssembler_x86.cpp
>
> Thanks,
> Vladimir
>
> On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:
>> Hi Vladimir
>>
>> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>>
>> The solution selectively generates vzeroupper when AVX2/ AVX512
>> instructions have been executed and before the transition to SSE code.
>>
>> I tested this fix with the test case given with bug and SPECjbb2015.
>>
>> Could you please review and sponsor the patch?
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>>
>> I have also updated the bug:
>> https://bugs.openjdk.java.net/browse/JDK-8190934
>>
>> Regards,
>>
>> Vivek
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Vladimir Kozlov
Lets do one more iteration ;)

Pass Compile* C argument to generate_vzeroupper() in .ad file because
Compile::current() is expensive call and you have C already in 2 cases.

There is one extra space before { in compile.hpp for set_clear_upper_avx().

Please, update, webrev. I will start testing meanwhile.

Thanks,
Vladimir

On 12/14/17 3:02 PM, Deshpande, Vivek R wrote:

> Hi Vladimir
>
> I have updated webrev with your suggested changes.
> Would you please review it and sponsor the changes.
> The webrev is here:
> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.01/
> I have also updated the bug.
>
> Regards,
> Vivek
> -----Original Message-----
> From: Vladimir Kozlov [mailto:[hidden email]]
> Sent: Thursday, December 14, 2017 10:44 AM
> To: Deshpande, Vivek R <[hidden email]>; [hidden email]
> Cc: Viswanathan, Sandhya <[hidden email]>
> Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811
>
> Note, my suggesting changes don't affect performance and other testing results. They are still valid because these changes are only refactoring.
>
> Thanks,
> Vladimir
>
> On 12/14/17 10:32 AM, Vladimir Kozlov wrote:
>> Thank you, Vivek
>>
>> 1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is
>> UseAVX flag is defined only on x86. You can use #ifdef X86 for that.
>>
>> 2. You don't need next guard
>>
>> +#if INCLUDE_CLEAR_UPPER_AVX
>> +  clear_upper_avx();
>> +#endif
>>
>> if you define empty body in else case (using #ifdef X86):
>>
>> +  void clear_upper_avx() {
>> +#ifdef X86
>> +    if (UseAVX >= 2) {
>> +      C->set_clear_upper_avx(true);
>> +    }
>> +#endif
>> +  }
>>
>> 3. Factor next checks into one static method (similar to
>> clear_avx_size()) to avoid next repetitive checks in .ad file:
>>
>> (C->max_vector_size() > 16 || C->clear_upper_avx() == true)
>>
>> 4. remove unrelated change (empty line removed) in
>> macroAssembler_x86.cpp
>>
>> Thanks,
>> Vladimir
>>
>> On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:
>>> Hi Vladimir
>>>
>>> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>>>
>>> The solution selectively generates vzeroupper when AVX2/ AVX512
>>> instructions have been executed and before the transition to SSE code.
>>>
>>> I tested this fix with the test case given with bug and SPECjbb2015.
>>>
>>> Could you please review and sponsor the patch?
>>>
>>> Webrev:
>>>
>>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>>>
>>> I have also updated the bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8190934
>>>
>>> Regards,
>>>
>>> Vivek
>>>
Reply | Threaded
Open this post in threaded view
|

RE: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Deshpande, Vivek R
Hi Vladimir

I have updated the patch with your suggested changes.
Please find it here:
http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.02/
Please let me know what you think of the changes.

Regards,
Vivek

-----Original Message-----
From: Vladimir Kozlov [mailto:[hidden email]]
Sent: Thursday, December 14, 2017 4:11 PM
To: Deshpande, Vivek R <[hidden email]>; [hidden email]
Cc: Viswanathan, Sandhya <[hidden email]>
Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Lets do one more iteration ;)

Pass Compile* C argument to generate_vzeroupper() in .ad file because
Compile::current() is expensive call and you have C already in 2 cases.

There is one extra space before { in compile.hpp for set_clear_upper_avx().

Please, update, webrev. I will start testing meanwhile.

Thanks,
Vladimir

On 12/14/17 3:02 PM, Deshpande, Vivek R wrote:

> Hi Vladimir
>
> I have updated webrev with your suggested changes.
> Would you please review it and sponsor the changes.
> The webrev is here:
> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.01/
> I have also updated the bug.
>
> Regards,
> Vivek
> -----Original Message-----
> From: Vladimir Kozlov [mailto:[hidden email]]
> Sent: Thursday, December 14, 2017 10:44 AM
> To: Deshpande, Vivek R <[hidden email]>;
> [hidden email]
> Cc: Viswanathan, Sandhya <[hidden email]>
> Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to
> JDK-8178811
>
> Note, my suggesting changes don't affect performance and other testing results. They are still valid because these changes are only refactoring.
>
> Thanks,
> Vladimir
>
> On 12/14/17 10:32 AM, Vladimir Kozlov wrote:
>> Thank you, Vivek
>>
>> 1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is
>> UseAVX flag is defined only on x86. You can use #ifdef X86 for that.
>>
>> 2. You don't need next guard
>>
>> +#if INCLUDE_CLEAR_UPPER_AVX
>> +  clear_upper_avx();
>> +#endif
>>
>> if you define empty body in else case (using #ifdef X86):
>>
>> +  void clear_upper_avx() {
>> +#ifdef X86
>> +    if (UseAVX >= 2) {
>> +      C->set_clear_upper_avx(true);
>> +    }
>> +#endif
>> +  }
>>
>> 3. Factor next checks into one static method (similar to
>> clear_avx_size()) to avoid next repetitive checks in .ad file:
>>
>> (C->max_vector_size() > 16 || C->clear_upper_avx() == true)
>>
>> 4. remove unrelated change (empty line removed) in
>> macroAssembler_x86.cpp
>>
>> Thanks,
>> Vladimir
>>
>> On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:
>>> Hi Vladimir
>>>
>>> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>>>
>>> The solution selectively generates vzeroupper when AVX2/ AVX512
>>> instructions have been executed and before the transition to SSE code.
>>>
>>> I tested this fix with the test case given with bug and SPECjbb2015.
>>>
>>> Could you please review and sponsor the patch?
>>>
>>> Webrev:
>>>
>>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>>>
>>> I have also updated the bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8190934
>>>
>>> Regards,
>>>
>>> Vivek
>>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811

Vladimir Kozlov
Good.

Testing passed and it is ready for push. But since we forked jdk10
already I need to find procedure where to push. You will get notification.

Thanks,
Vladimir

On 12/15/17 12:25 AM, Deshpande, Vivek R wrote:

> Hi Vladimir
>
> I have updated the patch with your suggested changes.
> Please find it here:
> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.02/
> Please let me know what you think of the changes.
>
> Regards,
> Vivek
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:[hidden email]]
> Sent: Thursday, December 14, 2017 4:11 PM
> To: Deshpande, Vivek R <[hidden email]>; [hidden email]
> Cc: Viswanathan, Sandhya <[hidden email]>
> Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to JDK-8178811
>
> Lets do one more iteration ;)
>
> Pass Compile* C argument to generate_vzeroupper() in .ad file because
> Compile::current() is expensive call and you have C already in 2 cases.
>
> There is one extra space before { in compile.hpp for set_clear_upper_avx().
>
> Please, update, webrev. I will start testing meanwhile.
>
> Thanks,
> Vladimir
>
> On 12/14/17 3:02 PM, Deshpande, Vivek R wrote:
>> Hi Vladimir
>>
>> I have updated webrev with your suggested changes.
>> Would you please review it and sponsor the changes.
>> The webrev is here:
>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.01/
>> I have also updated the bug.
>>
>> Regards,
>> Vivek
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:[hidden email]]
>> Sent: Thursday, December 14, 2017 10:44 AM
>> To: Deshpande, Vivek R <[hidden email]>;
>> [hidden email]
>> Cc: Viswanathan, Sandhya <[hidden email]>
>> Subject: Re: RFR(S): x86:8190934: Regressions on Haswell Xeon due to
>> JDK-8178811
>>
>> Note, my suggesting changes don't affect performance and other testing results. They are still valid because these changes are only refactoring.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/14/17 10:32 AM, Vladimir Kozlov wrote:
>>> Thank you, Vivek
>>>
>>> 1. You don't need new INCLUDE_CLEAR_UPPER_AVX. The main issue is
>>> UseAVX flag is defined only on x86. You can use #ifdef X86 for that.
>>>
>>> 2. You don't need next guard
>>>
>>> +#if INCLUDE_CLEAR_UPPER_AVX
>>> +  clear_upper_avx();
>>> +#endif
>>>
>>> if you define empty body in else case (using #ifdef X86):
>>>
>>> +  void clear_upper_avx() {
>>> +#ifdef X86
>>> +    if (UseAVX >= 2) {
>>> +      C->set_clear_upper_avx(true);
>>> +    }
>>> +#endif
>>> +  }
>>>
>>> 3. Factor next checks into one static method (similar to
>>> clear_avx_size()) to avoid next repetitive checks in .ad file:
>>>
>>> (C->max_vector_size() > 16 || C->clear_upper_avx() == true)
>>>
>>> 4. remove unrelated change (empty line removed) in
>>> macroAssembler_x86.cpp
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 12/13/17 11:22 PM, Deshpande, Vivek R wrote:
>>>> Hi Vladimir
>>>>
>>>> I have a fix for the slowdown observed on haswell due to jdk-8178811.
>>>>
>>>> The solution selectively generates vzeroupper when AVX2/ AVX512
>>>> instructions have been executed and before the transition to SSE code.
>>>>
>>>> I tested this fix with the test case given with bug and SPECjbb2015.
>>>>
>>>> Could you please review and sponsor the patch?
>>>>
>>>> Webrev:
>>>>
>>>> http://cr.openjdk.java.net/~vdeshpande/8190934/webrev.00/
>>>>
>>>> I have also updated the bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8190934
>>>>
>>>> Regards,
>>>>
>>>> Vivek
>>>>