[10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Sergey Bylokhov
Hello,
Please review the fix for jdk10.

Bug: https://bugs.openjdk.java.net/browse/JDK-8185093
Webrev can be found at: http://cr.openjdk.java.net/~serb/8185093/webrev.00
jmh test: http://cr.openjdk.java.net/~serb/8185093/LocalGEproblem.java

While working on some other bug I found a performance issue in our
java2d pipeline in multi-threaded environment. We have a "hot spot" in
the "GraphicsEnvironment.getLocalGraphicsEnvironment()". This method is
executed every time the graphics object is created, for example
BufferedImage.java:

     public Graphics2D createGraphics() {
         GraphicsEnvironment env =
             GraphicsEnvironment.getLocalGraphicsEnvironment();
         return env.createGraphics(this);
     }

So even if the application will draw to a different Images it will be
blocked for some time in this method.
I created a jmh test case which shows that implementing this method via
holder will speedup the next code:
       Graphics2D graphics = bi.createGraphics();
       graphics.drawLine(0, 0, 100, 100);

  4 Threads:
  - Before the fix:  8922 ops/ms
  - After the fix :  9442 ops/ms
  8 Threads:
  - Before the fix:  4511 ops/ms
  - After the fix : 11899 ops/ms

The main issue which I would like to clarify is it possible to remove
synchronize keyword from the getLocalGraphicsEnvironment(), possibly
with updating a specification? We have similar issues in other parts of
code which I would like to update later after this one.

--
Best regards, Sergey.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Laurent Bourgès
Sergey,

That's very interesting: I optimized the renderer to be perfectly scalable (thread local), but I agree there are still several places with static fields (single thread optimization) or synchronized methods...

The main issue which I would like to clarify is it possible to remove synchronize keyword from the getLocalGraphicsEnvironment(), possibly with updating a specification? We have similar issues in other parts of code which I would like to update later after this one.

Could you give more details on your findings ?

I recently figured out that TexturePaint suffers such a problem...

Regards,
Laurent
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5
In reply to this post by Sergey Bylokhov
This same thing is done in other places in a simpler way without creating an inner class simply by having createGE() do
the assignment and test for null, as in:

before:

     public static synchronized GraphicsEnvironment getLocalGraphicsEnvironment() {
         if (localEnv == null) {
             localEnv = createGE();
         }

         return localEnv;
     }

     private static GraphicsEnvironment createGE() {
         ...
         return ge;
     }

=> after:

     public static GraphicsEnvironment getLocalGraphicsEnvironment() {
         if (localEnv == null) {
             createGE();
         }

         return localEnv;
     }

     private static synchronized GraphicsEnvironment createGE() {
         if (localEnv != null) return;
         ...
         localEnv = ge;
     }

OR => alternate after:

     public static synchronized GraphicsEnvironment getLocalGraphicsEnvironment() {
         if (localEnv == null) {
             synchronized (GraphicsEnvironment.class) {
                 if (localEnv == null) {
                     localEnv = createGE();
                 }
             }
         }

         return localEnv;
     }

     // And no changes to createGE()

Note that there is a test for null both in getLGE() and also again in createGE() because the second one is inside a
synchronized block and required to prevent multiple instances.  The first one in getLGE() avoids having to synchronize
in the first place for the common case.

                        ...jim

On 7/24/17 5:09 PM, Sergey Bylokhov wrote:

> Hello,
> Please review the fix for jdk10.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185093
> Webrev can be found at: http://cr.openjdk.java.net/~serb/8185093/webrev.00
> jmh test: http://cr.openjdk.java.net/~serb/8185093/LocalGEproblem.java
>
> While working on some other bug I found a performance issue in our java2d pipeline in multi-threaded environment. We
> have a "hot spot" in the "GraphicsEnvironment.getLocalGraphicsEnvironment()". This method is executed every time the
> graphics object is created, for example BufferedImage.java:
>
>      public Graphics2D createGraphics() {
>          GraphicsEnvironment env =
>              GraphicsEnvironment.getLocalGraphicsEnvironment();
>          return env.createGraphics(this);
>      }
>
> So even if the application will draw to a different Images it will be blocked for some time in this method.
> I created a jmh test case which shows that implementing this method via holder will speedup the next code:
>        Graphics2D graphics = bi.createGraphics();
>        graphics.drawLine(0, 0, 100, 100);
>
>   4 Threads:
>   - Before the fix:  8922 ops/ms
>   - After the fix :  9442 ops/ms
>   8 Threads:
>   - Before the fix:  4511 ops/ms
>   - After the fix : 11899 ops/ms
>
> The main issue which I would like to clarify is it possible to remove synchronize keyword from the
> getLocalGraphicsEnvironment(), possibly with updating a specification? We have similar issues in other parts of code
> which I would like to update later after this one.
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Sergey Bylokhov
On 25.07.2017 2:48, Jim Graham wrote:
> This same thing is done in other places in a simpler way without
> creating an inner class simply by having createGE() do the assignment
> and test for null, as in:

I guess a solution which you mentions is a variant of "Double-checked
locking".

> => after:
>
>      public static GraphicsEnvironment getLocalGraphicsEnvironment() {
>          if (c== null) {
>              createGE();
>          }
>
>          return localEnv;
>      }
>
>      private static synchronized GraphicsEnvironment createGE() {
>          if (localEnv != null) return;
>          ...
>          localEnv = ge;
>      }
>

In this variant the field should be volatile, and this will introduce
some synchronization. Otherwise it will be possible to read non-null
value in localEnv while the constructor of GraphicsEnvironment was not
completed on some other thread(but the field was assigned already).


> OR => alternate after:
>
>      public static synchronized GraphicsEnvironment
> getLocalGraphicsEnvironment() {
>          if (localEnv == null) {
>              synchronized (GraphicsEnvironment.class) {
>                  if (localEnv == null) {
>                      localEnv = createGE();
>                  }
>              }
>          }
>
>          return localEnv;
>      }
>
>      // And no changes to createGE()
>

In this variant the field also should be volatile. I guess it was a typo
that the getLocalGraphicsEnvironment still "synchronized".

> Note that there is a test for null both in getLGE() and also again in
> createGE() because the second one is inside a synchronized block and
> required to prevent multiple instances.  The first one in getLGE()
> avoids having to synchronize in the first place for the common case.

Note that in case of holder - after the first call there was not any
synchronization(like in case of "synchronized" and "volatile").

Here some additional information which I used:
https://shipilev.net/blog/2014/safe-public-construction/

>
>              ...jim
>
> On 7/24/17 5:09 PM, Sergey Bylokhov wrote:
>> Hello,
>> Please review the fix for jdk10.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185093
>> Webrev can be found at:
>> http://cr.openjdk.java.net/~serb/8185093/webrev.00
>> jmh test: http://cr.openjdk.java.net/~serb/8185093/LocalGEproblem.java
>>
>> While working on some other bug I found a performance issue in our
>> java2d pipeline in multi-threaded environment. We have a "hot spot" in
>> the "GraphicsEnvironment.getLocalGraphicsEnvironment()". This method
>> is executed every time the graphics object is created, for example
>> BufferedImage.java:
>>
>>      public Graphics2D createGraphics() {
>>          GraphicsEnvironment env =
>>              GraphicsEnvironment.getLocalGraphicsEnvironment();
>>          return env.createGraphics(this);
>>      }
>>
>> So even if the application will draw to a different Images it will be
>> blocked for some time in this method.
>> I created a jmh test case which shows that implementing this method
>> via holder will speedup the next code:
>>        Graphics2D graphics = bi.createGraphics();
>>        graphics.drawLine(0, 0, 100, 100);
>>
>>   4 Threads:
>>   - Before the fix:  8922 ops/ms
>>   - After the fix :  9442 ops/ms
>>   8 Threads:
>>   - Before the fix:  4511 ops/ms
>>   - After the fix : 11899 ops/ms
>>
>> The main issue which I would like to clarify is it possible to remove
>> synchronize keyword from the getLocalGraphicsEnvironment(), possibly
>> with updating a specification? We have similar issues in other parts
>> of code which I would like to update later after this one.
>>


--
Best regards, Sergey.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5
Hi Sergey,

On 7/25/17 12:53 PM, Sergey Bylokhov wrote:
> In this variant the field should be volatile, and this will introduce some synchronization. Otherwise it will be
> possible to read non-null value in localEnv while the constructor of GraphicsEnvironment was not completed on some other
> thread(but the field was assigned already).

Even though the assignment is the last statement in the method?

In looking through the article that you sent out (only briefly) it looks like the consideration is that the
initialization method run in one thread could reorder the instructions at run time so that the field is assigned before
the initialization steps are done, even if the code is written with a different specific ordering.

I'm guessing that the Java Memory Model ensures that membars are placed to protect static field initialization from
similar dangers?

If that is the case then I think we have a few places where we do this "manual conditional synchronization" that should
probably be investigated.  If I remember correctly, though, they may use the following paradigm which might not suffer
from that issue:

Does it help if a local variable is used, as in:

GE env = this.localEnv
if (env == null) {
     synchronized (class) {
         if (this.localEnv == null) {
             localEnv = createGE();
         }
         env = this.localEnv;
     }
}
return env;

This ensures that the second read from the field is done in the synchronized block for the case that it was originally
found to be null, though this might not help if the line "localEnv = createGE()" can reorder code from inside the call
to createGE() to code outside of it.  However those dangling reordered stores should be resolved before it leaves the
synchronized block, no?

In any case, I can see that the code used in the webrev will eventually have no dead code at all in the final compiled
version of the method whereas this version would minimally at least have a null check that must be executed even though
it would be vestigial once the initialization was done...

+1 for the inner class version - it's the most straightforward way to do this optimally...

                                ...jim
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Sergey Bylokhov
On 25.07.2017 15:03, Jim Graham wrote:
> On 7/25/17 12:53 PM, Sergey Bylokhov wrote:
>> In this variant the field should be volatile, and this will introduce
>> some synchronization. Otherwise it will be possible to read non-null
>> value in localEnv while the constructor of GraphicsEnvironment was not
>> completed on some other thread(but the field was assigned already > > Even though the assignment is the last statement in the method?

The assignment include these operations:
  - Allocate memory
  - Initialize the object
  - Assign the object to the field.

These operation may be reordered, so some other thread can see non-null
reference in the field while initialization was not complete.

>
> In looking through the article that you sent out (only briefly) it looks
> like the consideration is that the initialization method run in one
> thread could reorder the instructions at run time so that the field is
> assigned before the initialization steps are done, even if the code is
> written with a different specific ordering.
>
> I'm guessing that the Java Memory Model ensures that membars are placed
> to protect static field initialization from similar dangers?

The key feature here is initialization of the class.
The next line will be blocked while the LocalGE class will not be
initialized:
     return LocalGE.INSTANCE;
And initialization will start only when this code will run for the first
time.

>
> If that is the case then I think we have a few places where we do this
> "manual conditional synchronization" that should probably be
> investigated.  If I remember correctly, though, they may use the
> following paradigm which might not suffer from that issue:
>
> Does it help if a local variable is used, as in:
>
> GE env = this.localEnv
> if (env == null) {
>      synchronized (class) {
>          if (this.localEnv == null) {
>              localEnv = createGE();
>          }
>          env = this.localEnv;
>      }
> }
> return env;
>
> This ensures that the second read from the field is done in the
> synchronized block for the case that it was originally found to be null,
> though this might not help if the line "localEnv = createGE()" can
> reorder code from inside the call to createGE() to code outside of it.  
> However those dangling reordered stores should be resolved before it
> leaves the synchronized block, no?

I think that it is possible that one thread will do:
  - create an object
  - assign the object to the localEnv.
  - Initialize the object.
And the second thread will do:
  - Read non-null value from the this.localEnv to the env
  - Check env to null
  - Return non-initialized object.

> In any case, I can see that the code used in the webrev will eventually
> have no dead code at all in the final compiled version of the method
> whereas this version would minimally at least have a null check that
> must be executed even though it would be vestigial once the
> initialization was done...
>
> +1 for the inner class version - it's the most straightforward way to do
> this optimally...
--
Best regards, Sergey.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5


On 7/25/17 3:37 PM, Sergey Bylokhov wrote:

>> If that is the case then I think we have a few places where we do this "manual conditional synchronization" that
>> should probably be investigated.  If I remember correctly, though, they may use the following paradigm which might not
>> suffer from that issue:
>>
>> Does it help if a local variable is used, as in:
>>
>> GE env = this.localEnv
>> if (env == null) {
>>      synchronized (class) {
>>          if (this.localEnv == null) {
>>              localEnv = createGE();
>>          }
>>          env = this.localEnv;
>>      }
>> }
>> return env;
>>
>> This ensures that the second read from the field is done in the synchronized block for the case that it was originally
>> found to be null, though this might not help if the line "localEnv = createGE()" can reorder code from inside the call
>> to createGE() to code outside of it. However those dangling reordered stores should be resolved before it leaves the
>> synchronized block, no?
>
> I think that it is possible that one thread will do:
>   - create an object
>   - assign the object to the localEnv.
>   - Initialize the object.
> And the second thread will do:
>   - Read non-null value from the this.localEnv to the env
>   - Check env to null
>   - Return non-initialized object.

I can see that.  I suppose one thing that keeps this from causing problems in AWT is that we tend to be single-threaded
for most applications.  I suppose that one could get even more sophisticated to prevent that case, but in the end, as I
said, I can see the simplicity of using the inner class now in getting all of this right.

I'll do a search and see if there are any cases of the above paradigm left...

                        ...jim
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5
On 7/26/17 12:34 PM, Jim Graham wrote:
> I'll do a search and see if there are any cases of the above paradigm left...

I did a basic grep over the Java sources in java.desktop looking primarily for synchronizations on a class, ignoring
synchronized methods (both instance and static) and synchronizations on an object other than a class (such as "this" or
"getTreeLock()").

I didn't find any cases of initializing a singleton without all of the tests inside the synch block (such as Toolkit
which does do the lazy initialization, but it has all of the lazy tests inside the synch block so it can't have
unordered initialization, but it might benefit from an inner class treatment for performance).  Either I was imagining
it, or we've gradually cleaned up a lot of those in the past.

I did find the following related issues:

TrayIcon.setPopupMenu()
  - synchronizes on TrayIcon.class, but then relies on tests from outside the synch block for correctness I believe -
the "already set" test should probably be duplicated inside the synchronization

I found a number of cases of "synchronized (Foo.class)" inside Foo.java, and some of them were the first statement in
the method - which begs the question of why the method wasn't just made synchronized in the first place.
Window.getWindows() and Window.removeWindowFromList() for examples...

                                ...jim
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Philip Race
In reply to this post by Jim Graham-5


On 07/24/2017 05:09 PM, Sergey Bylokhov wrote:
>
>
> The main issue which I would like to clarify is it possible to remove
> synchronize keyword from the getLocalGraphicsEnvironment(), possibly
> with updating a specification? We have similar issues in other parts
> of code which I would like to update later after this one.


Yes, I do think we can remove synchronized but we should do a CSR (CCC)

On 07/25/2017 03:03 PM, Jim Graham wrote:
>
> +1 for the inner class version - it's the most straightforward way to
> do this optimally...

Same from me.

-phil.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Sergey Bylokhov
In reply to this post by Sergey Bylokhov
DRAFT version of CSR:
https://bugs.openjdk.java.net/browse/JDK-8185542

----- [hidden email] wrote:

> On 07/24/2017 05:09 PM, Sergey Bylokhov wrote:
> >
> >
> > The main issue which I would like to clarify is it possible to
> remove
> > synchronize keyword from the getLocalGraphicsEnvironment(), possibly
>
> > with updating a specification? We have similar issues in other parts
>
> > of code which I would like to update later after this one.
>
>
> Yes, I do think we can remove synchronized but we should do a CSR
> (CCC)
>
> On 07/25/2017 03:03 PM, Jim Graham wrote:
> >
> > +1 for the inner class version - it's the most straightforward way
> to
> > do this optimally...
>
> Same from me.
>
> -phil.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5
I adjusted one of the statements to be a little more straightforward and set the compatibility risk to minimal.  If Phil
thinks the risk can be set to "None", that was my alternate, but I stuck with "minimal" to be safe...

                        ...jim

On 7/28/17 5:07 PM, Sergey Bylokhov wrote:

> DRAFT version of CSR:
> https://bugs.openjdk.java.net/browse/JDK-8185542
>
> ----- [hidden email] wrote:
>
>> On 07/24/2017 05:09 PM, Sergey Bylokhov wrote:
>>>
>>>
>>> The main issue which I would like to clarify is it possible to
>> remove
>>> synchronize keyword from the getLocalGraphicsEnvironment(), possibly
>>
>>> with updating a specification? We have similar issues in other parts
>>
>>> of code which I would like to update later after this one.
>>
>>
>> Yes, I do think we can remove synchronized but we should do a CSR
>> (CCC)
>>
>> On 07/25/2017 03:03 PM, Jim Graham wrote:
>>>
>>> +1 for the inner class version - it's the most straightforward way
>> to
>>> do this optimally...
>>
>> Same from me.
>>
>> -phil.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Jim Graham-5
To be clear, I couldn't think of any way in which removing a synchronized keyword would create a compatibility risk,
though I suppose it potentially could cause a program to proceed faster and thus tickle a race condition - that seems
rather remote and the race condition would be a "pre-existing condition"...

                        ...jim

On 7/31/17 1:32 PM, Jim Graham wrote:

> I adjusted one of the statements to be a little more straightforward and set the compatibility risk to minimal.  If Phil
> thinks the risk can be set to "None", that was my alternate, but I stuck with "minimal" to be safe...
>
>              ...jim
>
> On 7/28/17 5:07 PM, Sergey Bylokhov wrote:
>> DRAFT version of CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8185542
>>
>> ----- [hidden email] wrote:
>>
>>> On 07/24/2017 05:09 PM, Sergey Bylokhov wrote:
>>>>
>>>>
>>>> The main issue which I would like to clarify is it possible to
>>> remove
>>>> synchronize keyword from the getLocalGraphicsEnvironment(), possibly
>>>
>>>> with updating a specification? We have similar issues in other parts
>>>
>>>> of code which I would like to update later after this one.
>>>
>>>
>>> Yes, I do think we can remove synchronized but we should do a CSR
>>> (CCC)
>>>
>>> On 07/25/2017 03:03 PM, Jim Graham wrote:
>>>>
>>>> +1 for the inner class version - it's the most straightforward way
>>> to
>>>> do this optimally...
>>>
>>> Same from me.
>>>
>>> -phil.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [10] Review Request: 8185093 Expensive multi-core choke point when any graphics objects are created

Philip Race
In reply to this post by Jim Graham-5
I think minimal or low .. I could never say "none".

I've added my "review" as well.

-phil

On 7/31/17, 1:32 PM, Jim Graham wrote:

> I adjusted one of the statements to be a little more straightforward
> and set the compatibility risk to minimal.  If Phil thinks the risk
> can be set to "None", that was my alternate, but I stuck with
> "minimal" to be safe...
>
>             ...jim
>
> On 7/28/17 5:07 PM, Sergey Bylokhov wrote:
>> DRAFT version of CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8185542
>>
>> ----- [hidden email] wrote:
>>
>>> On 07/24/2017 05:09 PM, Sergey Bylokhov wrote:
>>>>
>>>>
>>>> The main issue which I would like to clarify is it possible to
>>> remove
>>>> synchronize keyword from the getLocalGraphicsEnvironment(), possibly
>>>
>>>> with updating a specification? We have similar issues in other parts
>>>
>>>> of code which I would like to update later after this one.
>>>
>>>
>>> Yes, I do think we can remove synchronized but we should do a CSR
>>> (CCC)
>>>
>>> On 07/25/2017 03:03 PM, Jim Graham wrote:
>>>>
>>>> +1 for the inner class version - it's the most straightforward way
>>> to
>>>> do this optimally...
>>>
>>> Same from me.
>>>
>>> -phil.
Loading...