RFR (M) 8186042: Optimize OopMapCache lookup

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR (M) 8186042: Optimize OopMapCache lookup

coleen.phillimore
Summary: Use lock free access to oopMapCache
Contributed-by: [hidden email], [hidden email]

The OopMapCache::lookup() function took out a mutex to protect access
between the GC threads that are running concurrently.  See bug for more
info.   The function lookup() is run by multiple GC threads
concurrently.  If there's a collision in the hashtable, this uses atomic
cmpxchg to add the entry to a list to be cleaned up after the safepoint
is over.   GC isn't doing lookup at that point.

This change is contributed by Frederic Parain, with some cleanup and
logging from me.

open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8186042

Tested with RBT equivalent of nightly on linux x64.  Also ran dacapo
with -Xint -Xlog:interpreter+oopmap=debug to verify.  This change also
removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration
however) in favor of new logging.  A linked CSR request is pending.

Thanks,
Coleen

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

David Holmes
Hi Coleen,

This looks good to me - Reviewed. A couple of minor comments below.

On 12/08/2017 1:46 AM, [hidden email] wrote:

> Summary: Use lock free access to oopMapCache
> Contributed-by: [hidden email], [hidden email]
>
> The OopMapCache::lookup() function took out a mutex to protect access
> between the GC threads that are running concurrently.  See bug for more
> info.   The function lookup() is run by multiple GC threads
> concurrently.  If there's a collision in the hashtable, this uses atomic
> cmpxchg to add the entry to a list to be cleaned up after the safepoint
> is over.   GC isn't doing lookup at that point.
>
> This change is contributed by Frederic Parain, with some cleanup and
> logging from me.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8186042

src/share/vm/interpreter/oopMapCache.cpp

The logging here doesn't follow our usual pattern of only making the
logging calls if the appropriate logging level is enabled. ??

src/share/vm/interpreter/oopMapCache.hpp

Nit: OopMapCacheEntry* volatile* _array;

Space after volatile please as it refers to the OMCE* not the *_array.

> Tested with RBT equivalent of nightly on linux x64.  Also ran dacapo
> with -Xint -Xlog:interpreter+oopmap=debug to verify.  This change also

Do you have some performance improvement numbers for this optimization?
Or is this a pre-emptive strike against a potential locking bottleneck?

> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration
> however) in favor of new logging.  A linked CSR request is pending.

Ok.

Thanks,
David

> Thanks,
> Coleen
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

coleen.phillimore


On 8/13/17 8:29 PM, David Holmes wrote:

> Hi Coleen,
>
> This looks good to me - Reviewed. A couple of minor comments below.
>
> On 12/08/2017 1:46 AM, [hidden email] wrote:
>> Summary: Use lock free access to oopMapCache
>> Contributed-by: [hidden email], [hidden email]
>>
>> The OopMapCache::lookup() function took out a mutex to protect access
>> between the GC threads that are running concurrently. See bug for
>> more info.   The function lookup() is run by multiple GC threads
>> concurrently.  If there's a collision in the hashtable, this uses
>> atomic cmpxchg to add the entry to a list to be cleaned up after the
>> safepoint is over.   GC isn't doing lookup at that point.
>>
>> This change is contributed by Frederic Parain, with some cleanup and
>> logging from me.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042
>
> src/share/vm/interpreter/oopMapCache.cpp
>
> The logging here doesn't follow our usual pattern of only making the
> logging calls if the appropriate logging level is enabled. ??

I'm not sure what you're talking about.

I think this is the new pattern of doing logging:

   Log(interpreter, oopmap) logv;
   LogStream st(logv.trace());

Otherwise

      log_debug(interpreter, oopmap)("*** collision in oopmap cache -
flushing item ***");

checks for whether debug level logging is on.   You only need this:

   if (log_is_enabled(Debug, interpreter, oopmap)) {
     static int count = 0;
     ResourceMark rm;
     log_debug(interpreter, oopmap)
           ("%d - Computing oopmap at bci %d for %s at hash %d",
++count, bci,
            method()->name_and_sig_as_C_string(), probe);
   }

If you're creating a ResourceMark.

Unless logging changed (which I think it did).
>
> src/share/vm/interpreter/oopMapCache.hpp
>
> Nit: OopMapCacheEntry* volatile* _array;
>
> Space after volatile please as it refers to the OMCE* not the *_array.

Like this?

   OopMapCacheEntry* volatile * _array;

Per coding standard, we don't put the '*' without * in front of the name
of the variable, only the type, like
    OopMapCacheEntry* blah;
vs
    OopMapCacheEntry *blah;
>
>> Tested with RBT equivalent of nightly on linux x64.  Also ran dacapo
>> with -Xint -Xlog:interpreter+oopmap=debug to verify.  This change also
>
> Do you have some performance improvement numbers for this
> optimization? Or is this a pre-emptive strike against a potential
> locking bottleneck?

Erik Osterlund ran benchmarks but I don't have access to them and he is
on vacation.   Heresay is that the bottleneck observed for this lock is
gone.
>
>> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration
>> however) in favor of new logging. A linked CSR request is pending.
>
> Ok.

Thanks for commenting on the CSR and the review.

Coleen
>
>
> Thanks,
> David
>
>> Thanks,
>> Coleen
>>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

David Holmes
On 14/08/2017 10:17 PM, [hidden email] wrote:

> On 8/13/17 8:29 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> This looks good to me - Reviewed. A couple of minor comments below.
>>
>> On 12/08/2017 1:46 AM, [hidden email] wrote:
>>> Summary: Use lock free access to oopMapCache
>>> Contributed-by: [hidden email], [hidden email]
>>>
>>> The OopMapCache::lookup() function took out a mutex to protect access
>>> between the GC threads that are running concurrently. See bug for
>>> more info.   The function lookup() is run by multiple GC threads
>>> concurrently.  If there's a collision in the hashtable, this uses
>>> atomic cmpxchg to add the entry to a list to be cleaned up after the
>>> safepoint is over.   GC isn't doing lookup at that point.
>>>
>>> This change is contributed by Frederic Parain, with some cleanup and
>>> logging from me.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042
>>
>> src/share/vm/interpreter/oopMapCache.cpp
>>
>> The logging here doesn't follow our usual pattern of only making the
>> logging calls if the appropriate logging level is enabled. ??
>
> I'm not sure what you're talking about.
>
> I think this is the new pattern of doing logging:
>
>    Log(interpreter, oopmap) logv;
>    LogStream st(logv.trace());

I'm not familiar with that form. I wonder how the subsequent
unconditional st.print is handled? ie how much work do we do before
determining logging is not enabled?

> Otherwise
>
>       log_debug(interpreter, oopmap)("*** collision in oopmap cache -
> flushing item ***");
>
> checks for whether debug level logging is on.  

Yes.

> You only need this:
>    if (log_is_enabled(Debug, interpreter, oopmap)) {
>      static int count = 0;
>      ResourceMark rm;
>      log_debug(interpreter, oopmap)
>            ("%d - Computing oopmap at bci %d for %s at hash %d",
> ++count, bci,
>             method()->name_and_sig_as_C_string(), probe);
>    }
>
> If you're creating a ResourceMark.

I don't associate it only with use of ResourceMarks, but any time you
have to do a bit of work to gather all the information that will be
passed to the actual logging statement. The aim being not to do any of
that if logging is not enabled.

> Unless logging changed (which I think it did).

Perhaps. In which case I'd like to understand how this new form
minimises the wasted effort when logging is disabled.

>>
>> src/share/vm/interpreter/oopMapCache.hpp
>>
>> Nit: OopMapCacheEntry* volatile* _array;
>>
>> Space after volatile please as it refers to the OMCE* not the *_array.
>
> Like this?
>
>    OopMapCacheEntry* volatile * _array;

Yes.

> Per coding standard, we don't put the '*' without * in front of the name
> of the variable, only the type, like
>     OopMapCacheEntry* blah;
> vs
>     OopMapCacheEntry *blah;

Okay I see where this comes from. Personally I find it odd to have the
"volatile*". Anyway minor issue.

>>
>>> Tested with RBT equivalent of nightly on linux x64.  Also ran dacapo
>>> with -Xint -Xlog:interpreter+oopmap=debug to verify.  This change also
>>
>> Do you have some performance improvement numbers for this
>> optimization? Or is this a pre-emptive strike against a potential
>> locking bottleneck?
>
> Erik Osterlund ran benchmarks but I don't have access to them and he is
> on vacation.   Heresay is that the bottleneck observed for this lock is
> gone.

Okay we can get the gory details added to the bug later.

Thanks,
David

>>
>>> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration
>>> however) in favor of new logging. A linked CSR request is pending.
>>
>> Ok.
>
> Thanks for commenting on the CSR and the review.
>
> Coleen
>>
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Coleen
>>>
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

coleen.phillimore


On 8/14/17 8:29 AM, David Holmes wrote:

> On 14/08/2017 10:17 PM, [hidden email] wrote:
>> On 8/13/17 8:29 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> This looks good to me - Reviewed. A couple of minor comments below.
>>>
>>> On 12/08/2017 1:46 AM, [hidden email] wrote:
>>>> Summary: Use lock free access to oopMapCache
>>>> Contributed-by: [hidden email],
>>>> [hidden email]
>>>>
>>>> The OopMapCache::lookup() function took out a mutex to protect
>>>> access between the GC threads that are running concurrently. See
>>>> bug for more info.   The function lookup() is run by multiple GC
>>>> threads concurrently.  If there's a collision in the hashtable,
>>>> this uses atomic cmpxchg to add the entry to a list to be cleaned
>>>> up after the safepoint is over.   GC isn't doing lookup at that point.
>>>>
>>>> This change is contributed by Frederic Parain, with some cleanup
>>>> and logging from me.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042
>>>
>>> src/share/vm/interpreter/oopMapCache.cpp
>>>
>>> The logging here doesn't follow our usual pattern of only making the
>>> logging calls if the appropriate logging level is enabled. ??
>>
>> I'm not sure what you're talking about.
>>
>> I think this is the new pattern of doing logging:
>>
>>    Log(interpreter, oopmap) logv;
>>    LogStream st(logv.trace());
>
> I'm not familiar with that form. I wonder how the subsequent
> unconditional st.print is handled? ie how much work do we do before
> determining logging is not enabled?

I wish I knew how to get generated .s file with the new build.  The
printing code in the function OopMapCacheEntry::verify_mask isn't doing
anything interesting (like getting as_C_string() etc).  The implied
check for log_is_enabled before all of the st.print()s can't be worse
than the existing global checks for TraceOopMapGeneration && Verbose.  
It's a ton cleaner and the performance is a wash.

>
>> Otherwise
>>
>>       log_debug(interpreter, oopmap)("*** collision in oopmap cache -
>> flushing item ***");
>>
>> checks for whether debug level logging is on.
>
> Yes.
>
>> You only need this:
>>    if (log_is_enabled(Debug, interpreter, oopmap)) {
>>      static int count = 0;
>>      ResourceMark rm;
>>      log_debug(interpreter, oopmap)
>>            ("%d - Computing oopmap at bci %d for %s at hash %d",
>> ++count, bci,
>>             method()->name_and_sig_as_C_string(), probe);
>>    }
>>
>> If you're creating a ResourceMark.
>
> I don't associate it only with use of ResourceMarks, but any time you
> have to do a bit of work to gather all the information that will be
> passed to the actual logging statement. The aim being not to do any of
> that if logging is not enabled.
>
>> Unless logging changed (which I think it did).
>
> Perhaps. In which case I'd like to understand how this new form
> minimises the wasted effort when logging is disabled.
>
>>>
>>> src/share/vm/interpreter/oopMapCache.hpp
>>>
>>> Nit: OopMapCacheEntry* volatile* _array;
>>>
>>> Space after volatile please as it refers to the OMCE* not the *_array.
>>
>> Like this?
>>
>>    OopMapCacheEntry* volatile * _array;
>
> Yes.
>
>> Per coding standard, we don't put the '*' without * in front of the
>> name of the variable, only the type, like
>>     OopMapCacheEntry* blah;
>> vs
>>     OopMapCacheEntry *blah;
>
> Okay I see where this comes from. Personally I find it odd to have the
> "volatile*". Anyway minor issue.
>
>>>
>>>> Tested with RBT equivalent of nightly on linux x64.  Also ran
>>>> dacapo with -Xint -Xlog:interpreter+oopmap=debug to verify.  This
>>>> change also
>>>
>>> Do you have some performance improvement numbers for this
>>> optimization? Or is this a pre-emptive strike against a potential
>>> locking bottleneck?
>>
>> Erik Osterlund ran benchmarks but I don't have access to them and he
>> is on vacation.   Heresay is that the bottleneck observed for this
>> lock is gone.
>
> Okay we can get the gory details added to the bug later.

Yes, RedHat will see the improvements also.

Thanks,
Coleen

>
> Thanks,
> David
>
>>>
>>>> removes -XX:+TraceOopMapGeneration (not
>>>> -XX:+TraceNewOopMapGeneration however) in favor of new logging. A
>>>> linked CSR request is pending.
>>>
>>> Ok.
>>
>> Thanks for commenting on the CSR and the review.
>>
>> Coleen
>>>
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

coleen.phillimore


On 8/14/17 8:45 AM, [hidden email] wrote:

>
>
> On 8/14/17 8:29 AM, David Holmes wrote:
>> On 14/08/2017 10:17 PM, [hidden email] wrote:
>>> On 8/13/17 8:29 PM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> This looks good to me - Reviewed. A couple of minor comments below.
>>>>
>>>> On 12/08/2017 1:46 AM, [hidden email] wrote:
>>>>> Summary: Use lock free access to oopMapCache
>>>>> Contributed-by: [hidden email],
>>>>> [hidden email]
>>>>>
>>>>> The OopMapCache::lookup() function took out a mutex to protect
>>>>> access between the GC threads that are running concurrently. See
>>>>> bug for more info.   The function lookup() is run by multiple GC
>>>>> threads concurrently.  If there's a collision in the hashtable,
>>>>> this uses atomic cmpxchg to add the entry to a list to be cleaned
>>>>> up after the safepoint is over.   GC isn't doing lookup at that
>>>>> point.
>>>>>
>>>>> This change is contributed by Frederic Parain, with some cleanup
>>>>> and logging from me.
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042
>>>>
>>>> src/share/vm/interpreter/oopMapCache.cpp
>>>>
>>>> The logging here doesn't follow our usual pattern of only making
>>>> the logging calls if the appropriate logging level is enabled. ??
>>>
>>> I'm not sure what you're talking about.
>>>
>>> I think this is the new pattern of doing logging:
>>>
>>>    Log(interpreter, oopmap) logv;
>>>    LogStream st(logv.trace());
>>
>> I'm not familiar with that form. I wonder how the subsequent
>> unconditional st.print is handled? ie how much work do we do before
>> determining logging is not enabled?
>
> I wish I knew how to get generated .s file with the new build. The
> printing code in the function OopMapCacheEntry::verify_mask isn't
> doing anything interesting (like getting as_C_string() etc).  The
> implied check for log_is_enabled before all of the st.print()s can't
> be worse than the existing global checks for TraceOopMapGeneration &&
> Verbose.  It's a ton cleaner and the performance is a wash.

The function verify_maks is called under an assert, so the log trace
queries won't affect product performance.
Coleen

>>
>>> Otherwise
>>>
>>>       log_debug(interpreter, oopmap)("*** collision in oopmap cache
>>> - flushing item ***");
>>>
>>> checks for whether debug level logging is on.
>>
>> Yes.
>>
>>> You only need this:
>>>    if (log_is_enabled(Debug, interpreter, oopmap)) {
>>>      static int count = 0;
>>>      ResourceMark rm;
>>>      log_debug(interpreter, oopmap)
>>>            ("%d - Computing oopmap at bci %d for %s at hash %d",
>>> ++count, bci,
>>>             method()->name_and_sig_as_C_string(), probe);
>>>    }
>>>
>>> If you're creating a ResourceMark.
>>
>> I don't associate it only with use of ResourceMarks, but any time you
>> have to do a bit of work to gather all the information that will be
>> passed to the actual logging statement. The aim being not to do any
>> of that if logging is not enabled.
>>
>>> Unless logging changed (which I think it did).
>>
>> Perhaps. In which case I'd like to understand how this new form
>> minimises the wasted effort when logging is disabled.
>>
>>>>
>>>> src/share/vm/interpreter/oopMapCache.hpp
>>>>
>>>> Nit: OopMapCacheEntry* volatile* _array;
>>>>
>>>> Space after volatile please as it refers to the OMCE* not the *_array.
>>>
>>> Like this?
>>>
>>>    OopMapCacheEntry* volatile * _array;
>>
>> Yes.
>>
>>> Per coding standard, we don't put the '*' without * in front of the
>>> name of the variable, only the type, like
>>>     OopMapCacheEntry* blah;
>>> vs
>>>     OopMapCacheEntry *blah;
>>
>> Okay I see where this comes from. Personally I find it odd to have
>> the "volatile*". Anyway minor issue.
>>
>>>>
>>>>> Tested with RBT equivalent of nightly on linux x64.  Also ran
>>>>> dacapo with -Xint -Xlog:interpreter+oopmap=debug to verify.  This
>>>>> change also
>>>>
>>>> Do you have some performance improvement numbers for this
>>>> optimization? Or is this a pre-emptive strike against a potential
>>>> locking bottleneck?
>>>
>>> Erik Osterlund ran benchmarks but I don't have access to them and he
>>> is on vacation.   Heresay is that the bottleneck observed for this
>>> lock is gone.
>>
>> Okay we can get the gory details added to the bug later.
>
> Yes, RedHat will see the improvements also.
>
> Thanks,
> Coleen
>
>>
>> Thanks,
>> David
>>
>>>>
>>>>> removes -XX:+TraceOopMapGeneration (not
>>>>> -XX:+TraceNewOopMapGeneration however) in favor of new logging. A
>>>>> linked CSR request is pending.
>>>>
>>>> Ok.
>>>
>>> Thanks for commenting on the CSR and the review.
>>>
>>> Coleen
>>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR (M) 8186042: Optimize OopMapCache lookup

coleen.phillimore
In reply to this post by coleen.phillimore


On 8/15/17 12:55 PM, [hidden email] wrote:
> Hi Coleen,
>
> It looks good in general.

Thank you Serguei for the code review.

> One comment though.
>
> http://cr.openjdk.java.net/~coleenp/8186042.01/webrev/src/share/vm/interpreter/oopMapCache.cpp.frames.html
>
>   516   // Entry is not in hashtable.
> 517 // Compute entry
> 518
> 519 OopMapCacheEntry* tmp = NEW_C_HEAP_OBJ(OopMapCacheEntry, mtClass);
> 520 tmp->initialize();
> 521 tmp->fill(method, bci);
> 522 entry_for->resource_copy(tmp);
>   523
>   524   if (method->should_not_be_cached()) {
>   525     // It is either not safe or not a good idea to cache this Method*
>   526     // at this time. We give the caller of lookup() a copy of the
>   527     // interesting info via parameter entry_for, but we don't add it to
>   528     // the cache. See the gory details in Method*.cpp.
> 529 FREE_C_HEAP_OBJ(tmp);
>   530     return;
>   531   }
>
The OopMapCacheEntry we allocate at 519 is used to fill in the entry_for
passed in at line 522, so it can't be moved.  The fill() method does the
abstract interpretation, filling in tmp and that's used to fill in
entry_for.   entry_for is an InterpreterOopMap, which is derived from
OopMapCacheEntry.  It's sort of odd code, ie not straightforward.

Thanks,
Coleen


>  Would it better to move the fragment 524-531 above the line 516?
>  Then the line (529 FREE_C_HEAP_OBJ(tmp);) could be removed.
>
>
> Thanks,
> Serguei
>
>
> On 8/11/17 08:46, [hidden email] wrote:
>> Summary: Use lock free access to oopMapCache
>> Contributed-by: [hidden email], [hidden email]
>>
>> The OopMapCache::lookup() function took out a mutex to protect access
>> between the GC threads that are running concurrently. See bug for
>> more info.   The function lookup() is run by multiple GC threads
>> concurrently.  If there's a collision in the hashtable, this uses
>> atomic cmpxchg to add the entry to a list to be cleaned up after the
>> safepoint is over.   GC isn't doing lookup at that point.
>>
>> This change is contributed by Frederic Parain, with some cleanup and
>> logging from me.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042
>>
>> Tested with RBT equivalent of nightly on linux x64.  Also ran dacapo
>> with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also
>> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration
>> however) in favor of new logging. A linked CSR request is pending.
>>
>> Thanks,
>> Coleen
>>
>

Loading...