RFR(XL): 8167108 - SMR and JavaThread Lifecycle

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
Greetings,

We have a (eXtra Large) fix for the following bug:

8167108 inconsistent handling of SR_lock can lead to crashes
https://bugs.openjdk.java.net/browse/JDK-8167108

This fix adds a Safe Memory Reclamation (SMR) mechanism based on
Hazard Pointers to manage JavaThread lifecycle.

Here's a PDF for the internal wiki that we've been using to describe
and track the work on this project:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf

Dan has noticed that the indenting is wrong in some of the code quotes
in the PDF that are not present in the internal wiki. We don't have a
solution for that problem yet.

Here's the webrev for current JDK10 version of this fix:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full

This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
testing, additional stress testing on Dan's Solaris X64 server, and
additional testing on Erik and Robbin's machines.

We welcome comments, suggestions and feedback.

Daniel Daugherty
Erik Osterlund
Robbin Ehn
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
Many thanks to the folks that reviewed this internally and provided
much appreciated feedback:

- Daniel Daugherty
- David Holmes
- Erik Osterlund
- Jerry Thornbrugh
- Karen Kinnear
- Kim Barrett
- Robbin Ehn
- Serguei Spitsyn
- Stefan Karlson

Since there are three contributing authors, we have been reviewing
(and arguing over) each other's code. It has been an adventure!

Dan, Erik, and Robbin


On 10/9/17 1:41 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> We have a (eXtra Large) fix for the following bug:
>
> 8167108 inconsistent handling of SR_lock can lead to crashes
> https://bugs.openjdk.java.net/browse/JDK-8167108
>
> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
> Hazard Pointers to manage JavaThread lifecycle.
>
> Here's a PDF for the internal wiki that we've been using to describe
> and track the work on this project:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>
>
> Dan has noticed that the indenting is wrong in some of the code quotes
> in the PDF that are not present in the internal wiki. We don't have a
> solution for that problem yet.
>
> Here's the webrev for current JDK10 version of this fix:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>
> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
> testing, additional stress testing on Dan's Solaris X64 server, and
> additional testing on Erik and Robbin's machines.
>
> We welcome comments, suggestions and feedback.
>
> Daniel Daugherty
> Erik Osterlund
> Robbin Ehn
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore
In reply to this post by Daniel D. Daugherty

Hi,  From my initial look at this, I really like new interface for
walking the threads list, except every instance but 2 uses of
JavaThreadIterator has a preceding ThreadsListHandle declaration.

+  ThreadsListHandle tlh;
+  JavaThreadIterator jti(tlh.list());


Could there be a wrapper that embeds the ThreadsListHandle, like:

class JavaThreadsListIterator {
     ThreadsListHandle _tlh;
...
}

Thanks,
Coleen

On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> We have a (eXtra Large) fix for the following bug:
>
> 8167108 inconsistent handling of SR_lock can lead to crashes
> https://bugs.openjdk.java.net/browse/JDK-8167108
>
> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
> Hazard Pointers to manage JavaThread lifecycle.
>
> Here's a PDF for the internal wiki that we've been using to describe
> and track the work on this project:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>
>
> Dan has noticed that the indenting is wrong in some of the code quotes
> in the PDF that are not present in the internal wiki. We don't have a
> solution for that problem yet.
>
> Here's the webrev for current JDK10 version of this fix:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>
> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
> testing, additional stress testing on Dan's Solaris X64 server, and
> additional testing on Erik and Robbin's machines.
>
> We welcome comments, suggestions and feedback.
>
> Daniel Daugherty
> Erik Osterlund
> Robbin Ehn

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

David Holmes
On 12/10/2017 1:32 AM, [hidden email] wrote:

>
> Hi,  From my initial look at this, I really like new interface for
> walking the threads list, except every instance but 2 uses of
> JavaThreadIterator has a preceding ThreadsListHandle declaration.
>
> +  ThreadsListHandle tlh;
> +  JavaThreadIterator jti(tlh.list());
>
>
> Could there be a wrapper that embeds the ThreadsListHandle, like:
>
> class JavaThreadsListIterator {
>      ThreadsListHandle _tlh;
> ...
> }

The ThreadsListHandle could not itself be a stack-object in that case.

David

> Thanks,
> Coleen
>
> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> We have a (eXtra Large) fix for the following bug:
>>
>> 8167108 inconsistent handling of SR_lock can lead to crashes
>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>
>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>> Hazard Pointers to manage JavaThread lifecycle.
>>
>> Here's a PDF for the internal wiki that we've been using to describe
>> and track the work on this project:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>
>>
>> Dan has noticed that the indenting is wrong in some of the code quotes
>> in the PDF that are not present in the internal wiki. We don't have a
>> solution for that problem yet.
>>
>> Here's the webrev for current JDK10 version of this fix:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>
>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>> testing, additional stress testing on Dan's Solaris X64 server, and
>> additional testing on Erik and Robbin's machines.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Daniel Daugherty
>> Erik Osterlund
>> Robbin Ehn
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

David Holmes
Please ignore this gaseous brain explosion. :)

Thanks,
David

On 12/10/2017 7:30 AM, David Holmes wrote:

> On 12/10/2017 1:32 AM, [hidden email] wrote:
>>
>> Hi,  From my initial look at this, I really like new interface for
>> walking the threads list, except every instance but 2 uses of
>> JavaThreadIterator has a preceding ThreadsListHandle declaration.
>>
>> +  ThreadsListHandle tlh;
>> +  JavaThreadIterator jti(tlh.list());
>>
>>
>> Could there be a wrapper that embeds the ThreadsListHandle, like:
>>
>> class JavaThreadsListIterator {
>>      ThreadsListHandle _tlh;
>> ...
>> }
>
> The ThreadsListHandle could not itself be a stack-object in that case.
>
> David
>
>> Thanks,
>> Coleen
>>
>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> We have a (eXtra Large) fix for the following bug:
>>>
>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>
>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>> Hazard Pointers to manage JavaThread lifecycle.
>>>
>>> Here's a PDF for the internal wiki that we've been using to describe
>>> and track the work on this project:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>
>>>
>>> Dan has noticed that the indenting is wrong in some of the code quotes
>>> in the PDF that are not present in the internal wiki. We don't have a
>>> solution for that problem yet.
>>>
>>> Here's the webrev for current JDK10 version of this fix:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>
>>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>> additional testing on Erik and Robbin's machines.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Daniel Daugherty
>>> Erik Osterlund
>>> Robbin Ehn
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
In reply to this post by coleen.phillimore
Added back the other three OpenJDK aliases being used for this review...

Coleen, thanks for chiming in on this review thread. Sorry for the
delay in responding... I was on vacation...


On 10/11/17 11:32 AM, [hidden email] wrote:

>
> Hi,  From my initial look at this, I really like new interface for
> walking the threads list, except every instance but 2 uses of
> JavaThreadIterator has a preceding ThreadsListHandle declaration.
>
> +  ThreadsListHandle tlh;
> +  JavaThreadIterator jti(tlh.list());
>
>
> Could there be a wrapper that embeds the ThreadsListHandle, like:
>
> class JavaThreadsListIterator {
>     ThreadsListHandle _tlh;
> ...
> }

This issue has been addressed as part of my response to Stefan K's
code review comments. I quoted the above comment in that reply so
it should be easy to find that resolution...

Dan


>
> Thanks,
> Coleen
>
> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> We have a (eXtra Large) fix for the following bug:
>>
>> 8167108 inconsistent handling of SR_lock can lead to crashes
>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>
>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>> Hazard Pointers to manage JavaThread lifecycle.
>>
>> Here's a PDF for the internal wiki that we've been using to describe
>> and track the work on this project:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>
>>
>> Dan has noticed that the indenting is wrong in some of the code quotes
>> in the PDF that are not present in the internal wiki. We don't have a
>> solution for that problem yet.
>>
>> Here's the webrev for current JDK10 version of this fix:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>
>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>> testing, additional stress testing on Dan's Solaris X64 server, and
>> additional testing on Erik and Robbin's machines.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Daniel Daugherty
>> Erik Osterlund
>> Robbin Ehn
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
In reply to this post by Daniel D. Daugherty
Greetings,

I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.

Here are the updated webrevs:

Here's the mq comment for the change:

   Rebase to 2017.10.25 PIT snapshot.

Here is the full webrev:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/

And here is the delta webrev:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/

I ran the above bits throught Mach5 tier[1-5] testing over the holiday
weekend. Didn't see any issues in a quick look. Going to take a closer
look today.

We welcome comments, suggestions and feedback.

Dan, Erik and Robbin


On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> Resolving one of the code review comments (from both Stefan K and Coleen)
> on jdk10-04-full required quite a few changes so it is being done as a
> standalone patch and corresponding webrevs:
>
> Here's the mq comment for the change:
>
>   stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>     JavaThreadIteratorWithHandle.
>
> Here is the full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>
> And here is the delta webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> We have a (eXtra Large) fix for the following bug:
>>
>> 8167108 inconsistent handling of SR_lock can lead to crashes
>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>
>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>> Hazard Pointers to manage JavaThread lifecycle.
>>
>> Here's a PDF for the internal wiki that we've been using to describe
>> and track the work on this project:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>
>>
>> Dan has noticed that the indenting is wrong in some of the code quotes
>> in the PDF that are not present in the internal wiki. We don't have a
>> solution for that problem yet.
>>
>> Here's the webrev for current JDK10 version of this fix:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>
>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>> testing, additional stress testing on Dan's Solaris X64 server, and
>> additional testing on Erik and Robbin's machines.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Daniel Daugherty
>> Erik Osterlund
>> Robbin Ehn
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
Greetings,

I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
(Yes, we're playing chase-the-repo...)

Here is the updated full webrev:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/

Unlike the previous rebase, there were no changes required to the
open code to get the rebased bits to build so there is no delta
webrev.

These bits have been run through JPRT and I've submitted the usual
Mach5 tier[1-5] test run...

We welcome comments, suggestions and feedback.

Dan, Erik and Robbin


On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>
> Here are the updated webrevs:
>
> Here's the mq comment for the change:
>
>   Rebase to 2017.10.25 PIT snapshot.
>
> Here is the full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>
> And here is the delta webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>
> I ran the above bits throught Mach5 tier[1-5] testing over the holiday
> weekend. Didn't see any issues in a quick look. Going to take a closer
> look today.
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Resolving one of the code review comments (from both Stefan K and
>> Coleen)
>> on jdk10-04-full required quite a few changes so it is being done as a
>> standalone patch and corresponding webrevs:
>>
>> Here's the mq comment for the change:
>>
>>   stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>>     JavaThreadIteratorWithHandle.
>>
>> Here is the full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>
>> And here is the delta webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> We have a (eXtra Large) fix for the following bug:
>>>
>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>
>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>> Hazard Pointers to manage JavaThread lifecycle.
>>>
>>> Here's a PDF for the internal wiki that we've been using to describe
>>> and track the work on this project:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>
>>>
>>> Dan has noticed that the indenting is wrong in some of the code quotes
>>> in the PDF that are not present in the internal wiki. We don't have a
>>> solution for that problem yet.
>>>
>>> Here's the webrev for current JDK10 version of this fix:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>
>>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>> additional testing on Erik and Robbin's machines.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Daniel Daugherty
>>> Erik Osterlund
>>> Robbin Ehn
>>>
>>
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
Greetings,

Robbin rebased the project last night/this morning to merge with Thread
Local Handshakes (TLH) and also picked up additional changesets up thru:

> Changeset: fa736014cf28
> Author:    cjplummer
> Date:      2017-11-14 18:08 -0800
> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>
> 8191049: Add alternate version of pns() that is callable from within hotspot source
> Summary: added pns2() to debug.cpp
> Reviewed-by: stuefe, gthornbr

This is the first time we've rebased the project to bits that are this
fresh (< 12 hours old at rebase time). We've done this because we think
we're done with this project and are in the final review-change-retest
cycle(s)... In other words, we're not planning on making any more major
changes for this project... :-)

*** Time for code reviewers to chime in on this thread! ***

Here is the updated full webrev:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/

Here's is the delta webrev from the 2017.11.10 rebase:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/

Dan has submitted the bits for the usual Mach5 tier[1-5] testing
(and the new baseline also)...

We're expecting this round to be a little noisier than usual because
we did not rebase on a PIT snapshot so the new baseline has not been
through Jesper's usual care-and-feeding of integration_blockers, etc.

We welcome comments, suggestions and feedback.

Dan, Erik and Robbin


On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
> (Yes, we're playing chase-the-repo...)
>
> Here is the updated full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>
> Unlike the previous rebase, there were no changes required to the
> open code to get the rebased bits to build so there is no delta
> webrev.
>
> These bits have been run through JPRT and I've submitted the usual
> Mach5 tier[1-5] test run...
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>
>> Here are the updated webrevs:
>>
>> Here's the mq comment for the change:
>>
>>   Rebase to 2017.10.25 PIT snapshot.
>>
>> Here is the full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>
>> And here is the delta webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>
>> I ran the above bits throught Mach5 tier[1-5] testing over the holiday
>> weekend. Didn't see any issues in a quick look. Going to take a closer
>> look today.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Resolving one of the code review comments (from both Stefan K and
>>> Coleen)
>>> on jdk10-04-full required quite a few changes so it is being done as a
>>> standalone patch and corresponding webrevs:
>>>
>>> Here's the mq comment for the change:
>>>
>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>>>     JavaThreadIteratorWithHandle.
>>>
>>> Here is the full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>
>>> And here is the delta webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> We have a (eXtra Large) fix for the following bug:
>>>>
>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>
>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>
>>>> Here's a PDF for the internal wiki that we've been using to describe
>>>> and track the work on this project:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>
>>>>
>>>> Dan has noticed that the indenting is wrong in some of the code quotes
>>>> in the PDF that are not present in the internal wiki. We don't have a
>>>> solution for that problem yet.
>>>>
>>>> Here's the webrev for current JDK10 version of this fix:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>
>>>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>>> additional testing on Erik and Robbin's machines.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Daniel Daugherty
>>>> Erik Osterlund
>>>> Robbin Ehn
>>>>
>>>
>>>
>>
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/os/linux/os_linux.cpp.frames.html

I read David's comments about the threads list iterator, and I was going
to say that it can be cleaned up later, as the bulk of the change is the
SMR part but this looks truly bizarre.   It looks like it shouldn't
compile because 'jt' isn't in scope.

Why isn't this the syntax:

JavaThreadIteratorWithHandle jtiwh;
for (JavaThread* jt = jtiwh.first(); jt != NULL; jt = jtiwh.next()) {
}

Which would do the obvious thing without anyone having to squint at the
code.

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.hpp.html

As a hater of acronmys, can this file use "Safe Memory Reclaimation" and
briefly describe the concept in the beginning of the header file, so one
knows why it's called threadSMR.hpp?   It doesn't need to be long and
include why Threads list needs this.  Sometimes we tell new people that
the hotspot documentation is in the header files.

  186   JavaThreadIteratorWithHandle() : _tlh(), _index(0) {}

This _tlh() call should not be necessary.  The compiler should generate
this for you in the constructor.

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.cpp.html

   32 ThreadsList::ThreadsList(int entries) : _length(entries), _threads(NEW_C_HEAP_ARRAY(JavaThread*, entries + 1, mtGC)), _next_list(NULL) {

Seems like it should be mtThread rather than mtGC.

Should

   62   if (EnableThreadSMRStatistics) {

really be UL, ie: if (log_is_enabled(Info, thread, smr, statistics)) ?


http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/test/hotspot/jtreg/runtime/ErrorHandling/NestedThreadsListHandleInErrorHandlingTest.java.html

Can you use for these tests instead (there were a couple):

*@requires (vm.debug == true)*

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/thread.cpp.udiff.html

+// thread, has been added the Threads list, the system is not at a

Has "not" been added to the Threads list?  missing "not"?

+ return (unsigned int)(((uint32_t)(uintptr_t)s1) * 2654435761u);

Can you add a comment about where this number came from?

I can't find the caller of threads_do_smr.

If these functions xchg_smr_thread_list, get_smr_java_thread_list,
inc_smr_deleted_thread_count are only used by thread.cpp, I think they
should go in that file and not in the .inline.hpp file to be included
and possibly called by other files.  I think they're private to class
Threads.

I don't have any in-depth comments.  This looks really good from my day
of reading it.

Thanks,
Coleen

On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:

> Greetings,
>
> Testing of the last round of changes revealed a hang in one of the new
> TLH tests. Robbin has fixed the hang, updated the existing TLH test, and
> added another TLH test for good measure.
>
> Here is the updated full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>
> Here is the updated delta webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>
> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
> no unexpected failures.
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Robbin rebased the project last night/this morning to merge with Thread
>> Local Handshakes (TLH) and also picked up additional changesets up thru:
>>
>>> Changeset: fa736014cf28
>>> Author:    cjplummer
>>> Date:      2017-11-14 18:08 -0800
>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>
>>> 8191049: Add alternate version of pns() that is callable from within
>>> hotspot source
>>> Summary: added pns2() to debug.cpp
>>> Reviewed-by: stuefe, gthornbr
>>
>> This is the first time we've rebased the project to bits that are this
>> fresh (< 12 hours old at rebase time). We've done this because we think
>> we're done with this project and are in the final review-change-retest
>> cycle(s)... In other words, we're not planning on making any more major
>> changes for this project... :-)
>>
>> *** Time for code reviewers to chime in on this thread! ***
>>
>> Here is the updated full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>
>> Here's is the delta webrev from the 2017.11.10 rebase:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>
>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>> (and the new baseline also)...
>>
>> We're expecting this round to be a little noisier than usual because
>> we did not rebase on a PIT snapshot so the new baseline has not been
>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>> (Yes, we're playing chase-the-repo...)
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>
>>> Unlike the previous rebase, there were no changes required to the
>>> open code to get the rebased bits to build so there is no delta
>>> webrev.
>>>
>>> These bits have been run through JPRT and I've submitted the usual
>>> Mach5 tier[1-5] test run...
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>
>>>> Here are the updated webrevs:
>>>>
>>>> Here's the mq comment for the change:
>>>>
>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>
>>>> Here is the full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>
>>>> And here is the delta webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>
>>>> I ran the above bits throught Mach5 tier[1-5] testing over the holiday
>>>> weekend. Didn't see any issues in a quick look. Going to take a closer
>>>> look today.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Resolving one of the code review comments (from both Stefan K and
>>>>> Coleen)
>>>>> on jdk10-04-full required quite a few changes so it is being done
>>>>> as a
>>>>> standalone patch and corresponding webrevs:
>>>>>
>>>>> Here's the mq comment for the change:
>>>>>
>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>>>>>     JavaThreadIteratorWithHandle.
>>>>>
>>>>> Here is the full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>
>>>>> And here is the delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>
>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>
>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>
>>>>>> Here's a PDF for the internal wiki that we've been using to describe
>>>>>> and track the work on this project:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>
>>>>>>
>>>>>> Dan has noticed that the indenting is wrong in some of the code
>>>>>> quotes
>>>>>> in the PDF that are not present in the internal wiki. We don't
>>>>>> have a
>>>>>> solution for that problem yet.
>>>>>>
>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>
>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>> tier[2-5]
>>>>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Daniel Daugherty
>>>>>> Erik Osterlund
>>>>>> Robbin Ehn
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore
In reply to this post by Daniel D. Daugherty


On 11/20/17 10:51 PM, Daniel D. Daugherty wrote:

> On 11/20/17 9:13 PM, David Holmes wrote:
>> Hi Dan,
>>
>> Just to be clear my comment about use of jint's was not about
>> expected size but the fact you were using a j-type instead of a C++
>> type when the field is not a Java field. (Coleen has been on a
>> crusade to remove j-types from where they don't belong - and they
>> were removed from the Atomic API as part of that recent
>> templatization effort.)
>
> Thanks for making that clear. I think I've gotten rid of all
> the jint's at this point...

Oh great, I had missed that.   Thank you David for commenting on that.

Coleen

>
>
>> No further comments. :)
>
> Thanks. I'll be sending out more webrevs when I get through all
> the code review comments...
>
> Dan
>
>
>>
>> Thanks,
>> David
>>
>> On 21/11/2017 11:50 AM, Daniel D. Daugherty wrote:
>>> On 11/20/17 12:51 AM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> Figured I'd better cast an eye over this again before it gets
>>>> pushed :)
>>>
>>> Thanks for reviewing again!!
>>>
>>>
>>>> Only one thing (repeated many times) jumped out at me and apologies
>>>> if this has already been debated back and forth. I missed the
>>>> exchange where the for loop iteration was rewritten into this
>>>> unusual form:
>>>>
>>>> for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt =
>>>> jtiwh.next(); ) {
>>>>
>>>> On first reading I expect most people would mistake the control
>>>> expression for the iteration-expression due to the next(). I didn't
>>>> even know the control expression could introduce a local variable!
>>>> I find this really hard to read stylistically and far too
>>>> cute/clever for its own good. It also violates the style rules
>>>> regarding implicit boolean checks.
>>>>
>>>> I know Stefan proposed this as he did not like the alternate (in a
>>>> few places):
>>>>
>>>> +  {
>>>> +    ThreadsListHandle tlh;
>>>> +    JavaThreadIterator jti(tlh.list());
>>>> +    for (JavaThread* t = jti.first(); t != NULL; t = jti.next()) {
>>>> ...
>>>> +  }
>>>>
>>>> but it seems to me the iterator code has evolved since then and
>>>> only a couple of places need a distinct TLH separate from the
>>>> actual iterator object. So I'm more in favour of the more classic
>>>> for-loop, with the iterator declared before the loop. Or even
>>>> convert to while loops, as this iterator seems more suited to that.
>>>
>>> Actually both Coleen and Stefan had a problem with how much additional
>>> code was added to support an iterator model. In some cases we went from
>>> 1-line to 3-lines and in some cases we went from 1-line to 5-lines. For
>>> Coleen, she wanted 2 of the new lines compressed into 1 new line by
>>> using
>>> an iterator with an embedded ThreadsListHandle. Stefan wanted us to try
>>> and get back to 1-line where possible.
>>>
>>> The advantages of the new style are:
>>>
>>> - 1-line to 1-line in all but a few cases
>>> - automatic limited scope of the embedded ThreadsListHandle so we were
>>>    able to remove extra braces that were added earlier in most cases
>>>
>>> The disadvantages of the new style are:
>>>
>>> - it is an unusual for-loop style (in HotSpot)
>>> - it breaks HotSpot's style rule about implied booleans
>>>
>>>
>>> Stefan K is pretty adamant that the original iterator version where we
>>> go from 1-line to 3-lines or 1-line to 5-lines is not acceptable for GC
>>> code. The compiler guys haven't chimed in on this debate so I'm
>>> guessing
>>> they don't have a strong opinion either way. One thing that we don't
>>> want
>>> to do is have different styles for the different teams.
>>>
>>> So I had to make a judgement call on whether I thought Runtime could
>>> live
>>> with Stefan's idea. Originally I wasn't fond of it either and
>>> breaking the
>>> implied boolean style rule is a problem for me (I post that comment
>>> a lot
>>> in my code reviews). However, I have to say that I'm a lot happier
>>> about
>>> the compactness of the code and the fact that limited ThreadsListHandle
>>> scope comes for 'free' in most cases.
>>>
>>> We're planning to keep the new iterator style for the following
>>> reasons:
>>>
>>> - smaller change footprint
>>> - consistent style used for all of the teams
>>> - limited ThreadsListHandle scope comes for free in most cases
>>>
>>> We're hoping that you can live with this decision (for now) and maybe
>>> even grow to like it in the future. :-)
>>>
>>>
>>>> Other than that just a couple of comments on variable type choices,
>>>> and a few typos spotted below.
>>>
>>> Replies embedded below.
>>>
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/globals.hpp
>>>>
>>>> 2476   diagnostic(bool, EnableThreadSMRExtraValidityChecks, true,
>>>>         \
>>>> 2477           "Enable Thread SMR extra validity checks") \
>>>> 2478         \
>>>> 2479   diagnostic(bool, EnableThreadSMRStatistics, true,         \
>>>> 2480           "Enable Thread SMR Statistics")         \
>>>>
>>>> Indent of strings is off by  3 spaces.
>>>
>>> Fixed.
>>>
>>>
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/handshake.cpp
>>>>
>>>>  140       // There is assumption in code that Threads_lock should
>>>> be lock
>>>>  200       // There is assumption in code that Threads_lock should
>>>> be lock
>>>>
>>>> lock -> locked
>>>
>>> Fixed.
>>>
>>>
>>>> 146         // handshake_process_by_vmthread will check the
>>>> semaphore for us again
>>>>
>>>> Needs period at end.
>>>
>>> Fixed.
>>>
>>>
>>>> 148         // should be okey since the new thread will not have an
>>>> operation.
>>>>
>>>> okey -> okay
>>>
>>> Fixed.
>>>
>>>
>>>> 151         // We can't warn here is since the thread do
>>>> cancel_handshake after it have been removed
>>>>
>>>> I think:
>>>>
>>>> // We can't warn here since the thread does cancel_handshake after
>>>> it has been removed
>>>
>>> Fixed.
>>>
>>>
>>>> 152         // from ThreadsList. So we should just keep looping
>>>> here until while below return negative.
>>>>
>>>> from -> from the
>>>>
>>>> Needs period at end.
>>>
>>> Fixed both.
>>>
>>>
>>>>
>>>>  204             // A new thread on the ThreadsList will not have
>>>> an operation.
>>>>
>>>> Change period to comma, and ...
>>>
>>> Fixed.
>>>
>>>
>>>> 205             // Hence is skipped in handshake_process_by_vmthread.
>>>>
>>>> Hence is -> hence it is
>>>
>>> Fixed.
>>>
>>>
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/thread.cpp
>>>>
>>>> 1482     // dcubed - Looks like the Threads_lock is for stable access
>>>> 1483     // to _jvmci_old_thread_counters and _jvmci_counters.
>>>>
>>>> Does this comment need revisiting?
>>>
>>> We've been trying to decide what to do about this comment. We'll be
>>> filing a follow up bug for the Compiler team to take another look at
>>> how _jvmci_old_thread_counters and _jvmci_counters are protected.
>>> Threads_lock is a big hammer and there should be a less expensive
>>> solution. Once that bug gets filed, this comment can go away.
>>>
>>>
>>>> 3451 volatile jint ...
>>>>
>>>> Why are you using jint for field types here? (Surprised Coleen
>>>> hasn't spotted them! ;-) ).
>>>
>>> volatile jint         Threads::_smr_delete_notify = 0;
>>> volatile jint         Threads::_smr_deleted_thread_cnt = 0;
>>> volatile jint         Threads::_smr_deleted_thread_time_max = 0;
>>> volatile jint         Threads::_smr_deleted_thread_times = 0;
>>> :
>>> volatile jint         Threads::_smr_tlh_cnt = 0;
>>> volatile jint         Threads::_smr_tlh_time_max = 0;
>>> volatile jint         Threads::_smr_tlh_times = 0;
>>>
>>> _smr_delete_notify is a "flag" that indicates when an _smr_delete_lock
>>> notify() operation is needed. It counts up and down and should be a
>>> fairly
>>> small number.
>>>
>>> _smr_deleted_thread_cnt counts the number of Threads that have been
>>> deleted over the life of the VM. It's an always increasing number so
>>> it's size depends on how long the VM has been running.
>>>
>>> _smr_deleted_thread_time_max is the maximum time in millis it has
>>> taken to delete a thread. This field was originally a jlong, but
>>> IIRC the Atomic::cmpxchg() code was not happy on ARM... (might have
>>> just been Atomic::add() that was not happy)
>>>
>>> _smr_deleted_thread_times accumulates the time in millis that it
>>> has taken to delete threads. It's an always increasing number so
>>> it's size depends on how long the VM has been running. This field
>>> was originally a jlong, but IIRC the Atomic::add() code was not
>>> happy on ARM... (might have just been Atomic::cmpxchg() that was
>>> not happy)
>>>
>>> _smr_tlh_cnt counts the number of ThreadsListHandles that have been
>>> deleted over the life of the VM. It's an always increasing number so
>>> it's size depends on how long the VM has been running.
>>>
>>> _smr_tlh_time_max is the maximum time in millis it has taken to
>>> delete a ThreadsListHandle. This field was originally a jlong, but
>>> IIRC the Atomic::cmpxchg() code was not happy on ARM... (might have
>>> just been Atomic::add() that was not happy)
>>>
>>> _smr_tlh_times  accumulates the time in millis that it has taken to
>>> delete ThreadsListHandles. It's an always increasing number so
>>> it's size depends on how long the VM has been running.  This field
>>> was originally a jlong, but IIRC the Atomic::add() code was not
>>> happy on ARM... (might have just been Atomic::cmpxchg() that was
>>> not happy)
>>>
>>>
>>> It just dawned on me that I'm not sure whether you think the 'jint'
>>> fields should be larger or smaller or the same size but a different
>>> type... :-)
>>>
>>> The fact that I had to write so much to explain what these fields
>>> are for and how they're used indicates I need more comments. See
>>> below...
>>>
>>>
>>>> 3456 long Threads::_smr_java_thread_list_alloc_cnt = 1;
>>>> 3457 long Threads::_smr_java_thread_list_free_cnt = 0;
>>>>
>>>> long? If you really want 64-bit counters here you won't get them on
>>>> Windows with that declaration. If you really want variable sized
>>>> counters I suggest uintptr_t; otherwise uint64_t.
>>>
>>> long                  Threads::_smr_java_thread_list_alloc_cnt = 1;
>>> long                  Threads::_smr_java_thread_list_free_cnt = 0;
>>>
>>> _smr_java_thread_list_alloc_cnt counts the number of ThreadsLists that
>>> have been allocated over the life of the VM. It's an always increasing
>>> number so it's size depends on how long the VM has been running and how
>>> many Threads have been created and destroyed.
>>>
>>> _smr_java_thread_list_free_cnt counts the number of ThreadsLists that
>>> have been freed over the life of the VM. It's an always increasing
>>> number so it's size depends on how long the VM has been running and how
>>> many Threads have been created and destroyed.
>>>
>>> I can't remember why we chose 'long' and I agree it's not a good choice
>>> for Win*.
>>>
>>>
>>> Okay so it dawns on me that we haven't written down a description for
>>> the various new '_cnt', '_max' and '_times" fields so I'm adding these
>>> comments to thread.hpp:
>>>
>>>   private:
>>>    // Safe Memory Reclamation (SMR) support:
>>>    static Monitor*              _smr_delete_lock;
>>>    // The '_cnt', '_max' and '_times" fields are enabled via
>>>    // -XX:+EnableThreadSMRStatistics:
>>>                                 // # of parallel threads in
>>> _smr_delete_lock->wait().
>>>    static uint                  _smr_delete_lock_wait_cnt;
>>>                                 // Max # of parallel threads in
>>> _smr_delete_lock->wait().
>>>    static uint                  _smr_delete_lock_wait_max;
>>>                                 // Flag to indicate when an
>>> _smr_delete_lock->notify() is needed.
>>>    static volatile uint         _smr_delete_notify;
>>>                                 // # of threads deleted over VM
>>> lifetime.
>>>    static volatile uint         _smr_deleted_thread_cnt;
>>>                                 // Max time in millis to delete a
>>> thread.
>>>    static volatile uint         _smr_deleted_thread_time_max;
>>>                                 // Cumulative time in millis to
>>> delete threads.
>>>    static volatile uint         _smr_deleted_thread_times;
>>>    static ThreadsList* volatile _smr_java_thread_list;
>>>    static ThreadsList*          get_smr_java_thread_list();
>>>    static ThreadsList* xchg_smr_java_thread_list(ThreadsList*
>>> new_list);
>>>                                 // # of ThreadsLists allocated over
>>> VM lifetime.
>>>    static uint64_t _smr_java_thread_list_alloc_cnt;
>>>                                 // # of ThreadsLists freed over VM
>>> lifetime.
>>>    static uint64_t _smr_java_thread_list_free_cnt;
>>>                                 // Max size ThreadsList allocated.
>>>    static uint                  _smr_java_thread_list_max;
>>>                                 // Max # of nested ThreadsLists for
>>> a thread.
>>>    static uint                  _smr_nested_thread_list_max;
>>>                                 // # of ThreadsListHandles deleted
>>> over VM lifetime.
>>>    static volatile uint         _smr_tlh_cnt;
>>>                                 // Max time in millis to delete a
>>> ThreadsListHandle.
>>>    static volatile jint         _smr_tlh_time_max;
>>>                                 // Cumulative time in millis to
>>> delete ThreadsListHandles.
>>>    static volatile jint         _smr_tlh_times;
>>>    static ThreadsList*          _smr_to_delete_list;
>>>                                 // # of parallel ThreadsLists on the
>>> to-delete list.
>>>    static uint                  _smr_to_delete_list_cnt;
>>>                                 // Max # of parallel ThreadsLists on
>>> the to-delete list.
>>>    static uint                  _smr_to_delete_list_max;
>>>
>>>
>>> I've also gone through all the new '_cnt', '_max' and '_times" fields
>>> in thread.cpp and added an impl note to explain the choice of type:
>>>
>>> // Safe Memory Reclamation (SMR) support:
>>> Monitor*              Threads::_smr_delete_lock =
>>>                            new Monitor(Monitor::special,
>>> "smr_delete_lock",
>>>                                        false /* allow_vm_block */,
>>> Monitor::_safepoint_check_never);
>>> // The '_cnt', '_max' and '_times" fields are enabled via
>>> // -XX:+EnableThreadSMRStatistics:
>>>                        // # of parallel threads in
>>> _smr_delete_lock->wait().
>>>                        // Impl note: Hard to imagine > 64K waiting
>>> threads so
>>>                        // this could be 16-bit, but there is no nice
>>> 16-bit
>>>                        // _FORMAT support.
>>> uint                  Threads::_smr_delete_lock_wait_cnt = 0;
>>>                        // Max # of parallel threads in
>>> _smr_delete_lock->wait().
>>>                        // Impl note: See _smr_delete_lock_wait_cnt
>>> note.
>>> uint                  Threads::_smr_delete_lock_wait_max = 0;
>>>                        // Flag to indicate when an
>>> _smr_delete_lock->notify() is needed.
>>>                        // Impl note: See _smr_delete_lock_wait_cnt
>>> note.
>>> volatile uint         Threads::_smr_delete_notify = 0;
>>>                        // # of threads deleted over VM lifetime.
>>>                        // Impl note: Atomically incremented over VM
>>> lifetime so
>>>                        // use unsigned for more range. Unsigned
>>> 64-bit would
>>>                        // be more future proof, but 64-bit atomic
>>> inc isn't
>>>                        // available everywhere (or is it?).
>>> volatile uint         Threads::_smr_deleted_thread_cnt = 0;
>>>                        // Max time in millis to delete a thread.
>>>                        // Impl note: 16-bit might be too small on an
>>> overloaded
>>>                        // machine. Use unsigned since this is a time
>>> value. Set
>>>                        // via Atomic::cmpxchg() in a loop for
>>> correctness.
>>> volatile uint         Threads::_smr_deleted_thread_time_max = 0;
>>>                        // Cumulative time in millis to delete threads.
>>>                        // Impl note: Atomically added to over VM
>>> lifetime so use
>>>                        // unsigned for more range. Unsigned 64-bit
>>> would be more
>>>                        // future proof, but 64-bit atomic inc isn't
>>> available
>>>                        // everywhere (or is it?).
>>> volatile uint         Threads::_smr_deleted_thread_times = 0;
>>> ThreadsList* volatile Threads::_smr_java_thread_list = new
>>> ThreadsList(0);
>>>                        // # of ThreadsLists allocated over VM lifetime.
>>>                        // Impl note: We allocate a new ThreadsList
>>> for every
>>>                        // thread create and every thread delete so
>>> we need a
>>>                        // bigger type than the
>>> _smr_deleted_thread_cnt field.
>>> uint64_t              Threads::_smr_java_thread_list_alloc_cnt = 1;
>>>                        // # of ThreadsLists freed over VM lifetime.
>>>                        // Impl note: See
>>> _smr_java_thread_list_alloc_cnt note.
>>> uint64_t              Threads::_smr_java_thread_list_free_cnt = 0;
>>>                        // Max size ThreadsList allocated.
>>>                        // Impl note: Max # of threads alive at one
>>> time should
>>>                        // fit in unsigned 32-bit.
>>> uint                  Threads::_smr_java_thread_list_max = 0;
>>>                        // Max # of nested ThreadsLists for a thread.
>>>                        // Impl note: Hard to imagine > 64K nested
>>> ThreadsLists
>>>                        // so his could be 16-bit, but there is no
>>> nice 16-bit
>>>                        // _FORMAT support.
>>> uint                  Threads::_smr_nested_thread_list_max = 0;
>>>                        // # of ThreadsListHandles deleted over VM
>>> lifetime.
>>>                        // Impl note: Atomically incremented over VM
>>> lifetime so
>>>                        // use unsigned for more range. There will be
>>> fewer
>>>                        // ThreadsListHandles than threads so
>>> unsigned 32-bit
>>>                        // should be fine.
>>> volatile uint         Threads::_smr_tlh_cnt = 0;
>>>                        // Max time in millis to delete a
>>> ThreadsListHandle.
>>>                        // Impl note: 16-bit might be too small on an
>>> overloaded
>>>                        // machine. Use unsigned since this is a time
>>> value. Set
>>>                        // via Atomic::cmpxchg() in a loop for
>>> correctness.
>>> volatile uint         Threads::_smr_tlh_time_max = 0;
>>>                        // Cumulative time in millis to delete
>>> ThreadsListHandles.
>>>                        // Impl note: Atomically added to over VM
>>> lifetime so use
>>>                        // unsigned for more range. Unsigned 64-bit
>>> would be more
>>>                        // future proof, but 64-bit atomic inc isn't
>>> available
>>>                        // everywhere (or is it?).
>>> volatile uint         Threads::_smr_tlh_times = 0;
>>> ThreadsList*          Threads::_smr_to_delete_list = NULL;
>>>                        // # of parallel ThreadsLists on the
>>> to-delete list.
>>>                        // Impl note: Hard to imagine > 64K
>>> ThreadsLists needing
>>>                        // to be deleted so this could be 16-bit, but
>>> there is no
>>>                        // nice 16-bit _FORMAT support.
>>> uint                  Threads::_smr_to_delete_list_cnt = 0;
>>>                        // Max # of parallel ThreadsLists on the
>>> to-delete list.
>>>                        // Impl note: See _smr_to_delete_list_cnt note.
>>> uint                  Threads::_smr_to_delete_list_max = 0;
>>>
>>>
>>> Yikes! With the additional comments, the additions to thread.hpp and
>>> thread.cpp grew by a bunch... I've done a test build build on my Mac
>>> with these changes and I'm about to kick off a Mach5 tier1 job...
>>>
>>> Dan
>>>
>>>
>>>
>>>>
>>>> ----
>>>>
>>>> On 19/11/2017 11:49 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Testing of the last round of changes revealed a hang in one of the
>>>>> new
>>>>> TLH tests. Robbin has fixed the hang, updated the existing TLH
>>>>> test, and
>>>>> added another TLH test for good measure.
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>>>
>>>>> Here is the updated delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>>>
>>>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>>>> no unexpected failures.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Robbin rebased the project last night/this morning to merge with
>>>>>> Thread
>>>>>> Local Handshakes (TLH) and also picked up additional changesets
>>>>>> up thru:
>>>>>>
>>>>>>> Changeset: fa736014cf28
>>>>>>> Author:    cjplummer
>>>>>>> Date:      2017-11-14 18:08 -0800
>>>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>>>
>>>>>>> 8191049: Add alternate version of pns() that is callable from
>>>>>>> within hotspot source
>>>>>>> Summary: added pns2() to debug.cpp
>>>>>>> Reviewed-by: stuefe, gthornbr
>>>>>>
>>>>>> This is the first time we've rebased the project to bits that are
>>>>>> this
>>>>>> fresh (< 12 hours old at rebase time). We've done this because we
>>>>>> think
>>>>>> we're done with this project and are in the final
>>>>>> review-change-retest
>>>>>> cycle(s)... In other words, we're not planning on making any more
>>>>>> major
>>>>>> changes for this project... :-)
>>>>>>
>>>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>>>
>>>>>> Here is the updated full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>>>
>>>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>>>
>>>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>>>> (and the new baseline also)...
>>>>>>
>>>>>> We're expecting this round to be a little noisier than usual because
>>>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>>>> through Jesper's usual care-and-feeding of integration_blockers,
>>>>>> etc.
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>>>> (Yes, we're playing chase-the-repo...)
>>>>>>>
>>>>>>> Here is the updated full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>>>
>>>>>>> Unlike the previous rebase, there were no changes required to the
>>>>>>> open code to get the rebased bits to build so there is no delta
>>>>>>> webrev.
>>>>>>>
>>>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>>>> Mach5 tier[1-5] test run...
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>>>
>>>>>>>> Here are the updated webrevs:
>>>>>>>>
>>>>>>>> Here's the mq comment for the change:
>>>>>>>>
>>>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>>>
>>>>>>>> Here is the full webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>>>
>>>>>>>> And here is the delta webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>>>
>>>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the
>>>>>>>> holiday
>>>>>>>> weekend. Didn't see any issues in a quick look. Going to take a
>>>>>>>> closer
>>>>>>>> look today.
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Dan, Erik and Robbin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> Resolving one of the code review comments (from both Stefan K
>>>>>>>>> and Coleen)
>>>>>>>>> on jdk10-04-full required quite a few changes so it is being
>>>>>>>>> done as a
>>>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>>>
>>>>>>>>> Here's the mq comment for the change:
>>>>>>>>>
>>>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage
>>>>>>>>> to use
>>>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>>>
>>>>>>>>> Here is the full webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>>>
>>>>>>>>> And here is the delta webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>>>
>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>
>>>>>>>>> Dan, Erik and Robbin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>>>
>>>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>>>
>>>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>>>
>>>>>>>>>> Here's a PDF for the internal wiki that we've been using to
>>>>>>>>>> describe
>>>>>>>>>> and track the work on this project:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dan has noticed that the indenting is wrong in some of the
>>>>>>>>>> code quotes
>>>>>>>>>> in the PDF that are not present in the internal wiki. We
>>>>>>>>>> don't have a
>>>>>>>>>> solution for that problem yet.
>>>>>>>>>>
>>>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>>>
>>>>>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>>>>>> tier[2-5]
>>>>>>>>>> testing, additional stress testing on Dan's Solaris X64
>>>>>>>>>> server, and
>>>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>>>
>>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>>
>>>>>>>>>> Daniel Daugherty
>>>>>>>>>> Erik Osterlund
>>>>>>>>>> Robbin Ehn
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore
In reply to this post by Daniel D. Daugherty


On 11/20/17 8:50 PM, Daniel D. Daugherty wrote:

> On 11/20/17 12:51 AM, David Holmes wrote:
>> Hi Dan,
>>
>> Figured I'd better cast an eye over this again before it gets pushed :)
>
> Thanks for reviewing again!!
>
>
>> Only one thing (repeated many times) jumped out at me and apologies
>> if this has already been debated back and forth. I missed the
>> exchange where the for loop iteration was rewritten into this unusual
>> form:
>>
>> for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt =
>> jtiwh.next(); ) {
>>
>> On first reading I expect most people would mistake the control
>> expression for the iteration-expression due to the next(). I didn't
>> even know the control expression could introduce a local variable! I
>> find this really hard to read stylistically and far too cute/clever
>> for its own good. It also violates the style rules regarding implicit
>> boolean checks.
>>
>> I know Stefan proposed this as he did not like the alternate (in a
>> few places):
>>
>> +  {
>> +    ThreadsListHandle tlh;
>> +    JavaThreadIterator jti(tlh.list());
>> +    for (JavaThread* t = jti.first(); t != NULL; t = jti.next()) {
>> ...
>> +  }
>>
>> but it seems to me the iterator code has evolved since then and only
>> a couple of places need a distinct TLH separate from the actual
>> iterator object. So I'm more in favour of the more classic for-loop,
>> with the iterator declared before the loop. Or even convert to while
>> loops, as this iterator seems more suited to that.
>
> Actually both Coleen and Stefan had a problem with how much additional
> code was added to support an iterator model. In some cases we went from
> 1-line to 3-lines and in some cases we went from 1-line to 5-lines. For
> Coleen, she wanted 2 of the new lines compressed into 1 new line by using
> an iterator with an embedded ThreadsListHandle. Stefan wanted us to try
> and get back to 1-line where possible.
>
> The advantages of the new style are:
>
> - 1-line to 1-line in all but a few cases
> - automatic limited scope of the embedded ThreadsListHandle so we were
>   able to remove extra braces that were added earlier in most cases
>
> The disadvantages of the new style are:
>
> - it is an unusual for-loop style (in HotSpot)
> - it breaks HotSpot's style rule about implied booleans
>
>
> Stefan K is pretty adamant that the original iterator version where we
> go from 1-line to 3-lines or 1-line to 5-lines is not acceptable for GC
> code. The compiler guys haven't chimed in on this debate so I'm guessing
> they don't have a strong opinion either way. One thing that we don't want
> to do is have different styles for the different teams.
>
> So I had to make a judgement call on whether I thought Runtime could live
> with Stefan's idea. Originally I wasn't fond of it either and breaking
> the
> implied boolean style rule is a problem for me (I post that comment a lot
> in my code reviews). However, I have to say that I'm a lot happier about
> the compactness of the code and the fact that limited ThreadsListHandle
> scope comes for 'free' in most cases.
>
> We're planning to keep the new iterator style for the following reasons:
>
> - smaller change footprint
> - consistent style used for all of the teams
> - limited ThreadsListHandle scope comes for free in most cases
>
> We're hoping that you can live with this decision (for now) and maybe
> even grow to like it in the future. :-)

The limited ThreadsListHandle scope, since ThreadsListHandle registers
and unregisters lists to SMR, seems to be worth the cost of looking at
this strange code pattern.   Thank you for the explanation.

>
>
>> Other than that just a couple of comments on variable type choices,
>> and a few typos spotted below.
>
> Replies embedded below.
>
>
>>
>> Thanks,
>> David
>> ---
>>
>> src/hotspot/share/runtime/globals.hpp
>>
>> 2476   diagnostic(bool, EnableThreadSMRExtraValidityChecks, true,
>>         \
>> 2477           "Enable Thread SMR extra validity checks") \
>> 2478         \
>> 2479   diagnostic(bool, EnableThreadSMRStatistics, true, \
>> 2480           "Enable Thread SMR Statistics")         \
>>
>> Indent of strings is off by  3 spaces.
>
> Fixed.
>
>
>> ---
>>
>> src/hotspot/share/runtime/handshake.cpp
>>
>>  140       // There is assumption in code that Threads_lock should be
>> lock
>>  200       // There is assumption in code that Threads_lock should be
>> lock
>>
>> lock -> locked
>
> Fixed.
>
>
>> 146         // handshake_process_by_vmthread will check the semaphore
>> for us again
>>
>> Needs period at end.
>
> Fixed.
>
>
>> 148         // should be okey since the new thread will not have an
>> operation.
>>
>> okey -> okay
>
> Fixed.
>
>
>> 151         // We can't warn here is since the thread do
>> cancel_handshake after it have been removed
>>
>> I think:
>>
>> // We can't warn here since the thread does cancel_handshake after it
>> has been removed
>
> Fixed.
>
>
>> 152         // from ThreadsList. So we should just keep looping here
>> until while below return negative.
>>
>> from -> from the
>>
>> Needs period at end.
>
> Fixed both.
>
>
>>
>>  204             // A new thread on the ThreadsList will not have an
>> operation.
>>
>> Change period to comma, and ...
>
> Fixed.
>
>
>> 205             // Hence is skipped in handshake_process_by_vmthread.
>>
>> Hence is -> hence it is
>
> Fixed.
>
>
>> ---
>>
>> src/hotspot/share/runtime/thread.cpp
>>
>> 1482     // dcubed - Looks like the Threads_lock is for stable access
>> 1483     // to _jvmci_old_thread_counters and _jvmci_counters.
>>
>> Does this comment need revisiting?
>
> We've been trying to decide what to do about this comment. We'll be
> filing a follow up bug for the Compiler team to take another look at
> how _jvmci_old_thread_counters and _jvmci_counters are protected.
> Threads_lock is a big hammer and there should be a less expensive
> solution. Once that bug gets filed, this comment can go away.
>
>
>> 3451 volatile jint ...
>>
>> Why are you using jint for field types here? (Surprised Coleen hasn't
>> spotted them! ;-) ).

Yes, it was the jtypes I would have objected to.  And long isn't a good
choice on windows because it's only 32 bits there.

>
> volatile jint         Threads::_smr_delete_notify = 0;
> volatile jint         Threads::_smr_deleted_thread_cnt = 0;
> volatile jint         Threads::_smr_deleted_thread_time_max = 0;
> volatile jint         Threads::_smr_deleted_thread_times = 0;
> :
> volatile jint         Threads::_smr_tlh_cnt = 0;
> volatile jint         Threads::_smr_tlh_time_max = 0;
> volatile jint         Threads::_smr_tlh_times = 0;
>
> _smr_delete_notify is a "flag" that indicates when an _smr_delete_lock
> notify() operation is needed. It counts up and down and should be a
> fairly
> small number.
>
> _smr_deleted_thread_cnt counts the number of Threads that have been
> deleted over the life of the VM. It's an always increasing number so
> it's size depends on how long the VM has been running.
>
> _smr_deleted_thread_time_max is the maximum time in millis it has
> taken to delete a thread. This field was originally a jlong, but
> IIRC the Atomic::cmpxchg() code was not happy on ARM... (might have
> just been Atomic::add() that was not happy)
>
> _smr_deleted_thread_times accumulates the time in millis that it
> has taken to delete threads. It's an always increasing number so
> it's size depends on how long the VM has been running. This field
> was originally a jlong, but IIRC the Atomic::add() code was not
> happy on ARM... (might have just been Atomic::cmpxchg() that was
> not happy)
>
> _smr_tlh_cnt counts the number of ThreadsListHandles that have been
> deleted over the life of the VM. It's an always increasing number so
> it's size depends on how long the VM has been running.
>
> _smr_tlh_time_max is the maximum time in millis it has taken to
> delete a ThreadsListHandle. This field was originally a jlong, but
> IIRC the Atomic::cmpxchg() code was not happy on ARM... (might have
> just been Atomic::add() that was not happy)
>
> _smr_tlh_times  accumulates the time in millis that it has taken to
> delete ThreadsListHandles. It's an always increasing number so
> it's size depends on how long the VM has been running.  This field
> was originally a jlong, but IIRC the Atomic::add() code was not
> happy on ARM... (might have just been Atomic::cmpxchg() that was
> not happy)
>
>
> It just dawned on me that I'm not sure whether you think the 'jint'
> fields should be larger or smaller or the same size but a different
> type... :-)
>
> The fact that I had to write so much to explain what these fields
> are for and how they're used indicates I need more comments. See
> below...
>
>
>> 3456 long Threads::_smr_java_thread_list_alloc_cnt = 1;
>> 3457 long Threads::_smr_java_thread_list_free_cnt = 0;
>>
>> long? If you really want 64-bit counters here you won't get them on
>> Windows with that declaration. If you really want variable sized
>> counters I suggest uintptr_t; otherwise uint64_t.
>
> long                  Threads::_smr_java_thread_list_alloc_cnt = 1;
> long                  Threads::_smr_java_thread_list_free_cnt = 0;
>
> _smr_java_thread_list_alloc_cnt counts the number of ThreadsLists that
> have been allocated over the life of the VM. It's an always increasing
> number so it's size depends on how long the VM has been running and how
> many Threads have been created and destroyed.
>
> _smr_java_thread_list_free_cnt counts the number of ThreadsLists that
> have been freed over the life of the VM. It's an always increasing
> number so it's size depends on how long the VM has been running and how
> many Threads have been created and destroyed.
>
> I can't remember why we chose 'long' and I agree it's not a good choice
> for Win*.
>
>
> Okay so it dawns on me that we haven't written down a description for
> the various new '_cnt', '_max' and '_times" fields so I'm adding these
> comments to thread.hpp:
>

With this comment format, it's really hard to pick out the name of the
field.  Nobody uses 80 columns anymore.  Can you move the comments over
some spaces so the field names are visible?

>  private:
>   // Safe Memory Reclamation (SMR) support:
>   static Monitor*              _smr_delete_lock;
>   // The '_cnt', '_max' and '_times" fields are enabled via
>   // -XX:+EnableThreadSMRStatistics:
>                                // # of parallel threads in
> _smr_delete_lock->wait().
>   static uint                  _smr_delete_lock_wait_cnt;
>                                // Max # of parallel threads in
> _smr_delete_lock->wait().
>   static uint                  _smr_delete_lock_wait_max;
>                                // Flag to indicate when an
> _smr_delete_lock->notify() is needed.
>   static volatile uint         _smr_delete_notify;
>                                // # of threads deleted over VM lifetime.
>   static volatile uint         _smr_deleted_thread_cnt;
>                                // Max time in millis to delete a thread.
>   static volatile uint         _smr_deleted_thread_time_max;
>                                // Cumulative time in millis to delete
> threads.
>   static volatile uint         _smr_deleted_thread_times;
>   static ThreadsList* volatile _smr_java_thread_list;
>   static ThreadsList*          get_smr_java_thread_list();
>   static ThreadsList* xchg_smr_java_thread_list(ThreadsList* new_list);
>                                // # of ThreadsLists allocated over VM
> lifetime.
>   static uint64_t              _smr_java_thread_list_alloc_cnt;
>                                // # of ThreadsLists freed over VM
> lifetime.
>   static uint64_t              _smr_java_thread_list_free_cnt;
>                                // Max size ThreadsList allocated.
>   static uint                  _smr_java_thread_list_max;
>                                // Max # of nested ThreadsLists for a
> thread.
>   static uint                  _smr_nested_thread_list_max;
>                                // # of ThreadsListHandles deleted over
> VM lifetime.
>   static volatile uint         _smr_tlh_cnt;
>                                // Max time in millis to delete a
> ThreadsListHandle.
>   static volatile jint         _smr_tlh_time_max;
>                                // Cumulative time in millis to delete
> ThreadsListHandles.
>   static volatile jint         _smr_tlh_times;
>   static ThreadsList*          _smr_to_delete_list;
>                                // # of parallel ThreadsLists on the
> to-delete list.
>   static uint                  _smr_to_delete_list_cnt;
>                                // Max # of parallel ThreadsLists on
> the to-delete list.
>   static uint                  _smr_to_delete_list_max;
>
>
> I've also gone through all the new '_cnt', '_max' and '_times" fields
> in thread.cpp and added an impl note to explain the choice of type:
>

Since this is in the cpp file and there is so much comment text, can you
just make it in column 1 and leave a blank line after each field.  The
indentation style is really hard to read and only useful if the comment
is short.

> // Safe Memory Reclamation (SMR) support:
> Monitor*              Threads::_smr_delete_lock =
>                           new Monitor(Monitor::special,
> "smr_delete_lock",
>                                       false /* allow_vm_block */,
> Monitor::_safepoint_check_never);
> // The '_cnt', '_max' and '_times" fields are enabled via
> // -XX:+EnableThreadSMRStatistics:
>                       // # of parallel threads in
> _smr_delete_lock->wait().
>                       // Impl note: Hard to imagine > 64K waiting
> threads so
>                       // this could be 16-bit, but there is no nice
> 16-bit
>                       // _FORMAT support.
> uint                  Threads::_smr_delete_lock_wait_cnt = 0;
>                       // Max # of parallel threads in
> _smr_delete_lock->wait().
>                       // Impl note: See _smr_delete_lock_wait_cnt note.
> uint                  Threads::_smr_delete_lock_wait_max = 0;
>                       // Flag to indicate when an
> _smr_delete_lock->notify() is needed.
>                       // Impl note: See _smr_delete_lock_wait_cnt note.
> volatile uint         Threads::_smr_delete_notify = 0;
>                       // # of threads deleted over VM lifetime.
>                       // Impl note: Atomically incremented over VM
> lifetime so
>                       // use unsigned for more range. Unsigned 64-bit
> would
>                       // be more future proof, but 64-bit atomic inc
> isn't
>                       // available everywhere (or is it?).
> volatile uint         Threads::_smr_deleted_thread_cnt = 0;
>                       // Max time in millis to delete a thread.
>                       // Impl note: 16-bit might be too small on an
> overloaded
>                       // machine. Use unsigned since this is a time
> value. Set
>                       // via Atomic::cmpxchg() in a loop for correctness.
> volatile uint         Threads::_smr_deleted_thread_time_max = 0;
>                       // Cumulative time in millis to delete threads.
>                       // Impl note: Atomically added to over VM
> lifetime so use
>                       // unsigned for more range. Unsigned 64-bit
> would be more
>                       // future proof, but 64-bit atomic inc isn't
> available
>                       // everywhere (or is it?).
> volatile uint         Threads::_smr_deleted_thread_times = 0;
> ThreadsList* volatile Threads::_smr_java_thread_list = new
> ThreadsList(0);
>                       // # of ThreadsLists allocated over VM lifetime.
>                       // Impl note: We allocate a new ThreadsList for
> every
>                       // thread create and every thread delete so we
> need a
>                       // bigger type than the _smr_deleted_thread_cnt
> field.
> uint64_t              Threads::_smr_java_thread_list_alloc_cnt = 1;
>                       // # of ThreadsLists freed over VM lifetime.
>                       // Impl note: See
> _smr_java_thread_list_alloc_cnt note.
> uint64_t              Threads::_smr_java_thread_list_free_cnt = 0;
>                       // Max size ThreadsList allocated.
>                       // Impl note: Max # of threads alive at one time
> should
>                       // fit in unsigned 32-bit.
> uint                  Threads::_smr_java_thread_list_max = 0;
>                       // Max # of nested ThreadsLists for a thread.
>                       // Impl note: Hard to imagine > 64K nested
> ThreadsLists
>                       // so his could be 16-bit, but there is no nice
> 16-bit
>                       // _FORMAT support.
> uint                  Threads::_smr_nested_thread_list_max = 0;
>                       // # of ThreadsListHandles deleted over VM
> lifetime.
>                       // Impl note: Atomically incremented over VM
> lifetime so
>                       // use unsigned for more range. There will be fewer
>                       // ThreadsListHandles than threads so unsigned
> 32-bit
>                       // should be fine.
> volatile uint         Threads::_smr_tlh_cnt = 0;
>                       // Max time in millis to delete a
> ThreadsListHandle.
>                       // Impl note: 16-bit might be too small on an
> overloaded
>                       // machine. Use unsigned since this is a time
> value. Set
>                       // via Atomic::cmpxchg() in a loop for correctness.
> volatile uint         Threads::_smr_tlh_time_max = 0;
>                       // Cumulative time in millis to delete
> ThreadsListHandles.
>                       // Impl note: Atomically added to over VM
> lifetime so use
>                       // unsigned for more range. Unsigned 64-bit
> would be more
>                       // future proof, but 64-bit atomic inc isn't
> available
>                       // everywhere (or is it?).
> volatile uint         Threads::_smr_tlh_times = 0;
> ThreadsList*          Threads::_smr_to_delete_list = NULL;
>                       // # of parallel ThreadsLists on the to-delete
> list.
>                       // Impl note: Hard to imagine > 64K ThreadsLists
> needing
>                       // to be deleted so this could be 16-bit, but
> there is no
>                       // nice 16-bit _FORMAT support.
> uint                  Threads::_smr_to_delete_list_cnt = 0;
>                       // Max # of parallel ThreadsLists on the
> to-delete list.
>                       // Impl note: See _smr_to_delete_list_cnt note.
> uint                  Threads::_smr_to_delete_list_max = 0;
>
>
> Yikes! With the additional comments, the additions to thread.hpp and
> thread.cpp grew by a bunch... I've done a test build build on my Mac
> with these changes and I'm about to kick off a Mach5 tier1 job...

Another reason why I agreed with some earlier comment that this should
be in a new file because it's a new thing. I was somewhat surprised that
it's not in threadSMR.{hpp,cpp}.   This refactoring could be deferred
but thread.cpp is too big!

thanks,
Coleen

>
> Dan
>
>
>
>>
>> ----
>>
>> On 19/11/2017 11:49 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Testing of the last round of changes revealed a hang in one of the new
>>> TLH tests. Robbin has fixed the hang, updated the existing TLH test,
>>> and
>>> added another TLH test for good measure.
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>
>>> Here is the updated delta webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>
>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>> no unexpected failures.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Robbin rebased the project last night/this morning to merge with
>>>> Thread
>>>> Local Handshakes (TLH) and also picked up additional changesets up
>>>> thru:
>>>>
>>>>> Changeset: fa736014cf28
>>>>> Author:    cjplummer
>>>>> Date:      2017-11-14 18:08 -0800
>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>
>>>>> 8191049: Add alternate version of pns() that is callable from
>>>>> within hotspot source
>>>>> Summary: added pns2() to debug.cpp
>>>>> Reviewed-by: stuefe, gthornbr
>>>>
>>>> This is the first time we've rebased the project to bits that are this
>>>> fresh (< 12 hours old at rebase time). We've done this because we
>>>> think
>>>> we're done with this project and are in the final review-change-retest
>>>> cycle(s)... In other words, we're not planning on making any more
>>>> major
>>>> changes for this project... :-)
>>>>
>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>
>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>
>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>> (and the new baseline also)...
>>>>
>>>> We're expecting this round to be a little noisier than usual because
>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>> (Yes, we're playing chase-the-repo...)
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>
>>>>> Unlike the previous rebase, there were no changes required to the
>>>>> open code to get the rebased bits to build so there is no delta
>>>>> webrev.
>>>>>
>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>> Mach5 tier[1-5] test run...
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>
>>>>>> Here are the updated webrevs:
>>>>>>
>>>>>> Here's the mq comment for the change:
>>>>>>
>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>
>>>>>> Here is the full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>
>>>>>> And here is the delta webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>
>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the
>>>>>> holiday
>>>>>> weekend. Didn't see any issues in a quick look. Going to take a
>>>>>> closer
>>>>>> look today.
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> Resolving one of the code review comments (from both Stefan K
>>>>>>> and Coleen)
>>>>>>> on jdk10-04-full required quite a few changes so it is being
>>>>>>> done as a
>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>
>>>>>>> Here's the mq comment for the change:
>>>>>>>
>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage
>>>>>>> to use
>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>
>>>>>>> Here is the full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>
>>>>>>> And here is the delta webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>
>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>
>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>
>>>>>>>> Here's a PDF for the internal wiki that we've been using to
>>>>>>>> describe
>>>>>>>> and track the work on this project:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>
>>>>>>>>
>>>>>>>> Dan has noticed that the indenting is wrong in some of the code
>>>>>>>> quotes
>>>>>>>> in the PDF that are not present in the internal wiki. We don't
>>>>>>>> have a
>>>>>>>> solution for that problem yet.
>>>>>>>>
>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>
>>>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>>>> tier[2-5]
>>>>>>>> testing, additional stress testing on Dan's Solaris X64 server,
>>>>>>>> and
>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Daniel Daugherty
>>>>>>>> Erik Osterlund
>>>>>>>> Robbin Ehn
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore
In reply to this post by Daniel D. Daugherty


On 11/22/17 7:51 AM, Daniel D. Daugherty wrote:

> On 11/21/17 11:26 PM, David Holmes wrote:
>> Hi Dan,
>>
>> On 22/11/2017 10:12 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> *** We are wrapping up code review on this project so it is time ***
>>> *** for the various code reviewers to chime in one last time. ***
>>>
>>> In this latest round, we had three different reviewers chime in so
>>> we're
>>> doing delta webrevs for each of those resolutions:
>>>
>>> David H's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.0-delta/ 
>>>
>>>
>>>    mq comment for dholmes_CR:
>>>      - Fix indents, trailing spaces and various typos.
>>>      - Add descriptions for the '_cnt', '_max' and '_times" fields,
>>>        add impl notes to document the type choices.
>>
>> src/hotspot/share/runtime/thread.cpp
>>
>> 3466 // range. Unsigned 64-bit would be more future proof, but 64-bit
>> atomic inc
>> 3467 // isn't available everywhere (or is it?).
>>
>> 64-bit atomics should be available on all currently supported
>> platforms (the last time this was an issue was for PPC32 - and the
>> atomic templates should have filled in any previous holes in the
>> allowed types).
>
> Hmmm... I ran into this when I merged the project with the atomic
> templates. I got a JPRT build failure on one of the ARM platforms...
> Unfortunately, I didn't save the log files so I can't quote the
> error message from the template stuff...
>
>
>> But if it's always 64-bit then you'd have to use Atomic::load to
>> allow for 32-bit platforms.
>
> What's the official story on 32-bit platforms? Is that what
> bit me in JPRT? i.e., did we still have a 32-bit ARM build
> config'ed in JPRT a month or two ago???
>
>
>> These counts etc are all just for statistics so we aren't really
>> concerned about eventual overflows in long running VMs - right?
>
> Yup these are just statistics... overflow is not a killer.
>
>
>>
>> ---
>>
>>                                        // # of parallel threads in
>> _smr_delete_lock->wait().
>>   static uint                  _smr_delete_lock_wait_cnt;
>>                                        // Max # of parallel threads
>> in _smr_delete_lock->wait().
>>
>>
>> why are the comments indented like that? I thought this is what
>> Coleen previously commented on. Please just start the comments in the
>> first column - no need for any indent. Thanks.
>
> Coleen asked for the new comments in thread.cpp to be moved
> over to column 1. She asked for the new comments in thread.hpp
> to be indented with more spaces. I started with 2 spaces and
> the variables still didn't stand out. I tried 4 spaces and
> they still didn't stand out probably because of the leading
> _smr_... So I went with 8 spaces...

Since you have the same but more indepth comments in the .cpp file, and
at least for my sampling the comments in the .hpp file look redundant, I
think you should not have the same comments in the .hpp file.   They're
really values that the implementation uses, not part of the interface. 
My vote is to remove the comments from the .hpp file.

Coleen

>
>
>>
>>>    src/hotspot/share/runtime/globals.hpp change is white-space only
>>> so it
>>>    is only visible via the file's patch link.
>>>
>>>
>>> Robin W's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.0,10.10.1-delta/ 
>>>
>>>
>>>    mq comment for robinw_CR:
>>>      - Fix some inefficient code, update some comments, fix some
>>> indents,
>>>        and add some 'const' specifiers.
>>>
>>>    src/hotspot/share/runtime/vm_operations.hpp change is white-space
>>> only so
>>>    it is only visible via the file's patch link.
>>
>> All looks fine to me. Some good catches there from Robin!
>
> Yes, a new pair of eyes on this code is always useful.
>
>
>>
>>> Coleen's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.1,10.10.2-delta/ 
>>>
>>>
>>>    mq comment for coleenp_CR:
>>>      - Change ThreadsList::_threads from 'mtGC' -> 'mtThread',
>>>      - add header comment to threadSMR.hpp file,
>>>      - cleanup JavaThreadIteratorWithHandle ctr,
>>>      - make ErrorHandling more efficient.
>>
>> src/hotspot/share/runtime/threadSMR.hpp
>>
>> + // Thread Safe Memory Reclaimation (Thread-SMR) support.
>>
>> Only one 'i' in Reclamation :)
>
> Will fix. I tried typing both and neither looked right to me.
> (I might be getting blurry eyed with this stuff)...
> So I went with the spelling from Coleen's comment... :-)
>
> Thanks for taking another look!!
>
> Dan
>
>
>>
>> Thanks,
>> David
>> ------
>>
>>>
>>> Some folks prefer to see all of the delta webrevs together so here is
>>> a delta webrev relative to jdk10-09-full:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.2-delta/ 
>>>
>>>
>>>    src/hotspot/share/runtime/globals.hpp and
>>>    src/hotspot/share/runtime/vm_operations.hpp changes are
>>> white-space only
>>>    so they are only visible via the file's patch link.
>>>
>>>
>>> And, of course, some folks prefer to see everything all at once so here
>>> is the latest full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10-full/
>>>
>>>
>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing. The
>>> prelim Mach5 tier1 (JPRT equivalent) on these bits passed just fine...
>>>
>>> The project is currently baselined on Jesper's 2017-11-17 PIT snapshot
>>> so there will be some catching up to do before integration...
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Testing of the last round of changes revealed a hang in one of the new
>>>> TLH tests. Robbin has fixed the hang, updated the existing TLH
>>>> test, and
>>>> added another TLH test for good measure.
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>>
>>>> Here is the updated delta webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>>
>>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>>> no unexpected failures.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Robbin rebased the project last night/this morning to merge with
>>>>> Thread
>>>>> Local Handshakes (TLH) and also picked up additional changesets up
>>>>> thru:
>>>>>
>>>>>> Changeset: fa736014cf28
>>>>>> Author:    cjplummer
>>>>>> Date:      2017-11-14 18:08 -0800
>>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>>
>>>>>> 8191049: Add alternate version of pns() that is callable from
>>>>>> within hotspot source
>>>>>> Summary: added pns2() to debug.cpp
>>>>>> Reviewed-by: stuefe, gthornbr
>>>>>
>>>>> This is the first time we've rebased the project to bits that are
>>>>> this
>>>>> fresh (< 12 hours old at rebase time). We've done this because we
>>>>> think
>>>>> we're done with this project and are in the final
>>>>> review-change-retest
>>>>> cycle(s)... In other words, we're not planning on making any more
>>>>> major
>>>>> changes for this project... :-)
>>>>>
>>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>>
>>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>>
>>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>>> (and the new baseline also)...
>>>>>
>>>>> We're expecting this round to be a little noisier than usual because
>>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>>> (Yes, we're playing chase-the-repo...)
>>>>>>
>>>>>> Here is the updated full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>>
>>>>>> Unlike the previous rebase, there were no changes required to the
>>>>>> open code to get the rebased bits to build so there is no delta
>>>>>> webrev.
>>>>>>
>>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>>> Mach5 tier[1-5] test run...
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>>
>>>>>>> Here are the updated webrevs:
>>>>>>>
>>>>>>> Here's the mq comment for the change:
>>>>>>>
>>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>>
>>>>>>> Here is the full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>>
>>>>>>> And here is the delta webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>>
>>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the
>>>>>>> holiday
>>>>>>> weekend. Didn't see any issues in a quick look. Going to take a
>>>>>>> closer
>>>>>>> look today.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Resolving one of the code review comments (from both Stefan K
>>>>>>>> and Coleen)
>>>>>>>> on jdk10-04-full required quite a few changes so it is being
>>>>>>>> done as a
>>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>>
>>>>>>>> Here's the mq comment for the change:
>>>>>>>>
>>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage
>>>>>>>> to use
>>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>>
>>>>>>>> Here is the full webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>>
>>>>>>>> And here is the delta webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Dan, Erik and Robbin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>>
>>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>>
>>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>>
>>>>>>>>> Here's a PDF for the internal wiki that we've been using to
>>>>>>>>> describe
>>>>>>>>> and track the work on this project:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan has noticed that the indenting is wrong in some of the
>>>>>>>>> code quotes
>>>>>>>>> in the PDF that are not present in the internal wiki. We don't
>>>>>>>>> have a
>>>>>>>>> solution for that problem yet.
>>>>>>>>>
>>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>>
>>>>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>>>>> tier[2-5]
>>>>>>>>> testing, additional stress testing on Dan's Solaris X64
>>>>>>>>> server, and
>>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>>
>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>
>>>>>>>>> Daniel Daugherty
>>>>>>>>> Erik Osterlund
>>>>>>>>> Robbin Ehn
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Gerald Thornbrugh
In reply to this post by Daniel D. Daugherty
Hi Dan,

Your changes look good.

Jerry

> On Nov 21, 2017, at 5:12 PM, Daniel D. Daugherty <[hidden email]> wrote:
>
> Greetings,
>
> *** We are wrapping up code review on this project so it is time ***
> *** for the various code reviewers to chime in one last time. ***
>
> In this latest round, we had three different reviewers chime in so we're
> doing delta webrevs for each of those resolutions:
>
> David H's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.0-delta/
>
>   mq comment for dholmes_CR:
>     - Fix indents, trailing spaces and various typos.
>     - Add descriptions for the '_cnt', '_max' and '_times" fields,
>       add impl notes to document the type choices.
>
>   src/hotspot/share/runtime/globals.hpp change is white-space only so it
>   is only visible via the file's patch link.
>
>
> Robin W's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.0,10.10.1-delta/
>
>   mq comment for robinw_CR:
>     - Fix some inefficient code, update some comments, fix some indents,
>       and add some 'const' specifiers.
>
>   src/hotspot/share/runtime/vm_operations.hpp change is white-space only so
>   it is only visible via the file's patch link.
>
>
> Coleen's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.1,10.10.2-delta/
>
>   mq comment for coleenp_CR:
>     - Change ThreadsList::_threads from 'mtGC' -> 'mtThread',
>     - add header comment to threadSMR.hpp file,
>     - cleanup JavaThreadIteratorWithHandle ctr,
>     - make ErrorHandling more efficient.
>
>
> Some folks prefer to see all of the delta webrevs together so here is
> a delta webrev relative to jdk10-09-full:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.2-delta/
>
>   src/hotspot/share/runtime/globals.hpp and
>   src/hotspot/share/runtime/vm_operations.hpp changes are white-space only
>   so they are only visible via the file's patch link.
>
>
> And, of course, some folks prefer to see everything all at once so here
> is the latest full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10-full/
>
>
> Dan has submitted the bits for the usual Mach5 tier[1-5] testing. The
> prelim Mach5 tier1 (JPRT equivalent) on these bits passed just fine...
>
> The project is currently baselined on Jesper's 2017-11-17 PIT snapshot
> so there will be some catching up to do before integration...
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Testing of the last round of changes revealed a hang in one of the new
>> TLH tests. Robbin has fixed the hang, updated the existing TLH test, and
>> added another TLH test for good measure.
>>
>> Here is the updated full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>
>> Here is the updated delta webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>
>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>> no unexpected failures.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Robbin rebased the project last night/this morning to merge with Thread
>>> Local Handshakes (TLH) and also picked up additional changesets up thru:
>>>
>>>> Changeset: fa736014cf28
>>>> Author:    cjplummer
>>>> Date:      2017-11-14 18:08 -0800
>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>
>>>> 8191049: Add alternate version of pns() that is callable from within hotspot source
>>>> Summary: added pns2() to debug.cpp
>>>> Reviewed-by: stuefe, gthornbr
>>>
>>> This is the first time we've rebased the project to bits that are this
>>> fresh (< 12 hours old at rebase time). We've done this because we think
>>> we're done with this project and are in the final review-change-retest
>>> cycle(s)... In other words, we're not planning on making any more major
>>> changes for this project... :-)
>>>
>>> *** Time for code reviewers to chime in on this thread! ***
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>
>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>
>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>> (and the new baseline also)...
>>>
>>> We're expecting this round to be a little noisier than usual because
>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>> (Yes, we're playing chase-the-repo...)
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>
>>>> Unlike the previous rebase, there were no changes required to the
>>>> open code to get the rebased bits to build so there is no delta
>>>> webrev.
>>>>
>>>> These bits have been run through JPRT and I've submitted the usual
>>>> Mach5 tier[1-5] test run...
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>
>>>>> Here are the updated webrevs:
>>>>>
>>>>> Here's the mq comment for the change:
>>>>>
>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>
>>>>> Here is the full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>
>>>>> And here is the delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>
>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the holiday
>>>>> weekend. Didn't see any issues in a quick look. Going to take a closer
>>>>> look today.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Resolving one of the code review comments (from both Stefan K and Coleen)
>>>>>> on jdk10-04-full required quite a few changes so it is being done as a
>>>>>> standalone patch and corresponding webrevs:
>>>>>>
>>>>>> Here's the mq comment for the change:
>>>>>>
>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>
>>>>>> Here is the full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>
>>>>>> And here is the delta webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>
>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>
>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>
>>>>>>> Here's a PDF for the internal wiki that we've been using to describe
>>>>>>> and track the work on this project:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>
>>>>>>> Dan has noticed that the indenting is wrong in some of the code quotes
>>>>>>> in the PDF that are not present in the internal wiki. We don't have a
>>>>>>> solution for that problem yet.
>>>>>>>
>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>
>>>>>>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>>>>>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Daniel Daugherty
>>>>>>> Erik Osterlund
>>>>>>> Robbin Ehn
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

David Holmes
In reply to this post by Daniel D. Daugherty
Just for the record ...

On 23/11/2017 6:20 PM, Robbin Ehn wrote:
> Thanks Dan for dragging this freight train to the docks, it's time to
> ship it!

I agree. The latest delta seems fine to me.

> Created follow-up bug:
> 8191809: Threads::number_of_threads() is not 'MT-safe'
> https://bugs.openjdk.java.net/browse/JDK-8191809

Just updated that bug - I don't see any MT issues there. :)

Cheers,
David

PS. Dan: yes JPRT was still doing 32-bit ARM a "month or two back".
64-bit atomics should not be a concern. That said I thought all the
atomic updates were done for ARM etc.

> /Robbin
>
> On 2017-11-23 03:16, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've made minor tweaks to the Thread-SMR project based on the remaining
>> code review comments over the last couple of days. I've also rebased the
>> project to jdk/hs bits as of mid-afternoon (my TZ) on 2017.11.22. I'm
>> running baseline Mach5 Tier[1-5] testing and prototype Mach5 Tier[1-5]
>> testing...
>>
>> Here's a delta webrev for the latest round of tweaks:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-11-delta
>>
>>
>> Here's the proposed commit message:
>>
>> $ cat SMR_prototype_10/open/commit.txt
>> 8167108: inconsistent handling of SR_lock can lead to crashes
>> Summary: Add Thread Safe Memory Reclamation (Thread-SMR) mechanism.
>> Reviewed-by: coleenp, dcubed, dholmes, eosterlund, gthornbr, kbarrett,
>> rehn, sspitsyn, stefank
>> Contributed-by: [hidden email],
>> [hidden email], [hidden email]
>>
>> I've gone through 880+ emails in my archive for this project and added
>> anyone that made a code review comment. Reviewers are not in my usual
>> by-comment-posting order; it was way too hard to figure that out... So
>> reviewers and contributors are simply in alpha-sort order...
>>
>> Here's the collection of follow-up bugs that I filed based on sifting
>> through the email archive:
>>
>> 8191784 JavaThread::collect_counters() should stop using Threads_lock
>> https://bugs.openjdk.java.net/browse/JDK-8191784
>>
>> 8191785 revoke_bias() needs a new assertion
>> https://bugs.openjdk.java.net/browse/JDK-8191785
>>
>> 8191786 Thread-SMR hash table size should be dynamic
>> https://bugs.openjdk.java.net/browse/JDK-8191786
>>
>> 8191787 move private inline functions from thread.inline.hpp ->
>> thread.cpp
>> https://bugs.openjdk.java.net/browse/JDK-8191787
>>
>> 8191789 migrate more Thread-SMR stuff from thread.[ch]pp ->
>> threadSMR.[ch]pp
>> https://bugs.openjdk.java.net/browse/JDK-8191789
>>
>> 8191798 redo nested ThreadsListHandle to drop Threads_lock
>> https://bugs.openjdk.java.net/browse/JDK-8191789
>>
>> I have undoubtedly missed at least one future idea that someone had
>> and for that my apologies...
>>
>> Thanks again to everyone who contributed to the Thread-SMR project!!
>>
>> Special thanks goes to Karen Kinnear, Kim Barrett and David Holmes for
>> their rigorous review of the design that we documented on the wiki.
>> Thanks for keeping us honest!
>>
>> I also have to thank my co-contributor Erik Österlund for starting the
>> ball rolling on this crazy project with his presentation on Safe Memory
>> Reclamation, for the initial prototype and for all your patches. Erik,
>> hopefully we got far enough in your crazy vision... :-)
>>
>> Thanks to my co-contributor Robbin Ehn for proposing that we do early
>> code reviews in the form of patches. Don't like something? Fix it with
>> a patch and a new test if appropriate!! A pretty cool way to work that
>> was new to me.
>>
>> Finally, many thanks to Jerry Thornbrugh for all the early code reviews,
>> whiteboard chats and phone calls. Thanks for asking the right questions
>> and pointing out some places where our logic was faulty. Also thanks for
>> keeping me sane when I was literally on my own in Monument, CO.
>>
>> Dan
>>
>>
>> On 11/21/17 7:12 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> *** We are wrapping up code review on this project so it is time ***
>>> *** for the various code reviewers to chime in one last time. ***
>>>
>>> In this latest round, we had three different reviewers chime in so we're
>>> doing delta webrevs for each of those resolutions:
>>>
>>> David H's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.0-delta/ 
>>>
>>>
>>>   mq comment for dholmes_CR:
>>>     - Fix indents, trailing spaces and various typos.
>>>     - Add descriptions for the '_cnt', '_max' and '_times" fields,
>>>       add impl notes to document the type choices.
>>>
>>>   src/hotspot/share/runtime/globals.hpp change is white-space only so it
>>>   is only visible via the file's patch link.
>>>
>>>
>>> Robin W's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.0,10.10.1-delta/ 
>>>
>>>
>>>   mq comment for robinw_CR:
>>>     - Fix some inefficient code, update some comments, fix some indents,
>>>       and add some 'const' specifiers.
>>>
>>>   src/hotspot/share/runtime/vm_operations.hpp change is white-space
>>> only so
>>>   it is only visible via the file's patch link.
>>>
>>>
>>> Coleen's resolutions:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.1,10.10.2-delta/ 
>>>
>>>
>>>   mq comment for coleenp_CR:
>>>     - Change ThreadsList::_threads from 'mtGC' -> 'mtThread',
>>>     - add header comment to threadSMR.hpp file,
>>>     - cleanup JavaThreadIteratorWithHandle ctr,
>>>     - make ErrorHandling more efficient.
>>>
>>>
>>> Some folks prefer to see all of the delta webrevs together so here is
>>> a delta webrev relative to jdk10-09-full:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.2-delta/ 
>>>
>>>
>>>   src/hotspot/share/runtime/globals.hpp and
>>>   src/hotspot/share/runtime/vm_operations.hpp changes are white-space
>>> only
>>>   so they are only visible via the file's patch link.
>>>
>>>
>>> And, of course, some folks prefer to see everything all at once so here
>>> is the latest full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10-full/
>>>
>>>
>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing. The
>>> prelim Mach5 tier1 (JPRT equivalent) on these bits passed just fine...
>>>
>>> The project is currently baselined on Jesper's 2017-11-17 PIT snapshot
>>> so there will be some catching up to do before integration...
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Testing of the last round of changes revealed a hang in one of the new
>>>> TLH tests. Robbin has fixed the hang, updated the existing TLH test,
>>>> and
>>>> added another TLH test for good measure.
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>>
>>>> Here is the updated delta webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>>
>>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>>> no unexpected failures.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Robbin rebased the project last night/this morning to merge with
>>>>> Thread
>>>>> Local Handshakes (TLH) and also picked up additional changesets up
>>>>> thru:
>>>>>
>>>>>> Changeset: fa736014cf28
>>>>>> Author:    cjplummer
>>>>>> Date:      2017-11-14 18:08 -0800
>>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>>
>>>>>> 8191049: Add alternate version of pns() that is callable from
>>>>>> within hotspot source
>>>>>> Summary: added pns2() to debug.cpp
>>>>>> Reviewed-by: stuefe, gthornbr
>>>>>
>>>>> This is the first time we've rebased the project to bits that are this
>>>>> fresh (< 12 hours old at rebase time). We've done this because we
>>>>> think
>>>>> we're done with this project and are in the final review-change-retest
>>>>> cycle(s)... In other words, we're not planning on making any more
>>>>> major
>>>>> changes for this project... :-)
>>>>>
>>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>>
>>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>>
>>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>>> (and the new baseline also)...
>>>>>
>>>>> We're expecting this round to be a little noisier than usual because
>>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>>> (Yes, we're playing chase-the-repo...)
>>>>>>
>>>>>> Here is the updated full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>>
>>>>>> Unlike the previous rebase, there were no changes required to the
>>>>>> open code to get the rebased bits to build so there is no delta
>>>>>> webrev.
>>>>>>
>>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>>> Mach5 tier[1-5] test run...
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>>
>>>>>>> Here are the updated webrevs:
>>>>>>>
>>>>>>> Here's the mq comment for the change:
>>>>>>>
>>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>>
>>>>>>> Here is the full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>>
>>>>>>> And here is the delta webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>>
>>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the
>>>>>>> holiday
>>>>>>> weekend. Didn't see any issues in a quick look. Going to take a
>>>>>>> closer
>>>>>>> look today.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Resolving one of the code review comments (from both Stefan K
>>>>>>>> and Coleen)
>>>>>>>> on jdk10-04-full required quite a few changes so it is being
>>>>>>>> done as a
>>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>>
>>>>>>>> Here's the mq comment for the change:
>>>>>>>>
>>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage
>>>>>>>> to use
>>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>>
>>>>>>>> Here is the full webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>>
>>>>>>>> And here is the delta webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Dan, Erik and Robbin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>>
>>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>>
>>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>>
>>>>>>>>> Here's a PDF for the internal wiki that we've been using to
>>>>>>>>> describe
>>>>>>>>> and track the work on this project:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan has noticed that the indenting is wrong in some of the code
>>>>>>>>> quotes
>>>>>>>>> in the PDF that are not present in the internal wiki. We don't
>>>>>>>>> have a
>>>>>>>>> solution for that problem yet.
>>>>>>>>>
>>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>>
>>>>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>>>>> tier[2-5]
>>>>>>>>> testing, additional stress testing on Dan's Solaris X64 server,
>>>>>>>>> and
>>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>>
>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>
>>>>>>>>> Daniel Daugherty
>>>>>>>>> Erik Osterlund
>>>>>>>>> Robbin Ehn
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty
On 11/28/17 3:01 AM, David Holmes wrote:
> Just for the record ...
>
> On 23/11/2017 6:20 PM, Robbin Ehn wrote:
>> Thanks Dan for dragging this freight train to the docks, it's time to
>> ship it!
>
> I agree. The latest delta seems fine to me.

Thanks!


>
>> Created follow-up bug:
>> 8191809: Threads::number_of_threads() is not 'MT-safe'
>> https://bugs.openjdk.java.net/browse/JDK-8191809
>
> Just updated that bug - I don't see any MT issues there. :)
>
> Cheers,
> David
>
> PS. Dan: yes JPRT was still doing 32-bit ARM a "month or two back".
> 64-bit atomics should not be a concern. That said I thought all the
> atomic updates were done for ARM etc.

The atomic template updates were done for ARM. It was the new
template stuff that gave me the error about a match not being
available for one of the operations on a 64-bit field. I'm
going to guess that pre-template, we would have gotten a
runtime error or something...

I really wish I had saved that JPRT build log... :-(

Dan


>
>> /Robbin
>>
>> On 2017-11-23 03:16, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I've made minor tweaks to the Thread-SMR project based on the remaining
>>> code review comments over the last couple of days. I've also rebased
>>> the
>>> project to jdk/hs bits as of mid-afternoon (my TZ) on 2017.11.22. I'm
>>> running baseline Mach5 Tier[1-5] testing and prototype Mach5 Tier[1-5]
>>> testing...
>>>
>>> Here's a delta webrev for the latest round of tweaks:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-11-delta
>>>
>>>
>>> Here's the proposed commit message:
>>>
>>> $ cat SMR_prototype_10/open/commit.txt
>>> 8167108: inconsistent handling of SR_lock can lead to crashes
>>> Summary: Add Thread Safe Memory Reclamation (Thread-SMR) mechanism.
>>> Reviewed-by: coleenp, dcubed, dholmes, eosterlund, gthornbr,
>>> kbarrett, rehn, sspitsyn, stefank
>>> Contributed-by: [hidden email],
>>> [hidden email], [hidden email]
>>>
>>> I've gone through 880+ emails in my archive for this project and added
>>> anyone that made a code review comment. Reviewers are not in my usual
>>> by-comment-posting order; it was way too hard to figure that out... So
>>> reviewers and contributors are simply in alpha-sort order...
>>>
>>> Here's the collection of follow-up bugs that I filed based on sifting
>>> through the email archive:
>>>
>>> 8191784 JavaThread::collect_counters() should stop using Threads_lock
>>> https://bugs.openjdk.java.net/browse/JDK-8191784
>>>
>>> 8191785 revoke_bias() needs a new assertion
>>> https://bugs.openjdk.java.net/browse/JDK-8191785
>>>
>>> 8191786 Thread-SMR hash table size should be dynamic
>>> https://bugs.openjdk.java.net/browse/JDK-8191786
>>>
>>> 8191787 move private inline functions from thread.inline.hpp ->
>>> thread.cpp
>>> https://bugs.openjdk.java.net/browse/JDK-8191787
>>>
>>> 8191789 migrate more Thread-SMR stuff from thread.[ch]pp ->
>>> threadSMR.[ch]pp
>>> https://bugs.openjdk.java.net/browse/JDK-8191789
>>>
>>> 8191798 redo nested ThreadsListHandle to drop Threads_lock
>>> https://bugs.openjdk.java.net/browse/JDK-8191789
>>>
>>> I have undoubtedly missed at least one future idea that someone had
>>> and for that my apologies...
>>>
>>> Thanks again to everyone who contributed to the Thread-SMR project!!
>>>
>>> Special thanks goes to Karen Kinnear, Kim Barrett and David Holmes for
>>> their rigorous review of the design that we documented on the wiki.
>>> Thanks for keeping us honest!
>>>
>>> I also have to thank my co-contributor Erik Österlund for starting the
>>> ball rolling on this crazy project with his presentation on Safe Memory
>>> Reclamation, for the initial prototype and for all your patches. Erik,
>>> hopefully we got far enough in your crazy vision... :-)
>>>
>>> Thanks to my co-contributor Robbin Ehn for proposing that we do early
>>> code reviews in the form of patches. Don't like something? Fix it with
>>> a patch and a new test if appropriate!! A pretty cool way to work that
>>> was new to me.
>>>
>>> Finally, many thanks to Jerry Thornbrugh for all the early code
>>> reviews,
>>> whiteboard chats and phone calls. Thanks for asking the right questions
>>> and pointing out some places where our logic was faulty. Also thanks
>>> for
>>> keeping me sane when I was literally on my own in Monument, CO.
>>>
>>> Dan
>>>
>>>
>>> On 11/21/17 7:12 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> *** We are wrapping up code review on this project so it is time ***
>>>> *** for the various code reviewers to chime in one last time. ***
>>>>
>>>> In this latest round, we had three different reviewers chime in so
>>>> we're
>>>> doing delta webrevs for each of those resolutions:
>>>>
>>>> David H's resolutions:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.0-delta/ 
>>>>
>>>>
>>>>   mq comment for dholmes_CR:
>>>>     - Fix indents, trailing spaces and various typos.
>>>>     - Add descriptions for the '_cnt', '_max' and '_times" fields,
>>>>       add impl notes to document the type choices.
>>>>
>>>>   src/hotspot/share/runtime/globals.hpp change is white-space only
>>>> so it
>>>>   is only visible via the file's patch link.
>>>>
>>>>
>>>> Robin W's resolutions:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.0,10.10.1-delta/ 
>>>>
>>>>
>>>>   mq comment for robinw_CR:
>>>>     - Fix some inefficient code, update some comments, fix some
>>>> indents,
>>>>       and add some 'const' specifiers.
>>>>
>>>>   src/hotspot/share/runtime/vm_operations.hpp change is white-space
>>>> only so
>>>>   it is only visible via the file's patch link.
>>>>
>>>>
>>>> Coleen's resolutions:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.1,10.10.2-delta/ 
>>>>
>>>>
>>>>   mq comment for coleenp_CR:
>>>>     - Change ThreadsList::_threads from 'mtGC' -> 'mtThread',
>>>>     - add header comment to threadSMR.hpp file,
>>>>     - cleanup JavaThreadIteratorWithHandle ctr,
>>>>     - make ErrorHandling more efficient.
>>>>
>>>>
>>>> Some folks prefer to see all of the delta webrevs together so here is
>>>> a delta webrev relative to jdk10-09-full:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.2-delta/ 
>>>>
>>>>
>>>>   src/hotspot/share/runtime/globals.hpp and
>>>>   src/hotspot/share/runtime/vm_operations.hpp changes are
>>>> white-space only
>>>>   so they are only visible via the file's patch link.
>>>>
>>>>
>>>> And, of course, some folks prefer to see everything all at once so
>>>> here
>>>> is the latest full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10-full/
>>>>
>>>>
>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing. The
>>>> prelim Mach5 tier1 (JPRT equivalent) on these bits passed just fine...
>>>>
>>>> The project is currently baselined on Jesper's 2017-11-17 PIT snapshot
>>>> so there will be some catching up to do before integration...
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Testing of the last round of changes revealed a hang in one of the
>>>>> new
>>>>> TLH tests. Robbin has fixed the hang, updated the existing TLH
>>>>> test, and
>>>>> added another TLH test for good measure.
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>>>
>>>>> Here is the updated delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>>>
>>>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>>>> no unexpected failures.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Robbin rebased the project last night/this morning to merge with
>>>>>> Thread
>>>>>> Local Handshakes (TLH) and also picked up additional changesets
>>>>>> up thru:
>>>>>>
>>>>>>> Changeset: fa736014cf28
>>>>>>> Author:    cjplummer
>>>>>>> Date:      2017-11-14 18:08 -0800
>>>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>>>
>>>>>>> 8191049: Add alternate version of pns() that is callable from
>>>>>>> within hotspot source
>>>>>>> Summary: added pns2() to debug.cpp
>>>>>>> Reviewed-by: stuefe, gthornbr
>>>>>>
>>>>>> This is the first time we've rebased the project to bits that are
>>>>>> this
>>>>>> fresh (< 12 hours old at rebase time). We've done this because we
>>>>>> think
>>>>>> we're done with this project and are in the final
>>>>>> review-change-retest
>>>>>> cycle(s)... In other words, we're not planning on making any more
>>>>>> major
>>>>>> changes for this project... :-)
>>>>>>
>>>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>>>
>>>>>> Here is the updated full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>>>
>>>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>>>
>>>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>>>> (and the new baseline also)...
>>>>>>
>>>>>> We're expecting this round to be a little noisier than usual because
>>>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>>>> through Jesper's usual care-and-feeding of integration_blockers,
>>>>>> etc.
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>>>> (Yes, we're playing chase-the-repo...)
>>>>>>>
>>>>>>> Here is the updated full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>>>
>>>>>>> Unlike the previous rebase, there were no changes required to the
>>>>>>> open code to get the rebased bits to build so there is no delta
>>>>>>> webrev.
>>>>>>>
>>>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>>>> Mach5 tier[1-5] test run...
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>>>
>>>>>>>> Here are the updated webrevs:
>>>>>>>>
>>>>>>>> Here's the mq comment for the change:
>>>>>>>>
>>>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>>>
>>>>>>>> Here is the full webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>>>
>>>>>>>> And here is the delta webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>>>
>>>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the
>>>>>>>> holiday
>>>>>>>> weekend. Didn't see any issues in a quick look. Going to take a
>>>>>>>> closer
>>>>>>>> look today.
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Dan, Erik and Robbin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> Resolving one of the code review comments (from both Stefan K
>>>>>>>>> and Coleen)
>>>>>>>>> on jdk10-04-full required quite a few changes so it is being
>>>>>>>>> done as a
>>>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>>>
>>>>>>>>> Here's the mq comment for the change:
>>>>>>>>>
>>>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage
>>>>>>>>> to use
>>>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>>>
>>>>>>>>> Here is the full webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>>>
>>>>>>>>> And here is the delta webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>>>
>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>
>>>>>>>>> Dan, Erik and Robbin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>>>
>>>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>>>
>>>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>>>
>>>>>>>>>> Here's a PDF for the internal wiki that we've been using to
>>>>>>>>>> describe
>>>>>>>>>> and track the work on this project:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dan has noticed that the indenting is wrong in some of the
>>>>>>>>>> code quotes
>>>>>>>>>> in the PDF that are not present in the internal wiki. We
>>>>>>>>>> don't have a
>>>>>>>>>> solution for that problem yet.
>>>>>>>>>>
>>>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>>>
>>>>>>>>>> This fix has been run through many rounds of JPRT and Mach5
>>>>>>>>>> tier[2-5]
>>>>>>>>>> testing, additional stress testing on Dan's Solaris X64
>>>>>>>>>> server, and
>>>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>>>
>>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>>
>>>>>>>>>> Daniel Daugherty
>>>>>>>>>> Erik Osterlund
>>>>>>>>>> Robbin Ehn
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>