Quantcast

[9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
Hi all,

  can I have reviews for this small fix that changes a guarantee() in
parallel gc when some logging buffer overflows into a warning message
about the overflow and (continued) reuse of the last entry of that
buffer so that the user then can then rerun with a larger buffer.

Since it is so late in the release cycle I would like to keep the fix
simple instead of rewriting the buffer logic, i.e. make it expandable
etc. 

This change has been mostly contributed by Ramki from Twitter, as part
of JDK-7180571; however we just got this bug from release testing that
is a duplicate, and he's unavailable to do an RFR right now, and I want
this fixed asap. I intend to put him as author for this change.

CR:
https://bugs.openjdk.java.net/browse/JDK-8177963
Webrev:
http://cr.openjdk.java.net/~tschatzl/8177963/webrev/
Testing:
local testing, new test, jprt

Thanks,
  Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Kim Barrett
> On Apr 4, 2017, at 5:48 AM, Thomas Schatzl <[hidden email]> wrote:
>
> Hi all,
>
>   can I have reviews for this small fix that changes a guarantee() in
> parallel gc when some logging buffer overflows into a warning message
> about the overflow and (continued) reuse of the last entry of that
> buffer so that the user then can then rerun with a larger buffer.
>
> Since it is so late in the release cycle I would like to keep the fix
> simple instead of rewriting the buffer logic, i.e. make it expandable
> etc.
>
> This change has been mostly contributed by Ramki from Twitter, as part
> of JDK-7180571; however we just got this bug from release testing that
> is a duplicate, and he's unavailable to do an RFR right now, and I want
> this fixed asap. I intend to put him as author for this change.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8177963
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/
> Testing:
> local testing, new test, jprt
>
> Thanks,
>   Thomas

I'm surprised the type of the "name" argument for add_task_timestamp
is not "const char*".

I dislike the casts of GCTaskTimeStampEntries to uint (from uintx),
introduced so it can be MIN2'd with _time_stamp_index. Could
GCTaskTimeStampEntries be given a type of uint? Certainly the code
doesn't really support values that won't fit in uint, since
_time_stamp_index will overflow.

I think it would be helpful for print_task_time_stamps to include some
indication of overflow in the header it prints.

But related to both the cast (for MIN2) and the printing, do we really
need _time_stamp_index to be the actual number of add_task_timestamp
calls? Could we instead just have add_task_timestamp do nothing (other
than warn the first time) when the buffer is full?

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

sangheon.kim@oracle.com
In reply to this post by Thomas Schatzl
Hi Thomas,

Thank you for fixing this!

http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/parallel/gcTaskThread.cpp.frames.html
62   uint time = MIN2(_time_stamp_index, (uint)GCTaskTimeStampEntries - 1);
Variable name seems not appropriate. 'index' instead of 'time'?

   68   _time_stamp_index++;
We would face overflow here, so increase only if it is less than
GCTaskTimeStampEntries?
(Because you want to re-use the last buffer when buffers overflow, if
not line 58 also should be '>=')
Or simply rotate it, _time_stamp_index%=GCTaskTimeStampEntries

http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/parallel/gcTaskThread.hpp.frames.html
You already updated all other files except this.

Thanks,
Sangheon


On 04/04/2017 02:48 AM, Thomas Schatzl wrote:

> Hi all,
>
>    can I have reviews for this small fix that changes a guarantee() in
> parallel gc when some logging buffer overflows into a warning message
> about the overflow and (continued) reuse of the last entry of that
> buffer so that the user then can then rerun with a larger buffer.
>
> Since it is so late in the release cycle I would like to keep the fix
> simple instead of rewriting the buffer logic, i.e. make it expandable
> etc.
>
> This change has been mostly contributed by Ramki from Twitter, as part
> of JDK-7180571; however we just got this bug from release testing that
> is a duplicate, and he's unavailable to do an RFR right now, and I want
> this fixed asap. I intend to put him as author for this change.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8177963
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/
> Testing:
> local testing, new test, jprt
>
> Thanks,
>    Thomas
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
Hi Sangheon,

On Tue, 2017-04-04 at 12:40 -0700, sangheon wrote:
> Hi Thomas,
>
> Thank you for fixing this!
>
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/p
> arallel/gcTaskThread.cpp.frames.html
> 62   uint time = MIN2(_time_stamp_index, (uint)GCTaskTimeStampEntries
> - 1);
> Variable name seems not appropriate. 'index' instead of 'time'?

Kind of fixed after incorporating Kim's suggestions.

>
>    68   _time_stamp_index++;
> We would face overflow here, so increase only if it is less than 
> GCTaskTimeStampEntries?
> (Because you want to re-use the last buffer when buffers overflow, if
> not line 58 also should be '>=')
> Or simply rotate it, _time_stamp_index%=GCTaskTimeStampEntries

I do not really care either way, we just should not crash. In this
version I implemented Kim's suggestion to ignore all further time
stamps.

> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/p
> arallel/gcTaskThread.hpp.frames.html
> You already updated all other files except this.

Fixed.

New webrevs at:
http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)

Thanks,
  Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
In reply to this post by Kim Barrett
Hi Kim,

 thanks for your review.

On Tue, 2017-04-04 at 15:35 -0400, Kim Barrett wrote:

> >
> > On Apr 4, 2017, at 5:48 AM, Thomas Schatzl <[hidden email]
> > om> wrote:
> >
> > Hi all,
> >
> >   can I have reviews for this small fix that changes a guarantee()
> > in parallel gc when some logging buffer overflows into a warning
> > message about the overflow and (continued) reuse of the last entry
> > of that buffer so that the user then can then rerun with a larger
> > buffer.
> >
> > Since it is so late in the release cycle I would like to keep the
> > fix simple instead of rewriting the buffer logic, i.e. make it
> > expandable etc. 
> >
> > This change has been mostly contributed by Ramki from Twitter, as
> > part of JDK-7180571; however we just got this bug from release
> > testing that is a duplicate, and he's unavailable to do an RFR
> > right now, and I want this fixed asap. I intend to put him as
> > author for this change.
> >
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8177963
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev/
> > Testing:
> > local testing, new test, jprt
> >
> > Thanks,
> >   Thomas
> I'm surprised the type of the "name" argument for add_task_timestamp
> is not "const char*".

Fixed.

> I dislike the casts of GCTaskTimeStampEntries to uint (from uintx),
> introduced so it can be MIN2'd with _time_stamp_index. Could
> GCTaskTimeStampEntries be given a type of uint? Certainly the code
> doesn't really support values that won't fit in uint, since
> _time_stamp_index will overflow.

Done. Initially I thought of that as well, but didn't because I wanted
to keep the changes minimal. Since you are also pointing this out...

> I think it would be helpful for print_task_time_stamps to include
> some indication of overflow in the header it prints.

Done.

> But related to both the cast (for MIN2) and the printing, do we
> really need _time_stamp_index to be the actual number of
> add_task_timestamp calls? Could we instead just have
> add_task_timestamp do nothing (other than warn the first time) when
> the buffer is full?

Continuing to count gives an indication on how large the user needs to
size this buffer without too many retries.

If you think this is a problem (because of overflow after 4M tasks), I
can change that again.

New webrevs:
http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)

Testing:
test case, jprt

Thanks,
  Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Kim Barrett
> On Apr 5, 2017, at 7:26 AM, Thomas Schatzl <[hidden email]> wrote:
> Continuing to count gives an indication on how large the user needs to
> size this buffer without too many retries.

A different approach to reporting needed size when overflow occurs
would be to limit _time_stamp_index by GCTaskTimeStampEntries
and add an overflow counter.  But what you have is okay with me too.

> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)

Looks good.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
Hi,


On Wed, 2017-04-05 at 11:59 -0400, Kim Barrett wrote:

> >
> > On Apr 5, 2017, at 7:26 AM, Thomas Schatzl <[hidden email]
> > om> wrote:
> > Continuing to count gives an indication on how large the user needs
> > to
> > size this buffer without too many retries.
> A different approach to reporting needed size when overflow occurs
> would be to limit _time_stamp_index by GCTaskTimeStampEntries
> and add an overflow counter.  But what you have is okay with me too.
>
> >
> > New webrevs:
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> Looks good.
>

  thanks for your review.

Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

sangheon.kim@oracle.com
In reply to this post by Thomas Schatzl
Hi Thomas,

On 04/05/2017 02:17 AM, Thomas Schatzl wrote:

> Hi Sangheon,
>
> On Tue, 2017-04-04 at 12:40 -0700, sangheon wrote:
>> Hi Thomas,
>>
>> Thank you for fixing this!
>>
>> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/p
>> arallel/gcTaskThread.cpp.frames.html
>> 62   uint time = MIN2(_time_stamp_index, (uint)GCTaskTimeStampEntries
>> - 1);
>> Variable name seems not appropriate. 'index' instead of 'time'?
> Kind of fixed after incorporating Kim's suggestions.
OK.

>
>>     68   _time_stamp_index++;
>> We would face overflow here, so increase only if it is less than
>> GCTaskTimeStampEntries?
>> (Because you want to re-use the last buffer when buffers overflow, if
>> not line 58 also should be '>=')
>> Or simply rotate it, _time_stamp_index%=GCTaskTimeStampEntries
> I do not really care either way, we just should not crash. In this
> version I implemented Kim's suggestion to ignore all further time
> stamps.
OK.

>
>> http://cr.openjdk.java.net/~tschatzl/8177963/webrev/src/share/vm/gc/p
>> arallel/gcTaskThread.hpp.frames.html
>> You already updated all other files except this.
> Fixed.
>
> New webrevs at:
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
webrev.1 seems good to me.

Thanks,
Sangheon


>
> Thanks,
>    Thomas
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
Hi Sangheon,

On Wed, 2017-04-05 at 10:47 -0700, sangheon wrote:
> Hi Thomas,
>
[...]
> > New webrevs at:
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> webrev.1 seems good to me.

  thanks for your review.

Thomas

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Srinivas Ramakrishna
Re webrev.1: Good, but I think you shouldn't print the overflow message each time overflow occurs while writing, but only at the end when the timestamp entries are printed. That should suffice to let user know an overflow occurred and how big a size they may need (that shd be part of the overflow message).

-- ramki

On Wed, Apr 5, 2017 at 11:16 AM, Thomas Schatzl <[hidden email]> wrote:
Hi Sangheon,

On Wed, 2017-04-05 at 10:47 -0700, sangheon wrote:
> Hi Thomas,
>
[...]
> > New webrevs at:
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> webrev.1 seems good to me.

  thanks for your review.

Thomas


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Srinivas Ramakrishna
And thanks for picking this up and rolling it with yr fix, Thomas!

On Wed, Apr 5, 2017 at 4:38 PM, Srinivas Ramakrishna <[hidden email]> wrote:
Re webrev.1: Good, but I think you shouldn't print the overflow message each time overflow occurs while writing, but only at the end when the timestamp entries are printed. That should suffice to let user know an overflow occurred and how big a size they may need (that shd be part of the overflow message).

-- ramki

On Wed, Apr 5, 2017 at 11:16 AM, Thomas Schatzl <[hidden email]> wrote:
Hi Sangheon,

On Wed, 2017-04-05 at 10:47 -0700, sangheon wrote:
> Hi Thomas,
>
[...]
> > New webrevs at:
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> webrev.1 seems good to me.

  thanks for your review.

Thomas



Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
In reply to this post by Srinivas Ramakrishna
Hi Ramki,

On Wed, 2017-04-05 at 16:38 -0700, Srinivas Ramakrishna wrote:
> Re webrev.1: Good, but I think you shouldn't print the overflow
> message each time overflow occurs while writing, but only at the end

The message appears only once during writing (per thread).

> when the timestamp entries are printed. That should suffice to let
> user know an overflow occurred and how big a size they may need (that
> shd be part of the overflow message).

The _task_time_stamp counter always contains the total number of
entries that the GC wanted to write.

Now Kim suggested to explicitly print how many messages too much were
written at the end, but then the user would need to manually add the
two values (GCTaskTimeStampEntries plus the number of overflowed
message) together manually.

I can remove the single overflow message while writing if you want, I
see it as a courtesy to tell the user what to change.

Please suggest your preferences.

Thanks,
  Thomas

>
> -- ramki
>
> On Wed, Apr 5, 2017 at 11:16 AM, Thomas Schatzl <thomas.schatzl@oracl
> e.com> wrote:
> > Hi Sangheon,
> >
> > On Wed, 2017-04-05 at 10:47 -0700, sangheon wrote:
> > > Hi Thomas,
> > >
> > [...]
> > > > New webrevs at:
> > > > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1
> > (diff)
> > > > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> > > webrev.1 seems good to me.
> >
> >   thanks for your review.
> >
> > Thomas
> >
> >
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Kim Barrett
> On Apr 6, 2017, at 3:56 AM, Thomas Schatzl <[hidden email]> wrote:
>
> Now Kim suggested to explicitly print how many messages too much were
> written at the end, but then the user would need to manually add the
> two values (GCTaskTimeStampEntries plus the number of overflowed
> message) together manually.

Not exactly.  I suggested limiting the index to the buffer size and separately
tracking overflow.  How that gets reported is a different matter; I was thinking
we would do that addition for the user when we print the message :)

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Srinivas Ramakrishna
In reply to this post by Thomas Schatzl


On Thu, Apr 6, 2017 at 12:56 AM, Thomas Schatzl <[hidden email]> wrote:
Hi Ramki,

On Wed, 2017-04-05 at 16:38 -0700, Srinivas Ramakrishna wrote:
> Re webrev.1: Good, but I think you shouldn't print the overflow
> message each time overflow occurs while writing, but only at the end

The message appears only once during writing (per thread).

OK, I see that now. Misread previously.
This looks fine to me.

-- ramki
 

> when the timestamp entries are printed. That should suffice to let
> user know an overflow occurred and how big a size they may need (that
> shd be part of the overflow message).

The _task_time_stamp counter always contains the total number of
entries that the GC wanted to write.

Now Kim suggested to explicitly print how many messages too much were
written at the end, but then the user would need to manually add the
two values (GCTaskTimeStampEntries plus the number of overflowed
message) together manually.

I can remove the single overflow message while writing if you want, I
see it as a courtesy to tell the user what to change.

Please suggest your preferences.

Thanks,
  Thomas

>
> -- ramki
>
> On Wed, Apr 5, 2017 at 11:16 AM, Thomas Schatzl <thomas.schatzl@oracl
> e.com> wrote:
> > Hi Sangheon,
> >
> > On Wed, 2017-04-05 at 10:47 -0700, sangheon wrote:
> > > Hi Thomas,
> > >
> > [...]
> > > > New webrevs at:
> > > > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.0_to_1
> > (diff)
> > > > http://cr.openjdk.java.net/~tschatzl/8177963/webrev.1 (full)
> > > webrev.1 seems good to me.
> >
> >   thanks for your review.
> >
> > Thomas
> >
> >

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [9] RFR (S): 8177963: Parallel GC fails fast when per-thread task log overflows

Thomas Schatzl
In reply to this post by Kim Barrett
Hi Kim,

On Thu, 2017-04-06 at 12:38 -0400, Kim Barrett wrote:

> >
> > On Apr 6, 2017, at 3:56 AM, Thomas Schatzl <[hidden email]
> > om> wrote:
> >
> > Now Kim suggested to explicitly print how many messages too much
> > were written at the end, but then the user would need to manually
> > add the two values (GCTaskTimeStampEntries plus the number of
> > overflowed message) together manually.
> Not exactly.  I suggested limiting the index to the buffer size and
> separately tracking overflow.  How that gets reported is a different
> matter; I was thinking we would do that addition for the user when we
> print the message :)

  okay, you are right :)

Thomas

Loading...