RFR(XS) 8191369: NMT: Enhance thread stack tracking

classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
This is Linux only enhancement for now, can be extended for other
platforms. (See bug for details)

Current implementation, thread stack is tracked when thread is created,
its available stack space is tagged as reserved and committed.

However, this is not how stack works. Stack pages are committed on
demand, so this approach overstates its memory usage.

This enhancement gathers thread stack usage at NMT query time, so that
it can report more accurate usage.

Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html


Example:

Summary shows thread stacks only use small faction of reserved space.

-                    Thread (reserved=41216KB, committed=632KB)
                             (thread #41)
                             (stack: reserved=41032KB, committed=448KB)
                             (malloc=137KB #222)
                             (arena=47KB #76)


Detail tracking shows stack guards and actually used/committed stack area.

[0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
Stack from
     [0x00007f67c0b57d5c] JavaThread::run()+0x2c
     [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112

         [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB from
             [0x00007f67c08b7319] os::pd_create_stack_guard_pages(char*,
unsigned long)+0x29
             [0x00007f67c0b56851] JavaThread::create_stack_guard_pages()
[clone .part.125]+0xa1
             [0x00007f67c0b57d6e] JavaThread::run()+0x3e
             [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112

         [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB

Thanks,

-Zhengyu

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

David Holmes
Hi Zhengyu,

I can't review this in detail as I'm not familiar with NMT workings, but
have a couple of comments.

On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
> This is Linux only enhancement for now, can be extended for other
> platforms. (See bug for details)

I'm concerned about unnecessary platform skew. I added some comments to
the bug. I think the mincore trick can be used on Solaris and AIX as
well - but not on OS X / BSD. It may be that VirtualQuery can be used on
Windows to get the same information - but I'm unclear if the memory
states support what you're looking for. It would be good to extend this
to other platforms that will support it.

And on a minor note can you please correct the existing spelling error
in get_stack_commited_bottom. :)

> Current implementation, thread stack is tracked when thread is created,
> its available stack space is tagged as reserved and committed.
>
> However, this is not how stack works. Stack pages are committed on
> demand, so this approach overstates its memory usage.
>
> This enhancement gathers thread stack usage at NMT query time, so that
> it can report more accurate usage.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>
>
> Example:
>
> Summary shows thread stacks only use small faction of reserved space.
>
> -                    Thread (reserved=41216KB, committed=632KB)
>                              (thread #41)
>                              (stack: reserved=41032KB, committed=448KB)
>                              (malloc=137KB #222)
>                              (arena=47KB #76)
>
>
> Detail tracking shows stack guards and actually used/committed stack area.
>
> [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
> Stack from
>      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>
>          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB from
>              [0x00007f67c08b7319] os::pd_create_stack_guard_pages(char*,
> unsigned long)+0x29
>              [0x00007f67c0b56851] JavaThread::create_stack_guard_pages()
> [clone .part.125]+0xa1
>              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>              [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>
>          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB

Can we capture this in a test so we can tell that the tracking has improved?

Thanks,
David
-----

> Thanks,
>
> -Zhengyu
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe-2
On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <[hidden email]>
wrote:

> Hi Zhengyu,
>
> I can't review this in detail as I'm not familiar with NMT workings, but
> have a couple of comments.
>
> On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>
>> This is Linux only enhancement for now, can be extended for other
>> platforms. (See bug for details)
>>
>
> I'm concerned about unnecessary platform skew. I added some comments to
> the bug. I think the mincore trick can be used on Solaris and AIX as well -
> but not on OS X / BSD. It may be that VirtualQuery can be used on Windows
> to get the same information - but I'm unclear if the memory states support
> what you're looking for. It would be good to extend this to other platforms
> that will support it.
>

Just 5c:

I think this is definitely useful even with only the Linux implementation
provided.

On AIX we have commit-on-touch and hence mincore may actually force commit
the thread :P - would have to try it but have no high hopes. Well.

On Windows this is possible, we already use VirtualQuery in our fork to
print out the commit-state of the current thread stack in case of an error.

..Thomas


> And on a minor note can you please correct the existing spelling error in
> get_stack_commited_bottom. :)
>
> Current implementation, thread stack is tracked when thread is created,
>> its available stack space is tagged as reserved and committed.
>>
>> However, this is not how stack works. Stack pages are committed on
>> demand, so this approach overstates its memory usage.
>>
>> This enhancement gathers thread stack usage at NMT query time, so that it
>> can report more accurate usage.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>>
>>
>> Example:
>>
>> Summary shows thread stacks only use small faction of reserved space.
>>
>> -                    Thread (reserved=41216KB, committed=632KB)
>>                              (thread #41)
>>                              (stack: reserved=41032KB, committed=448KB)
>>                              (malloc=137KB #222)
>>                              (arena=47KB #76)
>>
>>
>> Detail tracking shows stack guards and actually used/committed stack area.
>>
>> [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
>> Stack from
>>      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>>      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>
>>          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB from
>>              [0x00007f67c08b7319] os::pd_create_stack_guard_pages(char*,
>> unsigned long)+0x29
>>              [0x00007f67c0b56851] JavaThread::create_stack_guard_pages()
>> [clone .part.125]+0xa1
>>              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>>              [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>
>>          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
>>
>
> Can we capture this in a test so we can tell that the tracking has
> improved?
>
> Thanks,
> David
> -----
>
> Thanks,
>>
>> -Zhengyu
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Severin Gehwolf
On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:

> On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <[hidden email]>
> wrote:
>
> > Hi Zhengyu,
> >
> > I can't review this in detail as I'm not familiar with NMT workings, but
> > have a couple of comments.
> >
> > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
> >
> > > This is Linux only enhancement for now, can be extended for other
> > > platforms. (See bug for details)
> > >
> >
> > I'm concerned about unnecessary platform skew. I added some comments to
> > the bug. I think the mincore trick can be used on Solaris and AIX as well -
> > but not on OS X / BSD. It may be that VirtualQuery can be used on Windows
> > to get the same information - but I'm unclear if the memory states support
> > what you're looking for. It would be good to extend this to other platforms
> > that will support it.
> >
>
> Just 5c:
>
> I think this is definitely useful even with only the Linux implementation
> provided.

+1

While plaform skew is a valid concern, I'm doubtful it's "unnecessary".
I think there is value to get one platform fixed and then look at other
platforms one by one as follow-up bugs. For some platforms it might be
hard to do (or undesirable) as Thomas says.

Thanks,
Severin

> On AIX we have commit-on-touch and hence mincore may actually force commit
> the thread :P - would have to try it but have no high hopes. Well.
>
> On Windows this is possible, we already use VirtualQuery in our fork to
> print out the commit-state of the current thread stack in case of an error.
>
> ..Thomas
>
>
> > And on a minor note can you please correct the existing spelling error in
> > get_stack_commited_bottom. :)
> >
> > Current implementation, thread stack is tracked when thread is created,
> > > its available stack space is tagged as reserved and committed.
> > >
> > > However, this is not how stack works. Stack pages are committed on
> > > demand, so this approach overstates its memory usage.
> > >
> > > This enhancement gathers thread stack usage at NMT query time, so that it
> > > can report more accurate usage.
> > >
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
> > > Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
> > >
> > >
> > > Example:
> > >
> > > Summary shows thread stacks only use small faction of reserved space.
> > >
> > > -                    Thread (reserved=41216KB, committed=632KB)
> > >                              (thread #41)
> > >                              (stack: reserved=41032KB, committed=448KB)
> > >                              (malloc=137KB #222)
> > >                              (arena=47KB #76)
> > >
> > >
> > > Detail tracking shows stack guards and actually used/committed stack area.
> > >
> > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
> > > Stack from
> > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
> > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
> > >
> > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB from
> > >              [0x00007f67c08b7319] os::pd_create_stack_guard_pages(char*,
> > > unsigned long)+0x29
> > >              [0x00007f67c0b56851] JavaThread::create_stack_guard_pages()
> > > [clone .part.125]+0xa1
> > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
> > >              [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
> > >
> > >          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
> > >
> >
> > Can we capture this in a test so we can tell that the tracking has
> > improved?
> >
> > Thanks,
> > David
> > -----
> >
> > Thanks,
> > >
> > > -Zhengyu
> > >
> > >

Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe-2
On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf <[hidden email]>
wrote:

> On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
> > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <[hidden email]>
> > wrote:
> >
> > > Hi Zhengyu,
> > >
> > > I can't review this in detail as I'm not familiar with NMT workings,
> but
> > > have a couple of comments.
> > >
> > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
> > >
> > > > This is Linux only enhancement for now, can be extended for other
> > > > platforms. (See bug for details)
> > > >
> > >
> > > I'm concerned about unnecessary platform skew. I added some comments to
> > > the bug. I think the mincore trick can be used on Solaris and AIX as
> well -
> > > but not on OS X / BSD. It may be that VirtualQuery can be used on
> Windows
> > > to get the same information - but I'm unclear if the memory states
> support
> > > what you're looking for. It would be good to extend this to other
> platforms
> > > that will support it.
> > >
> >
> > Just 5c:
> >
> > I think this is definitely useful even with only the Linux implementation
> > provided.
>
> +1
>
> While plaform skew is a valid concern, I'm doubtful it's "unnecessary".
> I think there is value to get one platform fixed and then look at other
> platforms one by one as follow-up bugs. For some platforms it might be
> hard to do (or undesirable) as Thomas says.
>
> Thanks,
> Severin
>

I volunteer to implement this for windows if it can be done as
non-time-critical follow up, if no-one else has stepped up in until then.

..Thomas



> > On AIX we have commit-on-touch and hence mincore may actually force
> commit
> > the thread :P - would have to try it but have no high hopes. Well.
> >
> > On Windows this is possible, we already use VirtualQuery in our fork to
> > print out the commit-state of the current thread stack in case of an
> error.
> >
> > ..Thomas
> >
> >
> > > And on a minor note can you please correct the existing spelling error
> in
> > > get_stack_commited_bottom. :)
> > >
> > > Current implementation, thread stack is tracked when thread is created,
> > > > its available stack space is tagged as reserved and committed.
> > > >
> > > > However, this is not how stack works. Stack pages are committed on
> > > > demand, so this approach overstates its memory usage.
> > > >
> > > > This enhancement gathers thread stack usage at NMT query time, so
> that it
> > > > can report more accurate usage.
> > > >
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
> > > > Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
> > > >
> > > >
> > > > Example:
> > > >
> > > > Summary shows thread stacks only use small faction of reserved space.
> > > >
> > > > -                    Thread (reserved=41216KB, committed=632KB)
> > > >                              (thread #41)
> > > >                              (stack: reserved=41032KB,
> committed=448KB)
> > > >                              (malloc=137KB #222)
> > > >                              (arena=47KB #76)
> > > >
> > > >
> > > > Detail tracking shows stack guards and actually used/committed stack
> area.
> > > >
> > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
> > > > Stack from
> > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
> > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
> > > >
> > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB
> from
> > > >              [0x00007f67c08b7319] os::pd_create_stack_guard_
> pages(char*,
> > > > unsigned long)+0x29
> > > >              [0x00007f67c0b56851] JavaThread::create_stack_
> guard_pages()
> > > > [clone .part.125]+0xa1
> > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
> > > >              [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
> > > >
> > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
> > > >
> > >
> > > Can we capture this in a test so we can tell that the tracking has
> > > improved?
> > >
> > > Thanks,
> > > David
> > > -----
> > >
> > > Thanks,
> > > >
> > > > -Zhengyu
> > > >
> > > >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Dinn
In reply to this post by Zhengyu Gu-2
On 15/11/17 22:25, Zhengyu Gu wrote:

> This is Linux only enhancement for now, can be extended for other
> platforms. (See bug for details)
>
> Current implementation, thread stack is tracked when thread is created,
> its available stack space is tagged as reserved and committed.
>
> However, this is not how stack works. Stack pages are committed on
> demand, so this approach overstates its memory usage.
>
> This enhancement gathers thread stack usage at NMT query time, so that
> it can report more accurate usage.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html

The fix looks good to me.

It was very handy that get_stack_commited_bottom was already present in
os_linux.cpp for other reasons -- although as David says it really ought
to be spelled correctly with a double t in committed :-)

It is a good idea to try to extend this to other Unix-variant OSes that
support mincore and to Windows (by whatever other means) but I don't
think that should stop this being added to Linux now. This is too useful
and important to omit because . . .

Footprint is a major concern for cloud deployments (where memory use is
part of costing) and Linux underpins a significant proportion of cloud
deployments. More accurate measurement of JVM footprint on Linux is
going to be a great help to Java users when it comes to identifying and
reducing the cost of cloud deployments.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
In reply to this post by David Holmes
Hi David,

Thanks for the suggestions.

On 11/16/2017 01:58 AM, David Holmes wrote:

> Hi Zhengyu,
>
> I can't review this in detail as I'm not familiar with NMT workings, but
> have a couple of comments.
>
> On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>> This is Linux only enhancement for now, can be extended for other
>> platforms. (See bug for details)
>
> I'm concerned about unnecessary platform skew. I added some comments to
> the bug. I think the mincore trick can be used on Solaris and AIX as
> well - but not on OS X / BSD. It may be that VirtualQuery can be used on
> Windows to get the same information - but I'm unclear if the memory
> states support what you're looking for. It would be good to extend this
> to other platforms that will support it.

Windows is definitely doable, I can add Windows support if it helps the
case. But I don't have access to other platforms.

>
> And on a minor note can you please correct the existing spelling error
> in get_stack_commited_bottom. :)
>
Will do.

Thanks,

-Zhengyu


>> Current implementation, thread stack is tracked when thread is
>> created, its available stack space is tagged as reserved and committed.
>>
>> However, this is not how stack works. Stack pages are committed on
>> demand, so this approach overstates its memory usage.
>>
>> This enhancement gathers thread stack usage at NMT query time, so that
>> it can report more accurate usage.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>>
>>
>> Example:
>>
>> Summary shows thread stacks only use small faction of reserved space.
>>
>> -                    Thread (reserved=41216KB, committed=632KB)
>>                              (thread #41)
>>                              (stack: reserved=41032KB, committed=448KB)
>>                              (malloc=137KB #222)
>>                              (arena=47KB #76)
>>
>>
>> Detail tracking shows stack guards and actually used/committed stack
>> area.
>>
>> [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for Thread
>> Stack from
>>      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>>      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>
>>          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed 16KB from
>>              [0x00007f67c08b7319]
>> os::pd_create_stack_guard_pages(char*, unsigned long)+0x29
>>              [0x00007f67c0b56851]
>> JavaThread::create_stack_guard_pages() [clone .part.125]+0xa1
>>              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>>              [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>
>>          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
>
> Can we capture this in a test so we can tell that the tracking has
> improved?
>
> Thanks,
> David
> -----
>
>> Thanks,
>>
>> -Zhengyu
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
In reply to this post by Thomas Stüfe-2
Hi Thomas,

On 11/16/2017 04:45 AM, Thomas Stüfe wrote:

>
>
> On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
>     > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <[hidden email] <mailto:[hidden email]>>
>     > wrote:
>     >
>     > > Hi Zhengyu,
>     > >
>     > > I can't review this in detail as I'm not familiar with NMT workings, but
>     > > have a couple of comments.
>     > >
>     > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>     > >
>     > > > This is Linux only enhancement for now, can be extended for other
>     > > > platforms. (See bug for details)
>     > > >
>     > >
>     > > I'm concerned about unnecessary platform skew. I added some comments to
>     > > the bug. I think the mincore trick can be used on Solaris and AIX as well -
>     > > but not on OS X / BSD. It may be that VirtualQuery can be used on Windows
>     > > to get the same information - but I'm unclear if the memory states support
>     > > what you're looking for. It would be good to extend this to other platforms
>     > > that will support it.
>     > >
>     >
>     > Just 5c:
>     >
>     > I think this is definitely useful even with only the Linux implementation
>     > provided.
>
>     +1
>
>     While plaform skew is a valid concern, I'm doubtful it's "unnecessary".
>     I think there is value to get one platform fixed and then look at other
>     platforms one by one as follow-up bugs. For some platforms it might be
>     hard to do (or undesirable) as Thomas says.
>
>     Thanks,
>     Severin
>
>
> I volunteer to implement this for windows if it can be done as
> non-time-critical follow up, if no-one else has stepped up in until then.

I can work on Windows solution today if it helps the case :-)

Thanks,

-Zhengyu


>
> ..Thomas
>
>
>      > On AIX we have commit-on-touch and hence mincore may actually
>     force commit
>      > the thread :P - would have to try it but have no high hopes. Well.
>      >
>      > On Windows this is possible, we already use VirtualQuery in our
>     fork to
>      > print out the commit-state of the current thread stack in case of
>     an error.
>      >
>      > ..Thomas
>      >
>      >
>      > > And on a minor note can you please correct the existing
>     spelling error in
>      > > get_stack_commited_bottom. :)
>      > >
>      > > Current implementation, thread stack is tracked when thread is
>     created,
>      > > > its available stack space is tagged as reserved and committed.
>      > > >
>      > > > However, this is not how stack works. Stack pages are
>     committed on
>      > > > demand, so this approach overstates its memory usage.
>      > > >
>      > > > This enhancement gathers thread stack usage at NMT query
>     time, so that it
>      > > > can report more accurate usage.
>      > > >
>      > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>     <https://bugs.openjdk.java.net/browse/JDK-8191369>
>      > > > Webrev:
>     http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>     <http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html>
>      > > >
>      > > >
>      > > > Example:
>      > > >
>      > > > Summary shows thread stacks only use small faction of
>     reserved space.
>      > > >
>      > > > -                    Thread (reserved=41216KB, committed=632KB)
>      > > >                              (thread #41)
>      > > >                              (stack: reserved=41032KB,
>     committed=448KB)
>      > > >                              (malloc=137KB #222)
>      > > >                              (arena=47KB #76)
>      > > >
>      > > >
>      > > > Detail tracking shows stack guards and actually
>     used/committed stack area.
>      > > >
>      > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for
>     Thread
>      > > > Stack from
>      > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>      > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>      > > >
>      > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed
>     16KB from
>      > > >              [0x00007f67c08b7319]
>     os::pd_create_stack_guard_pages(char*,
>      > > > unsigned long)+0x29
>      > > >              [0x00007f67c0b56851]
>     JavaThread::create_stack_guard_pages()
>      > > > [clone .part.125]+0xa1
>      > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>      > > >              [0x00007f67c08bbea2]
>     thread_native_entry(Thread*)+0x112
>      > > >
>      > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
>      > > >
>      > >
>      > > Can we capture this in a test so we can tell that the tracking has
>      > > improved?
>      > >
>      > > Thanks,
>      > > David
>      > > -----
>      > >
>      > > Thanks,
>      > > >
>      > > > -Zhengyu
>      > > >
>      > > >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe-2
On Thu, Nov 16, 2017 at 2:25 PM, Zhengyu Gu <[hidden email]> wrote:

> Hi Thomas,
>
> On 11/16/2017 04:45 AM, Thomas Stüfe wrote:
>
>>
>>
>> On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>     On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
>>     > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <
>> [hidden email] <mailto:[hidden email]>>
>>
>>     > wrote:
>>     >
>>     > > Hi Zhengyu,
>>     > >
>>     > > I can't review this in detail as I'm not familiar with NMT
>> workings, but
>>     > > have a couple of comments.
>>     > >
>>     > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>>     > >
>>     > > > This is Linux only enhancement for now, can be extended for
>> other
>>     > > > platforms. (See bug for details)
>>     > > >
>>     > >
>>     > > I'm concerned about unnecessary platform skew. I added some
>> comments to
>>     > > the bug. I think the mincore trick can be used on Solaris and AIX
>> as well -
>>     > > but not on OS X / BSD. It may be that VirtualQuery can be used on
>> Windows
>>     > > to get the same information - but I'm unclear if the memory
>> states support
>>     > > what you're looking for. It would be good to extend this to other
>> platforms
>>     > > that will support it.
>>     > >
>>     >
>>     > Just 5c:
>>     >
>>     > I think this is definitely useful even with only the Linux
>> implementation
>>     > provided.
>>
>>     +1
>>
>>     While plaform skew is a valid concern, I'm doubtful it's
>> "unnecessary".
>>     I think there is value to get one platform fixed and then look at
>> other
>>     platforms one by one as follow-up bugs. For some platforms it might be
>>     hard to do (or undesirable) as Thomas says.
>>
>>     Thanks,
>>     Severin
>>
>>
>> I volunteer to implement this for windows if it can be done as
>> non-time-critical follow up, if no-one else has stepped up in until then.
>>
>
> I can work on Windows solution today if it helps the case :-)
>
> Thanks,
>
> -Zhengyu
>
>
>
:) Sure, fine by me!

..Thomas


>
>
>> ..Thomas
>>
>>
>>      > On AIX we have commit-on-touch and hence mincore may actually
>>     force commit
>>      > the thread :P - would have to try it but have no high hopes. Well.
>>      >
>>      > On Windows this is possible, we already use VirtualQuery in our
>>     fork to
>>      > print out the commit-state of the current thread stack in case of
>>     an error.
>>      >
>>      > ..Thomas
>>      >
>>      >
>>      > > And on a minor note can you please correct the existing
>>     spelling error in
>>      > > get_stack_commited_bottom. :)
>>      > >
>>      > > Current implementation, thread stack is tracked when thread is
>>     created,
>>      > > > its available stack space is tagged as reserved and committed.
>>      > > >
>>      > > > However, this is not how stack works. Stack pages are
>>     committed on
>>      > > > demand, so this approach overstates its memory usage.
>>      > > >
>>      > > > This enhancement gathers thread stack usage at NMT query
>>     time, so that it
>>      > > > can report more accurate usage.
>>      > > >
>>      > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>>     <https://bugs.openjdk.java.net/browse/JDK-8191369>
>>      > > > Webrev:
>>     http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>>     <http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html>
>>      > > >
>>      > > >
>>      > > > Example:
>>      > > >
>>      > > > Summary shows thread stacks only use small faction of
>>     reserved space.
>>      > > >
>>      > > > -                    Thread (reserved=41216KB, committed=632KB)
>>      > > >                              (thread #41)
>>      > > >                              (stack: reserved=41032KB,
>>     committed=448KB)
>>      > > >                              (malloc=137KB #222)
>>      > > >                              (arena=47KB #76)
>>      > > >
>>      > > >
>>      > > > Detail tracking shows stack guards and actually
>>     used/committed stack area.
>>      > > >
>>      > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for
>>     Thread
>>      > > > Stack from
>>      > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>>      > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>      > > >
>>      > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed
>>     16KB from
>>      > > >              [0x00007f67c08b7319]
>>     os::pd_create_stack_guard_pages(char*,
>>      > > > unsigned long)+0x29
>>      > > >              [0x00007f67c0b56851]
>>     JavaThread::create_stack_guard_pages()
>>      > > > [clone .part.125]+0xa1
>>      > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>>      > > >              [0x00007f67c08bbea2]
>>     thread_native_entry(Thread*)+0x112
>>      > > >
>>      > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed
>> 4KB
>>      > > >
>>      > >
>>      > > Can we capture this in a test so we can tell that the tracking
>> has
>>      > > improved?
>>      > >
>>      > > Thanks,
>>      > > David
>>      > > -----
>>      > >
>>      > > Thanks,
>>      > > >
>>      > > > -Zhengyu
>>      > > >
>>      > > >
>>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
In reply to this post by Andrew Dinn
Thanks for the quick review, Andrew!

-Zhengyu

On 11/16/2017 04:48 AM, Andrew Dinn wrote:

> On 15/11/17 22:25, Zhengyu Gu wrote:
>> This is Linux only enhancement for now, can be extended for other
>> platforms. (See bug for details)
>>
>> Current implementation, thread stack is tracked when thread is created,
>> its available stack space is tagged as reserved and committed.
>>
>> However, this is not how stack works. Stack pages are committed on
>> demand, so this approach overstates its memory usage.
>>
>> This enhancement gathers thread stack usage at NMT query time, so that
>> it can report more accurate usage.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>
> The fix looks good to me.
>
> It was very handy that get_stack_commited_bottom was already present in
> os_linux.cpp for other reasons -- although as David says it really ought
> to be spelled correctly with a double t in committed :-)
>
> It is a good idea to try to extend this to other Unix-variant OSes that
> support mincore and to Windows (by whatever other means) but I don't
> think that should stop this being added to Linux now. This is too useful
> and important to omit because . . .
>
> Footprint is a major concern for cloud deployments (where memory use is
> part of costing) and Linux underpins a significant proportion of cloud
> deployments. More accurate measurement of JVM footprint on Linux is
> going to be a great help to Java users when it comes to identifying and
> reducing the cost of cloud deployments.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Haley
In reply to this post by David Holmes
On 16/11/17 06:58, David Holmes wrote:

> I can't review this in detail as I'm not familiar with NMT workings, but
> have a couple of comments.
>
> On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>> This is Linux only enhancement for now, can be extended for other
>> platforms. (See bug for details)
>
> I'm concerned about unnecessary platform skew. I added some comments to
> the bug. I think the mincore trick can be used on Solaris and AIX as
> well - but not on OS X / BSD. It may be that VirtualQuery can be used on
> Windows to get the same information - but I'm unclear if the memory
> states support what you're looking for. It would be good to extend this
> to other platforms that will support it.

Sure, but as OpenJDK becomes more open and hopefully more diverse, it
becomes less and less practical to keep everything in step.  I favour
the classic free software approach, which is to encourage people to
create new features and everyone else can fill in the gaps, depending
on how much time and effort is available.  It's this approach that
allows GCC to support a vast range of targets (and hosts) while not
holding back development.  I have consistently argued for this in the
Governing Board.

So, let's get this done.  It's very important for container memory
monitoring, which right now usually means Linux.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
In reply to this post by Thomas Stüfe-2
Hi,

I added Windows support and updated bug to make this enhancement Linux
and Windows only.

As Thomas suggested, other platforms can be addressed by followup fixes.

Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/

I reran hotspot_tier1_runtime on Linux x64 and Windows 10 (fastdebug +
release)

Thanks,

-Zhengyu



On 11/16/2017 04:45 AM, Thomas Stüfe wrote:

>
>
> On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
>     > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes <[hidden email] <mailto:[hidden email]>>
>     > wrote:
>     >
>     > > Hi Zhengyu,
>     > >
>     > > I can't review this in detail as I'm not familiar with NMT workings, but
>     > > have a couple of comments.
>     > >
>     > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>     > >
>     > > > This is Linux only enhancement for now, can be extended for other
>     > > > platforms. (See bug for details)
>     > > >
>     > >
>     > > I'm concerned about unnecessary platform skew. I added some comments to
>     > > the bug. I think the mincore trick can be used on Solaris and AIX as well -
>     > > but not on OS X / BSD. It may be that VirtualQuery can be used on Windows
>     > > to get the same information - but I'm unclear if the memory states support
>     > > what you're looking for. It would be good to extend this to other platforms
>     > > that will support it.
>     > >
>     >
>     > Just 5c:
>     >
>     > I think this is definitely useful even with only the Linux implementation
>     > provided.
>
>     +1
>
>     While plaform skew is a valid concern, I'm doubtful it's "unnecessary".
>     I think there is value to get one platform fixed and then look at other
>     platforms one by one as follow-up bugs. For some platforms it might be
>     hard to do (or undesirable) as Thomas says.
>
>     Thanks,
>     Severin
>
>
> I volunteer to implement this for windows if it can be done as
> non-time-critical follow up, if no-one else has stepped up in until then.
>
> ..Thomas
>
>
>      > On AIX we have commit-on-touch and hence mincore may actually
>     force commit
>      > the thread :P - would have to try it but have no high hopes. Well.
>      >
>      > On Windows this is possible, we already use VirtualQuery in our
>     fork to
>      > print out the commit-state of the current thread stack in case of
>     an error.
>      >
>      > ..Thomas
>      >
>      >
>      > > And on a minor note can you please correct the existing
>     spelling error in
>      > > get_stack_commited_bottom. :)
>      > >
>      > > Current implementation, thread stack is tracked when thread is
>     created,
>      > > > its available stack space is tagged as reserved and committed.
>      > > >
>      > > > However, this is not how stack works. Stack pages are
>     committed on
>      > > > demand, so this approach overstates its memory usage.
>      > > >
>      > > > This enhancement gathers thread stack usage at NMT query
>     time, so that it
>      > > > can report more accurate usage.
>      > > >
>      > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>     <https://bugs.openjdk.java.net/browse/JDK-8191369>
>      > > > Webrev:
>     http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>     <http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html>
>      > > >
>      > > >
>      > > > Example:
>      > > >
>      > > > Summary shows thread stacks only use small faction of
>     reserved space.
>      > > >
>      > > > -                    Thread (reserved=41216KB, committed=632KB)
>      > > >                              (thread #41)
>      > > >                              (stack: reserved=41032KB,
>     committed=448KB)
>      > > >                              (malloc=137KB #222)
>      > > >                              (arena=47KB #76)
>      > > >
>      > > >
>      > > > Detail tracking shows stack guards and actually
>     used/committed stack area.
>      > > >
>      > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for
>     Thread
>      > > > Stack from
>      > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>      > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>      > > >
>      > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed
>     16KB from
>      > > >              [0x00007f67c08b7319]
>     os::pd_create_stack_guard_pages(char*,
>      > > > unsigned long)+0x29
>      > > >              [0x00007f67c0b56851]
>     JavaThread::create_stack_guard_pages()
>      > > > [clone .part.125]+0xa1
>      > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>      > > >              [0x00007f67c08bbea2]
>     thread_native_entry(Thread*)+0x112
>      > > >
>      > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000] committed 4KB
>      > > >
>      > >
>      > > Can we capture this in a test so we can tell that the tracking has
>      > > improved?
>      > >
>      > > Thanks,
>      > > David
>      > > -----
>      > >
>      > > Thanks,
>      > > >
>      > > > -Zhengyu
>      > > >
>      > > >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
Sorry, I have to withdraw this code review request, since the stack
sizes reported on Linux do not match /proc/<pid>/smaps.

I am debugging the problem, will submit new request later.

Thanks,

-Zhengyu

On 11/16/2017 01:30 PM, Zhengyu Gu wrote:

> Hi,
>
> I added Windows support and updated bug to make this enhancement Linux
> and Windows only.
>
> As Thomas suggested, other platforms can be addressed by followup fixes.
>
> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>
> I reran hotspot_tier1_runtime on Linux x64 and Windows 10 (fastdebug +
> release)
>
> Thanks,
>
> -Zhengyu
>
>
>
> On 11/16/2017 04:45 AM, Thomas Stüfe wrote:
>>
>>
>> On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>     On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
>>     > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes
>> <[hidden email] <mailto:[hidden email]>>
>>     > wrote:
>>     >
>>     > > Hi Zhengyu,
>>     > >
>>     > > I can't review this in detail as I'm not familiar with NMT
>> workings, but
>>     > > have a couple of comments.
>>     > >
>>     > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>>     > >
>>     > > > This is Linux only enhancement for now, can be extended for
>> other
>>     > > > platforms. (See bug for details)
>>     > > >
>>     > >
>>     > > I'm concerned about unnecessary platform skew. I added some
>> comments to
>>     > > the bug. I think the mincore trick can be used on Solaris and
>> AIX as well -
>>     > > but not on OS X / BSD. It may be that VirtualQuery can be used
>> on Windows
>>     > > to get the same information - but I'm unclear if the memory
>> states support
>>     > > what you're looking for. It would be good to extend this to
>> other platforms
>>     > > that will support it.
>>     > >
>>     >
>>     > Just 5c:
>>     >
>>     > I think this is definitely useful even with only the Linux
>> implementation
>>     > provided.
>>
>>     +1
>>
>>     While plaform skew is a valid concern, I'm doubtful it's
>> "unnecessary".
>>     I think there is value to get one platform fixed and then look at
>> other
>>     platforms one by one as follow-up bugs. For some platforms it
>> might be
>>     hard to do (or undesirable) as Thomas says.
>>
>>     Thanks,
>>     Severin
>>
>>
>> I volunteer to implement this for windows if it can be done as
>> non-time-critical follow up, if no-one else has stepped up in until then.
>>
>> ..Thomas
>>
>>
>>      > On AIX we have commit-on-touch and hence mincore may actually
>>     force commit
>>      > the thread :P - would have to try it but have no high hopes. Well.
>>      >
>>      > On Windows this is possible, we already use VirtualQuery in our
>>     fork to
>>      > print out the commit-state of the current thread stack in case of
>>     an error.
>>      >
>>      > ..Thomas
>>      >
>>      >
>>      > > And on a minor note can you please correct the existing
>>     spelling error in
>>      > > get_stack_commited_bottom. :)
>>      > >
>>      > > Current implementation, thread stack is tracked when thread is
>>     created,
>>      > > > its available stack space is tagged as reserved and committed.
>>      > > >
>>      > > > However, this is not how stack works. Stack pages are
>>     committed on
>>      > > > demand, so this approach overstates its memory usage.
>>      > > >
>>      > > > This enhancement gathers thread stack usage at NMT query
>>     time, so that it
>>      > > > can report more accurate usage.
>>      > > >
>>      > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>>     <https://bugs.openjdk.java.net/browse/JDK-8191369>
>>      > > > Webrev:
>>     http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>>     <http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html>
>>      > > >
>>      > > >
>>      > > > Example:
>>      > > >
>>      > > > Summary shows thread stacks only use small faction of
>>     reserved space.
>>      > > >
>>      > > > -                    Thread (reserved=41216KB,
>> committed=632KB)
>>      > > >                              (thread #41)
>>      > > >                              (stack: reserved=41032KB,
>>     committed=448KB)
>>      > > >                              (malloc=137KB #222)
>>      > > >                              (arena=47KB #76)
>>      > > >
>>      > > >
>>      > > > Detail tracking shows stack guards and actually
>>     used/committed stack area.
>>      > > >
>>      > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for
>>     Thread
>>      > > > Stack from
>>      > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>>      > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>      > > >
>>      > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed
>>     16KB from
>>      > > >              [0x00007f67c08b7319]
>>     os::pd_create_stack_guard_pages(char*,
>>      > > > unsigned long)+0x29
>>      > > >              [0x00007f67c0b56851]
>>     JavaThread::create_stack_guard_pages()
>>      > > > [clone .part.125]+0xa1
>>      > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>>      > > >              [0x00007f67c08bbea2]
>>     thread_native_entry(Thread*)+0x112
>>      > > >
>>      > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000]
>> committed 4KB
>>      > > >
>>      > >
>>      > > Can we capture this in a test so we can tell that the
>> tracking has
>>      > > improved?
>>      > >
>>      > > Thanks,
>>      > > David
>>      > > -----
>>      > >
>>      > > Thanks,
>>      > > >
>>      > > > -Zhengyu
>>      > > >
>>      > > >
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

David Holmes
In reply to this post by Andrew Haley
Hi Andrew,

On 17/11/2017 1:46 AM, Andrew Haley wrote:

> On 16/11/17 06:58, David Holmes wrote:
>
>> I can't review this in detail as I'm not familiar with NMT workings, but
>> have a couple of comments.
>>
>> On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>>> This is Linux only enhancement for now, can be extended for other
>>> platforms. (See bug for details)
>>
>> I'm concerned about unnecessary platform skew. I added some comments to
>> the bug. I think the mincore trick can be used on Solaris and AIX as
>> well - but not on OS X / BSD. It may be that VirtualQuery can be used on
>> Windows to get the same information - but I'm unclear if the memory
>> states support what you're looking for. It would be good to extend this
>> to other platforms that will support it.
>
> Sure, but as OpenJDK becomes more open and hopefully more diverse, it
> becomes less and less practical to keep everything in step.  I favour
> the classic free software approach, which is to encourage people to
> create new features and everyone else can fill in the gaps, depending
> on how much time and effort is available.  It's this approach that
> allows GCC to support a vast range of targets (and hosts) while not
> holding back development.  I have consistently argued for this in the
> Governing Board.

For OpenJDK I hope there are sufficient safety-nets in place to ensure
such an approach does not lead to a product where you never know what
will and won't work on each platform. Whenever we have such a case we
should at least file sub-tasks for supplying the functionality on the
missing platforms. We can then determine whether any such missing
functionality is critical to fix before a release.

David
-----

> So, let's get this done.  It's very important for container memory
> monitoring, which right now usually means Linux.
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe-2
On Fri, Nov 17, 2017 at 2:46 AM, David Holmes <[hidden email]>
wrote:

> Hi Andrew,
>
> On 17/11/2017 1:46 AM, Andrew Haley wrote:
>
>> On 16/11/17 06:58, David Holmes wrote:
>>
>> I can't review this in detail as I'm not familiar with NMT workings, but
>>> have a couple of comments.
>>>
>>> On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>>>
>>>> This is Linux only enhancement for now, can be extended for other
>>>> platforms. (See bug for details)
>>>>
>>>
>>> I'm concerned about unnecessary platform skew. I added some comments to
>>> the bug. I think the mincore trick can be used on Solaris and AIX as
>>> well - but not on OS X / BSD. It may be that VirtualQuery can be used on
>>> Windows to get the same information - but I'm unclear if the memory
>>> states support what you're looking for. It would be good to extend this
>>> to other platforms that will support it.
>>>
>>
>> Sure, but as OpenJDK becomes more open and hopefully more diverse, it
>> becomes less and less practical to keep everything in step.  I favour
>> the classic free software approach, which is to encourage people to
>> create new features and everyone else can fill in the gaps, depending
>> on how much time and effort is available.  It's this approach that
>> allows GCC to support a vast range of targets (and hosts) while not
>> holding back development.  I have consistently argued for this in the
>> Governing Board.
>>
>
> For OpenJDK I hope there are sufficient safety-nets in place to ensure
> such an approach does not lead to a product where you never know what will
> and won't work on each platform. Whenever we have such a case we should at
> least file sub-tasks for supplying the functionality on the missing
> platforms. We can then determine whether any such missing functionality is
> critical to fix before a release.
>
>
I agree. Also, I find that for platform-independent monitoring facilities
like NMT, I really just want to be able to trust the numbers without having
to peek at the implementation.

..Thomas


> David
> -----
>
>
> So, let's get this done.  It's very important for container memory
>> monitoring, which right now usually means Linux.
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Haley
In reply to this post by David Holmes
On 17/11/17 01:46, David Holmes wrote:

>> Sure, but as OpenJDK becomes more open and hopefully more diverse, it
>> becomes less and less practical to keep everything in step.  I favour
>> the classic free software approach, which is to encourage people to
>> create new features and everyone else can fill in the gaps, depending
>> on how much time and effort is available.  It's this approach that
>> allows GCC to support a vast range of targets (and hosts) while not
>> holding back development.  I have consistently argued for this in the
>> Governing Board.
>
> For OpenJDK I hope there are sufficient safety-nets in place to ensure
> such an approach does not lead to a product where you never know what
> will and won't work on each platform.

I don't believe that's even desirable for OpenJDK.  Inevitably some
minority platforms won't get that much attention, but as long as every
platform is Java-compatible (in the TCK sense) we're doing what we
should.

> Whenever we have such a case we should at least file sub-tasks for
> supplying the functionality on the missing platforms. We can then
> determine whether any such missing functionality is critical to fix
> before a release.

Fair enough.  But with quick-fire releases we'll inevitably have to
slip some features on some platforms.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Haley
In reply to this post by Thomas Stüfe-2
On 17/11/17 06:54, Thomas Stüfe wrote:
> I agree. Also, I find that for platform-independent monitoring facilities
> like NMT, I really just want to be able to trust the numbers without having
> to peek at the implementation.

Right now you can't trust the numbers.  We're talking about fixing that
on some platforms.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe-2
On Fri, Nov 17, 2017 at 10:36 AM, Andrew Haley <[hidden email]> wrote:

> On 17/11/17 06:54, Thomas Stüfe wrote:
> > I agree. Also, I find that for platform-independent monitoring facilities
> > like NMT, I really just want to be able to trust the numbers without
> having
> > to peek at the implementation.
>
> Right now you can't trust the numbers.  We're talking about fixing that
> on some platforms.
>
>
I did not argue against the patch. If you read the thread, you'll see that
I was in favour of it even if with only the Linux implementation provided.

But I'd like to see unimplemented platforms tracked in follow up issues as
David proposed. Because it is easy to loose track of the many things to
port. In this case, we'd need a follow up issue for Solaris and AIX; the
latter you can assign to me.

Best Regards, Thomas


> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Zhengyu Gu-2
In reply to this post by Zhengyu Gu-2
There were two bugs that resulted NMT report incorrect thread stacks.

1) NMT did not handle overlapped regions correctly.

2) There are two issues with Linux: get_stack_commited_bottom()

  * The method name is misleading, it does not return *committed*
bottom, but *mapped* bottom, where address space is valid, not necessary
committed.
    I renamed this function to get_stack_mapped_bottom() with option to
return committed bottom with proper parameter.

  * The binary search algorithm mishandled edge cases.

The test case for verifying fixes for above two issues can be found
here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c


As mentioned in early email, new patch also contains implementation for
Windows.

Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/

Test:

   Reran hotspot_tier1 tests on Windows x64 and Linux x64 (fastdebug and
release)

Thanks,

-Zhengyu


On 11/16/2017 03:42 PM, Zhengyu Gu wrote:

> Sorry, I have to withdraw this code review request, since the stack
> sizes reported on Linux do not match /proc/<pid>/smaps.
>
> I am debugging the problem, will submit new request later.
>
> Thanks,
>
> -Zhengyu
>
> On 11/16/2017 01:30 PM, Zhengyu Gu wrote:
>> Hi,
>>
>> I added Windows support and updated bug to make this enhancement Linux
>> and Windows only.
>>
>> As Thomas suggested, other platforms can be addressed by followup fixes.
>>
>> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>>
>> I reran hotspot_tier1_runtime on Linux x64 and Windows 10 (fastdebug +
>> release)
>>
>> Thanks,
>>
>> -Zhengyu
>>
>>
>>
>> On 11/16/2017 04:45 AM, Thomas Stüfe wrote:
>>>
>>>
>>> On Thu, Nov 16, 2017 at 10:23 AM, Severin Gehwolf
>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>     On Thu, 2017-11-16 at 08:14 +0100, Thomas Stüfe wrote:
>>>     > On Thu, Nov 16, 2017 at 7:58 AM, David Holmes
>>> <[hidden email] <mailto:[hidden email]>>
>>>     > wrote:
>>>     >
>>>     > > Hi Zhengyu,
>>>     > >
>>>     > > I can't review this in detail as I'm not familiar with NMT
>>> workings, but
>>>     > > have a couple of comments.
>>>     > >
>>>     > > On 16/11/2017 8:25 AM, Zhengyu Gu wrote:
>>>     > >
>>>     > > > This is Linux only enhancement for now, can be extended for
>>> other
>>>     > > > platforms. (See bug for details)
>>>     > > >
>>>     > >
>>>     > > I'm concerned about unnecessary platform skew. I added some
>>> comments to
>>>     > > the bug. I think the mincore trick can be used on Solaris and
>>> AIX as well -
>>>     > > but not on OS X / BSD. It may be that VirtualQuery can be
>>> used on Windows
>>>     > > to get the same information - but I'm unclear if the memory
>>> states support
>>>     > > what you're looking for. It would be good to extend this to
>>> other platforms
>>>     > > that will support it.
>>>     > >
>>>     >
>>>     > Just 5c:
>>>     >
>>>     > I think this is definitely useful even with only the Linux
>>> implementation
>>>     > provided.
>>>
>>>     +1
>>>
>>>     While plaform skew is a valid concern, I'm doubtful it's
>>> "unnecessary".
>>>     I think there is value to get one platform fixed and then look at
>>> other
>>>     platforms one by one as follow-up bugs. For some platforms it
>>> might be
>>>     hard to do (or undesirable) as Thomas says.
>>>
>>>     Thanks,
>>>     Severin
>>>
>>>
>>> I volunteer to implement this for windows if it can be done as
>>> non-time-critical follow up, if no-one else has stepped up in until
>>> then.
>>>
>>> ..Thomas
>>>
>>>
>>>      > On AIX we have commit-on-touch and hence mincore may actually
>>>     force commit
>>>      > the thread :P - would have to try it but have no high hopes.
>>> Well.
>>>      >
>>>      > On Windows this is possible, we already use VirtualQuery in our
>>>     fork to
>>>      > print out the commit-state of the current thread stack in case of
>>>     an error.
>>>      >
>>>      > ..Thomas
>>>      >
>>>      >
>>>      > > And on a minor note can you please correct the existing
>>>     spelling error in
>>>      > > get_stack_commited_bottom. :)
>>>      > >
>>>      > > Current implementation, thread stack is tracked when thread is
>>>     created,
>>>      > > > its available stack space is tagged as reserved and
>>> committed.
>>>      > > >
>>>      > > > However, this is not how stack works. Stack pages are
>>>     committed on
>>>      > > > demand, so this approach overstates its memory usage.
>>>      > > >
>>>      > > > This enhancement gathers thread stack usage at NMT query
>>>     time, so that it
>>>      > > > can report more accurate usage.
>>>      > > >
>>>      > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>>>     <https://bugs.openjdk.java.net/browse/JDK-8191369>
>>>      > > > Webrev:
>>>     http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html
>>>     <http://cr.openjdk.java.net/~zgu/8191369/webrev.00/index.html>
>>>      > > >
>>>      > > >
>>>      > > > Example:
>>>      > > >
>>>      > > > Summary shows thread stacks only use small faction of
>>>     reserved space.
>>>      > > >
>>>      > > > -                    Thread (reserved=41216KB,
>>> committed=632KB)
>>>      > > >                              (thread #41)
>>>      > > >                              (stack: reserved=41032KB,
>>>     committed=448KB)
>>>      > > >                              (malloc=137KB #222)
>>>      > > >                              (arena=47KB #76)
>>>      > > >
>>>      > > >
>>>      > > > Detail tracking shows stack guards and actually
>>>     used/committed stack area.
>>>      > > >
>>>      > > > [0x00007f66e95e7000 - 0x00007f66e96e8000] reserved 1028KB for
>>>     Thread
>>>      > > > Stack from
>>>      > > >      [0x00007f67c0b57d5c] JavaThread::run()+0x2c
>>>      > > >      [0x00007f67c08bbea2] thread_native_entry(Thread*)+0x112
>>>      > > >
>>>      > > >          [0x00007f66e95e7000 - 0x00007f66e95eb000] committed
>>>     16KB from
>>>      > > >              [0x00007f67c08b7319]
>>>     os::pd_create_stack_guard_pages(char*,
>>>      > > > unsigned long)+0x29
>>>      > > >              [0x00007f67c0b56851]
>>>     JavaThread::create_stack_guard_pages()
>>>      > > > [clone .part.125]+0xa1
>>>      > > >              [0x00007f67c0b57d6e] JavaThread::run()+0x3e
>>>      > > >              [0x00007f67c08bbea2]
>>>     thread_native_entry(Thread*)+0x112
>>>      > > >
>>>      > > >          [0x00007f66e96e7000 - 0x00007f66e96e8000]
>>> committed 4KB
>>>      > > >
>>>      > >
>>>      > > Can we capture this in a test so we can tell that the
>>> tracking has
>>>      > > improved?
>>>      > >
>>>      > > Thanks,
>>>      > > David
>>>      > > -----
>>>      > >
>>>      > > Thanks,
>>>      > > >
>>>      > > > -Zhengyu
>>>      > > >
>>>      > > >
>>>
>>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Dinn
Hi Zhengyu,


On 17/11/17 20:43, Zhengyu Gu wrote:

> There were two bugs that resulted NMT report incorrect thread stacks.
>
> 1) NMT did not handle overlapped regions correctly.
>
> 2) There are two issues with Linux: get_stack_commited_bottom()
>
>  * The method name is misleading, it does not return *committed* bottom,
> but *mapped* bottom, where address space is valid, not necessary committed.
>    I renamed this function to get_stack_mapped_bottom() with option to
> return committed bottom with proper parameter.
>
>  * The binary search algorithm mishandled edge cases.

Hmm, yes looking through the mincore man page again you are right that
you need to check the returned bit vector to see if a vmem page is
committed to a physical page.

I'm not sure about the logic of the test when you hit the the
ENOMEM/uncommited case.

+    mincore_return_value = mincore(test_addr, page_sz, vec);
+
+    if (mincore_return_value == -1 || (committed_only && (vec[0] &
0x01) == 0)) {
+      // Page is not mapped/committed go up
+      // to find first mapped/committed page
+      if (errno != EAGAIN || (committed_only && (vec[0] & 0x01) == 0)) {
+        assert(mincore_return_value != -1 || errno == ENOMEM,
"Unexpected mincore errno");

Should that not be

  +      if ((mincore_return_value != -1 && errno != EAGAIN) ||
(committed_only && (vec[0] & 0x01) == 0)) {

A successful call to mincore is not guaranteed to reset errno and it
might by chance already have value EAGAIN before the call to mincore. If
we happened to hit a mapped, committed page first time then the vec bit
will be 1 and the code will loop and retest the same address. Unlikely
but possible?


The adjustment to the edge case logic now looks correct.

> The test case for verifying fixes for above two issues can be found
> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c

Yes, that verifies the edge case handling e.g. I ran it with 256 pages
and with 127/8/9 mapped and it gave the right results.

> As mentioned in early email, new patch also contains implementation for
> Windows.
>
> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>
> Test:
>
>   Reran hotspot_tier1 tests on Windows x64 and Linux x64 (fastdebug and
> release)
I have no idea about the validity of the Windows fix but the Linux one
looks good modulo that if condition.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
123