Determining "GC" memory area size


Glyn Normington
tl;dr: How is the size of the "GC" memory area (as reported by NMT)
determined?

The open source project I work on runs Java applications in Linux
containers, and processes are killed when the container's defined memory
size, essentially in terms of pages of RAM, is exceeded. When this happens,
users get no useful feedback on whether the heap, metaspace, etc. is the
problem or what to do about it.

We have two components which attempt to help with this situation:

1. Java memory calculator (https://github.com/cloudfoundry/java-buildpack-memory-calculator)

This takes the container memory size together with an estimate of the
number of loaded classes and threads consumed by the application and sets
various JVM memory settings such as heap, metaspace, etc. The goal is to
prevent the JVM from using more memory than the container has available, so
that container OOM does not occur and if the JVM runs out of memory, it
does so in a diagnosable way.

2. jvmkill JVMTI agent (https://github.com/cloudfoundry/jvmkill)

When the JVM hits a resource exhaustion event, due either to lack of memory
or threads, this agent prints various diagnostics to help the user decide
what needs to be done to avoid the problem in future. If a threshold is
exceeded, the agent then kills the JVM; otherwise it returns to the JVM and
allows OutOfMemoryError to be thrown.

One of our users recently found (see [1] below for details) that the memory
calculator is not taking the "GC" memory area into account. Consequently, a
JVM can exceed the container's memory size which means the user doesn't get
any helpful diagnostics from either jvmkill or an OutOfMemoryError. Using
NMT, the user observed that "GC" memory seems to be about 5% of the heap
size for heaps of a few GB in size.
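
To illustrate the gap described above, here is a minimal sketch of how a heap budget changes once a "GC" area proportional to the heap is reserved. It is not the memory calculator's actual algorithm: the class HeapBudget, the method maxHeapBytes, the 512 MiB figure for non-heap memory and the 4 GiB container are all hypothetical, and the 5% fraction is only the NMT observation reported above.

```java
final class HeapBudget {

    /**
     * Largest heap size (in bytes) such that
     *   heap + gcFraction * heap + otherBytes <= containerBytes.
     * Hypothetical helper for illustration only.
     */
    static long maxHeapBytes(long containerBytes, long otherBytes, double gcFraction) {
        if (gcFraction < 0.0) {
            throw new IllegalArgumentException("gcFraction must be non-negative");
        }
        long available = containerBytes - otherBytes;
        if (available <= 0) {
            return 0L; // nothing left for the heap
        }
        // Solve heap * (1 + gcFraction) <= available for heap.
        return (long) Math.floor(available / (1.0 + gcFraction));
    }

    public static void main(String[] args) {
        final long mib = 1024L * 1024L;
        long container = 4096L * mib; // assumed container limit
        long other = 512L * mib;      // assumed metaspace, thread stacks, etc.

        long ignoringGc = maxHeapBytes(container, other, 0.0);
        long reservingGc = maxHeapBytes(container, other, 0.05);

        System.out.printf("heap ignoring GC area: %d MiB%n", ignoringGc / mib);
        System.out.printf("heap reserving 5%% for GC area: %d MiB%n", reservingGc / mib);
    }
}
```

The difference between the two results is the amount by which a calculator that ignores the "GC" area could overshoot the container limit under these assumptions.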

Can anyone here tell me how the GC memory area size is determined? If there
is documentation, so much the better as we'd prefer not to depend on code
details that might flux arbitrarily.

--
Regards,
Glyn
PS. Apologies for cross-posting this from hotspot-gc-use - got no reply
there.

Re: Determining "GC" memory area size

Aleksey Shipilev-4
On 11/10/2017 06:06 PM, Glyn Normington wrote:
> Can anyone here tell me how the GC memory area size is determined? If there
> is documentation, so much the better as we'd prefer not to depend on code
> details that might flux arbitrarily.

NMT reports all allocations from GC code (with mtGC tag) as "GC". This includes, notably, Java heap
itself, and all auxiliary GC data structures. If you grep the OpenJDK source for "mtGC", you can get
the idea what allocations are tagged. Also, "detailed" NMT mode would give you allocation stacks,
which can also give some insight what those particular allocated bits are coming from.

The size of GC data structures is dependent on GC in question (and, quite probably, depends on GC
implementation *version*, with swings back and forth), and thus is not trivially deducible. For
simpler collectors, like Serial and Parallel, it would probably be dominated by Card Table (1/512-th
of heap size). In G1, it is most probably mark bitmaps (2/64-th of heap size), plus other
fine-grained remembered sets (which might get large, but maybe there is a high bound?). Etc.
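
As a rough worked example of the two fractions mentioned above, the sketch below computes the card table size (1/512 of the heap) and the size of the two G1 mark bitmaps (2/64 of the heap) for a few heap sizes. The class and method names are hypothetical, and the figures cover only these dominant structures, not remembered sets or anything else tagged mtGC.

```java
final class GcStructureFractions {

    /** Card table size for Serial/Parallel: 1/512 of the heap. */
    static long cardTableBytes(long heapBytes) {
        return heapBytes / 512L;
    }

    /** Two G1 mark bitmaps, each 1/64 of the heap. */
    static long g1MarkBitmapsBytes(long heapBytes) {
        return 2L * (heapBytes / 64L);
    }

    public static void main(String[] args) {
        final long mib = 1024L * 1024L;
        long[] heapsMiB = {1024L, 2048L, 4096L}; // example heap sizes only

        for (long heapMiB : heapsMiB) {
            long heap = heapMiB * mib;
            System.out.printf("heap %d MiB: card table %d MiB, G1 mark bitmaps %d MiB%n",
                    heapMiB,
                    cardTableBytes(heap) / mib,
                    g1MarkBitmapsBytes(heap) / mib);
        }
    }
}
```

The two fractions differ by a factor of 16, which is one reason a single collector-independent percentage is hard to state.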

Thanks,
-Aleksey



Re: Determining "GC" memory area size

Zhengyu Gu-2
In reply to this post by Glyn Normington
Hi Glyn,

>
> Can anyone here tell me how the GC memory area size is determined? If there
> is documentation, so much the better as we'd prefer not to depend on code
> details that might flux arbitrarily.
>

GC memory is mainly the data structures used by the GC runtime. It varies
with the collector used, the size of the Java heap, the number of GC
threads, etc. and, of course, with the application itself.

Some are *fixed* costs, which can be estimated. For example, G1 uses two
marking bitmaps, each costing 1/64 of the heap size (assuming default
object alignment).

Some are *semi-fixed*, e.g. each task queue has a fixed cost of about 1M on
a 64-bit VM, but it can overflow. The number of task queues is
proportional to the number of GC threads.

Then there are factors from the application itself, such as object mutation
rate, inter-generation/region references, etc.


I don't see a single formula for estimating GC memory size. If you are
using G1, the biggest overhead comes from 2 bitmaps (1/32 * heap size).
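
The fixed and semi-fixed G1 costs described above can be combined into a rough floor, sketched below. This is only an illustration under stated assumptions: one task queue per GC thread, "1M" read as 1 MiB, no queue overflow, and no remembered sets or other application-dependent structures. The class and method names are hypothetical.

```java
final class G1GcAreaEstimate {

    static final long MIB = 1024L * 1024L;

    /**
     * Rough floor for G1 "GC" memory: two mark bitmaps (1/64 of the heap each)
     * plus about 1 MiB per task queue, assuming one queue per GC thread.
     * Application-dependent costs are deliberately left out.
     */
    static long fixedPlusSemiFixedBytes(long heapBytes, int gcThreads) {
        if (gcThreads < 0) {
            throw new IllegalArgumentException("gcThreads must be non-negative");
        }
        long markBitmaps = 2L * (heapBytes / 64L);
        long taskQueues = (long) gcThreads * MIB;
        return markBitmaps + taskQueues;
    }

    public static void main(String[] args) {
        long heap = 4096L * MIB; // example heap size
        int gcThreads = 8;       // example GC thread count

        long estimate = fixedPlusSemiFixedBytes(heap, gcThreads);
        System.out.printf("estimated floor: %d MiB (%.2f%% of heap)%n",
                estimate / MIB, 100.0 * estimate / heap);
    }
}
```

The bitmaps alone account for 1/32, or 3.125%, of the heap, so a figure of about 5% observed with NMT would be consistent with the remainder coming from the semi-fixed and application-dependent parts.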

Thanks,

-Zhengyu




Re: Determining "GC" memory area size

Zhengyu Gu-2
In reply to this post by Aleksey Shipilev-4


On 11/10/2017 01:47 PM, Aleksey Shipilev wrote:
> On 11/10/2017 06:06 PM, Glyn Normington wrote:
>> Can anyone here tell me how the GC memory area size is determined? If there
>> is documentation, so much the better as we'd prefer not to depend on code
>> details that might flux arbitrarily.
>
> NMT reports all allocations from GC code (with mtGC tag) as "GC". This includes, notably, Java heap
> itself, and all auxiliary GC data structures. If you grep the OpenJDK source for "mtGC", you can get
> the idea what allocations are tagged. Also, "detailed" NMT mode would give you allocation stacks,
> which can also give some insight what those particular allocated bits are coming from.

Java heap actually is *not* tagged as GC memory, but Java heap.

-Zhengyu


Re: Determining "GC" memory area size

Aleksey Shipilev-4
On 11/10/2017 08:11 PM, Zhengyu Gu wrote:
> Java heap actually is *not* tagged as GC memory, but Java heap.

Right. Sorry for the confusion.

-Aleksey


Re: Determining "GC" memory area size

yumin qi-2
In reply to this post by Zhengyu Gu-2
Hi Glyn,
http://openjdk.java.net/jeps/8182070
Once this is implemented, Java will be able to obtain information about the
container and configure its own parameters for running in it, making it more
container friendly.

Yumin


Re: Determining "GC" memory area size

Glyn Normington
Thanks Aleksey, Zhengyu, and Yumin: that's very helpful. I'll start another
thread on what I believe is an additional requirement for JEP 8182070.

Regards,
Glyn
