large memory GC question


Alan Stange-2
Hello all,

I was experimenting a bit with the JVM on a large memory system to which I have access, and I have a few questions.

The machine in question is a 72-CPU E7-8895 x64 box running Solaris 11.3 with 1536GB of RAM. The machine has been up for some time and is idle, with plenty of free memory and a lot in use by ZFS file system caches.

So I wrote a simple Java program that populates a large array with Strings, each 512 chars long (i.e. about 1K including the Java object overhead), and then randomly replaces elements of the array with new String instances. I could run this program up to a heap size of about 350GB (jdk-9 G1GC) before the JVM would throw an error that it was unable to mmap a large memory region of 400GB.

So I have a few questions/comments:

1) Clearly G1GC in jdk9 is much better than G1GC in jdk8. I suspect you know this already, but even in the simplest testing it seemed clear that much has been improved.


2) Is it really the case that the GC requires a single large mmap block to function? This seems somewhat alarming to me, as it's not clear how the OS is meant to


3) Is there a programmatic way, via some method call in Runtime.getRuntime() perhaps, to collect the total CPU time used by GC in the current process?


4) I have access to this machine for some limited testing, so I'm hoping to run a JVM at 1TB of heap size....
 

Here's the basic code block in question. sizeMB is 256 * 1024, so the code is targeted to consume a bit more than 256GB.

    public void build() {
        System.out.println("build");
        int size = sizeMB * 1024;
        list = new ArrayList<>(sizeMB * 1024);

        int count = 0;
        for (int i = 0; i < size; i++)
            list.add(i, getString());
    }
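For context, a self-contained, scaled-down sketch of what such a test program might look like. The `getString()` helper and the surrounding class are reconstructions from the description above (512-char Strings, random replacement), not the original source; `sizeMB` is shrunk so it runs anywhere:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class GCScaleSketch {
    // Scaled down for illustration; the original run used sizeMB = 256 * 1024.
    static final int sizeMB = 1;
    static List<String> list;
    static final Random rnd = new Random();

    // Assumption: getString() returns a 512-char String
    // (roughly 1K per element once object overhead is included).
    static String getString() {
        char[] chars = new char[512];
        Arrays.fill(chars, 'x');
        return new String(chars);
    }

    public static void build() {
        System.out.println("build");
        int size = sizeMB * 1024;
        list = new ArrayList<>(size);
        for (int i = 0; i < size; i++)
            list.add(getString());
    }

    // Randomly replace elements with new String instances, as described above,
    // so old Strings become garbage and keep the collector busy.
    public static void churn(int iterations) {
        for (int i = 0; i < iterations; i++)
            list.set(rnd.nextInt(list.size()), getString());
    }

    public static void main(String[] args) {
        build();
        churn(10_000);
        System.out.println("elements: " + list.size());
    }
}
```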

and here is how it fails:


[2.465s][info][gc] Using G1
build
[21.050s][info][gc] GC(0) Pause Young (G1 Evacuation Pause) 8608M->8585M(131072M) 13948.694ms
[23.073s][info][gc] GC(1) Pause Young (G1 Evacuation Pause) 14281M->14425M(131072M) 669.327ms
[25.120s][info][gc] GC(2) Pause Young (G1 Evacuation Pause) 20121M->20528M(131072M) 684.821ms
[27.261s][info][gc] GC(3) Pause Young (G1 Evacuation Pause) 26224M->26624M(131072M) 728.889ms
[34.630s][info][gc] GC(4) Pause Young (G1 Evacuation Pause) 32320M->32704M(393216M) 5983.418ms
[55.567s][info][gc] GC(5) Pause Young (G1 Evacuation Pause) 51520M->51888M(393216M) 9636.493ms
[74.421s][info][gc] GC(6) Pause Young (G1 Evacuation Pause) 69072M->69456M(393216M) 14334.621ms
[85.382s][info][gc] GC(7) Pause Young (G1 Evacuation Pause) 86640M->87056M(393216M) 6535.972ms
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 365072220160 bytes for committing reserved memory.
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f2b40000000, 365072220160, 0) failed; error='Resource unavailable, try again' (errno=11)
# An error report file with more information is saved as:
# /home/stange/IdeaProjects/GCScale/hs_err_pid23974.log


The JVM is run with -Xmx1234g -Xms128g -verbose:gc -XX:+UseG1GC



Re: large memory GC question

Thomas Schatzl
Hi,

On Thu, 2016-12-22 at 19:50 -0500, Alan Stange wrote:

> Hello all,
>
> I was experimenting a bit with the JVM on a large memory system to
> which I have access, and I have a few questions.
>
> The machine in question is a 72CPU E7-8895 x64 box running Solaris
> 11.3 with 1536GB of ram.    The machine has been up for some time, is
> idle with plenty of free memory and lots used up by ZFS file system
> caches.
>
> So I wrote a simple Java program to populate a large array with
> Strings each 512 chars long (ie about 1K including the Java object
> overhead) and the randomly replace elements of the array with new
> String instances.   I could run this program up to size of about
> 350GB (jdk-9 G1GC) before the JVM would throw an error that it was
> unable to mmap large memory regions of 400GB in size.   
>
> So I have a few questions/comments:
>
> 1) clearly G1GC in jdk9 is much better than G1GC in jdk8.  I suspect
> you know this already but in even the simplest testing it seemed
> clear that much has been improved.

Thanks :) There has been some work on better supporting huge heaps in
jdk9.

> 2) is it really the case that the GC requires a large single mmap
> block to function?  This seems somewhat alarming to me as it's not
> clear how the OS is meant to 

Yes, in Hotspot the Java heap must be contained in a single virtual
memory range at this time.

The TB range (2^40) still seems to be within the addressable memory space (48 bits on that E7?). If that were a problem, it would probably already fail when reserving the memory.

> 3) Is there a programmatic way, via some method call in
> Runtime.getRuntime() perhaps,  to collect the total cpu time used in
> GC in the current process?

Unfortunately, none that I know of. One could probably look for the GC threads in /proc/<pid>/lwp/<gc-threads>/lwpusage and sum their usage.

The GC threads in G1 are all named "G1 <something>", so I guess you could use that information to filter out these threads (I think the proc directory should contain the thread names somewhere). G1 at the moment allocates all its threads upfront, so you can get the GC thread ids once at startup.

I also filed JDK-8171970.
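One partial alternative from the standard library is `java.lang.management.GarbageCollectorMXBean`, which exposes cumulative collection counts and collection times per collector. Note the caveat: `getCollectionTime()` reports approximate accumulated elapsed (wall-clock) collection time, not the per-thread GC CPU time asked about above, so this is only a rough stand-in:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeProbe {
    public static void main(String[] args) {
        long totalMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is cumulative elapsed time in ms, or -1 if undefined.
            long t = gc.getCollectionTime();
            System.out.printf("%s: count=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), t);
            if (t > 0) totalMs += t;
        }
        System.out.println("total GC time (ms): " + totalMs);
    }
}
```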

> 4) I have access to this machine for some limited testing, so I'm
> hoping to run a JVM at 1TB of heap size....

We have been doing tests with the JVM with applications using multi-TB
heaps on Solaris/SPARC.

> Here's the basic code block in question.   sizeMB is 256 * 1024, so
> the code is targetted to consume a bit more than 256GB
>
>     public void build() {
>         System.out.println("build");
>         int size = sizeMB * 1024;
>         list = new ArrayList<>(sizeMB * 1024);
>
>         int count = 0;
>         for (int i = 0; i < size; i++)
>             list.add(i, getString());
>     }
>
> and here is how it fails:
>
>
> [2.465s][info][gc] Using G1
> build
> [21.050s][info][gc] GC(0) Pause Young (G1 Evacuation Pause) 8608M->8585M(131072M) 13948.694ms
> [23.073s][info][gc] GC(1) Pause Young (G1 Evacuation Pause) 14281M->14425M(131072M) 669.327ms
> [25.120s][info][gc] GC(2) Pause Young (G1 Evacuation Pause) 20121M->20528M(131072M) 684.821ms
> [27.261s][info][gc] GC(3) Pause Young (G1 Evacuation Pause) 26224M->26624M(131072M) 728.889ms
> [34.630s][info][gc] GC(4) Pause Young (G1 Evacuation Pause) 32320M->32704M(393216M) 5983.418ms
> [55.567s][info][gc] GC(5) Pause Young (G1 Evacuation Pause) 51520M->51888M(393216M) 9636.493ms
> [74.421s][info][gc] GC(6) Pause Young (G1 Evacuation Pause) 69072M->69456M(393216M) 14334.621ms
> [85.382s][info][gc] GC(7) Pause Young (G1 Evacuation Pause) 86640M->87056M(393216M) 6535.972ms
> #
> # There is insufficient memory for the Java Runtime Environment to
> continue.
> # Native memory allocation (mmap) failed to map 365072220160 bytes
> for committing reserved memory.
> Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f2b40000000, 365072220160, 0) failed; error='Resource unavailable, try again' (errno=11)
> # An error report file with more information is saved as:
> # /home/stange/IdeaProjects/GCScale/hs_err_pid23974.log
>
>
> The JVM is run with -Xmx1234g -Xms128g -verbose:gc -XX:+UseG1GC

Does allocating all memory upfront (with -Xms == -Xmx) work?

Maybe there is a problem with the configuration of swap space? Solaris policies about how and when committing memory fails (and what counts against current memory usage) differ from other OSes, particularly when the configured swap does not completely cover available memory. Unfortunately I am not an expert in configuring Solaris, so I can't help a lot here.

Thanks,
  Thomas


Re: large memory GC question

Alan Stange-2
Hello,

Thank you for your quick response.

I did a few more experiments and determined that the issue was the ZFS ARC cache consuming essentially all the memory on the system. We used a script from Oracle to set user_reserve_hint_pct, forcing the ARC cache down to a smaller size, and the test code was then able to run with a >1TB heap size.

Which I guess leads to another issue we've observed on other, smaller Solaris systems. Every so often a Java application will fail to start due to a lack of memory: memory that is being held by the ARC cache and not being freed. My experience with UFS back in the day was that the kernel would steal pages from the UFS file system cache and make them available to user applications when a process requested an expansion of its address space. It's really clear now that ZFS doesn't behave the same way. The large memory system under test had nothing running on it, all the memory was being held by the ZFS ARC, and the kernel didn't "take" any memory from the ARC cache.

Is this an expected interaction between the ZFS ARC and large memory requests from applications like Java that want memory in large chunks? I realize this is getting somewhat removed from HotSpot GC, but I would appreciate your thoughts.

-- Alan

PS Happy New Year everyone!






On Fri, Dec 23, 2016 at 8:40 AM, Thomas Schatzl <[hidden email]> wrote: