Re: RFR: 8259886 : Improve SSL session cache performance and scalability [v6]

djelinski
> Under certain load, MemoryCache operations take a substantial fraction of the time needed to complete SSL handshakes. This series of patches improves performance characteristics of MemoryCache, at the cost of a functional change: expired entries are no longer guaranteed to be removed before live ones. Unused entries are still removed before used ones, and cache performance no longer depends on its capacity.
>
> The first patch in the series contains a benchmark that can be run with `make test TEST="micro:CacheBench"`.
> Baseline results before any MemoryCache changes:
> Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
> CacheBench.put   20480      86400  avgt   25    83.653 ±  6.269  us/op
> CacheBench.put   20480          0  avgt   25     0.107 ±  0.001  us/op
> CacheBench.put  204800      86400  avgt   25  2057.781 ± 35.942  us/op
> CacheBench.put  204800          0  avgt   25     0.108 ±  0.001  us/op
> There's a nonlinear performance drop between 20480 and 204800 entries, probably attributable to CPU cache thrashing. Beyond 204800 entries the cache scales more linearly.
>
> Benchmark results after the 2nd and 3rd patches are pretty similar, so I'll only copy one:
> Benchmark       (size)  (timeout)  Mode  Cnt  Score   Error  Units
> CacheBench.put   20480      86400  avgt   25  0.146 ± 0.002  us/op
> CacheBench.put   20480          0  avgt   25  0.108 ± 0.002  us/op
> CacheBench.put  204800      86400  avgt   25  0.150 ± 0.001  us/op
> CacheBench.put  204800          0  avgt   25  0.106 ± 0.001  us/op
> The third patch improves worst-case times on a mostly idle cache by scattering removal of expired entries over multiple `put` calls. It does not affect performance of an overloaded cache.
>
> The 4th patch removes all code that clears cached values before handing them over to the GC. [This comment](https://github.com/openjdk/jdk/commit/5859a0320334bfb6b46b62eb16b4c387641f4a2a#diff-c6bd583a97fbc4f471621fee7eab37c63718cdb6932ce357fa403cfda4b32b6fL346) stated that clearing values was supposed to be a GC performance optimization. It wasn't. Benchmark results after that commit:
> Benchmark       (size)  (timeout)  Mode  Cnt  Score   Error  Units
> CacheBench.put   20480      86400  avgt   25  0.113 ± 0.001  us/op
> CacheBench.put   20480          0  avgt   25  0.075 ± 0.002  us/op
> CacheBench.put  204800      86400  avgt   25  0.116 ± 0.001  us/op
> CacheBench.put  204800          0  avgt   25  0.072 ± 0.001  us/op
> I wasn't expecting that much of an improvement, and don't know how to explain it.
>
> The 40ns difference between the cache with and without a timeout can be attributed to 2 `System.currentTimeMillis()` calls; they were pretty slow on my VM.
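
For readers who want a concrete picture of the trade-off described above, here is a minimal sketch. It is not the code from the pull request and not the JDK's MemoryCache; the class name `SketchCache`, its fields, and the limit of two entries inspected per `put` are all assumptions made for illustration. It keeps entries in access order, so unused entries are evicted before used ones, and each `put` does a bounded amount of cleanup instead of scanning the whole cache for expired entries. As a consequence, an expired entry that is not near the least recently used end may outlive a live one.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the JDK's MemoryCache. */
class SketchCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt; // 0 means "never expires"

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }

        boolean isExpired(long now) {
            return expiresAt != 0 && now >= expiresAt;
        }
    }

    private final int maxSize;         // must be at least 1
    private final long lifetimeMillis; // 0 disables expiry
    // accessOrder = true: iteration starts at the least recently used entry.
    private final LinkedHashMap<K, Entry<V>> map =
            new LinkedHashMap<>(16, 0.75f, true);

    SketchCache(int maxSize, long lifetimeMillis) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
        this.lifetimeMillis = lifetimeMillis;
    }

    synchronized void put(K key, V value) {
        // Skip the clock read entirely when expiry is disabled.
        long now = lifetimeMillis == 0 ? 0 : System.currentTimeMillis();
        long expiresAt = lifetimeMillis == 0 ? 0 : now + lifetimeMillis;
        map.put(key, new Entry<>(value, expiresAt));

        // Bounded cleanup: inspect at most two of the least recently used
        // entries, so the cost of put does not grow with the cache size.
        Iterator<Map.Entry<K, Entry<V>>> it = map.entrySet().iterator();
        for (int i = 0; i < 2 && it.hasNext(); i++) {
            Entry<V> eldest = it.next().getValue();
            if (map.size() > maxSize || eldest.isExpired(now)) {
                it.remove();
            } else {
                break;
            }
        }
    }

    synchronized V get(K key) {
        Entry<V> e = map.get(key); // also marks the entry as recently used
        if (e == null) {
            return null;
        }
        long now = lifetimeMillis == 0 ? 0 : System.currentTimeMillis();
        if (e.isExpired(now)) {
            map.remove(key);
            return null;
        }
        return e.value;
    }

    synchronized int size() {
        return map.size();
    }
}
```

With a capacity of 2 and no timeout, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `b` is then the least recently used entry. In this sketch a zero timeout also skips the clock reads, which mirrors the observation above that the timed configuration pays for `System.currentTimeMillis()` calls.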

djelinski has updated the pull request incrementally with one additional commit since the last revision:

  Update copyright year

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2255/files
  - new: https://git.openjdk.java.net/jdk/pull/2255/files/f9bc386a..d5c39a45

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2255&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2255&range=04-05

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2255.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2255/head:pull/2255

PR: https://git.openjdk.java.net/jdk/pull/2255