RFR: JDK-8261644: NMT: Simplifications and cleanups

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR: JDK-8261644: NMT: Simplifications and cleanups

Thomas Stuefe
Hi,

may I please have reviews for this RFE?

While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.

- de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)

- `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.

- As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.

- removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user

- made `AllocationSite` immutable

- removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed

- removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`

- There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that.

- `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.

- Also removed the `toSkip` default value. Instead, I added an explicit default constructor.

- Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler.

-------------

Commit messages:
 - start

Changes: https://git.openjdk.java.net/jdk/pull/2539/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261644
  Stats: 325 lines in 10 files changed: 98 ins; 177 del; 50 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2539.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2539/head:pull/2539

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups

Daniel D.Daugherty
On Fri, 12 Feb 2021 07:16:45 GMT, Thomas Stuefe <[hidden email]> wrote:

> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that.
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler.

@tstuefe - Thanks for keeping cleanups separated from bug fixes. That always
makes the PR reviews easier. I don't see any information about what testing was
done on this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups

Thomas Stuefe
In reply to this post by Thomas Stuefe
On Fri, 12 Feb 2021 07:16:45 GMT, Thomas Stuefe <[hidden email]> wrote:

> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that.
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler.
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

> @tstuefe - Thanks for keeping cleanups separated from bug fixes. That always
> makes the PR reviews easier. I don't see any information about what testing was
> done on this PR.

Thanks Dan. I updated the description. I'm very careful after JDK-8261520. Manual tests went through, nightlies at SAP are scheduled. BTW this patch is a preparation for the fix for JDK-8261520.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups

Coleen Phillimore-3
In reply to this post by Thomas Stuefe
On Fri, 12 Feb 2021 07:16:45 GMT, Thomas Stuefe <[hidden email]> wrote:

> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that.
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

These changes look great. Thank you!

src/hotspot/share/services/mallocSiteTable.cpp line 141:

> 139:   while (head != NULL && (*pos_idx) <= MAX_BUCKET_LENGTH) {
> 140:     MallocSite* site = head->data();
> 141:     if (site->equals(key, flags)) {

Does this now assert when it used to just return false if memflags didn't match?

src/hotspot/share/services/threadStackTracker.hpp line 44:

> 42:     _base(base),
> 43:     _size(size)
> 44:   {}

nit: can you put {} on the same line as _size ?

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups

Thomas Stuefe
On Fri, 12 Feb 2021 22:17:43 GMT, Coleen Phillimore <[hidden email]> wrote:

>> Hi,
>>
>> may I please have reviews for this RFE?
>>
>> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>>
>> There should be no behavioral changes in this patch.
>>
>> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>>
>> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>>
>> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>>
>> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>>
>> - made `AllocationSite` immutable
>>
>> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>>
>> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>>
>> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>>
>> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>>
>> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>>
>> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>>
>> ----
>> Tests:
>> - github GA
>> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
>> - Full nightlies at SAP are scheduled
>
> src/hotspot/share/services/mallocSiteTable.cpp line 141:
>
>> 139:   while (head != NULL && (*pos_idx) <= MAX_BUCKET_LENGTH) {
>> 140:     MallocSite* site = head->data();
>> 141:     if (site->equals(key, flags)) {
>
> Does this now assert when it used to just return false if memflags didn't match?

Yes, and it seems this actually fired tonight on s390x. Which indicates that the call stack frame skipping is off and we do skip too much. I opened https://bugs.openjdk.java.net/browse/JDK-8261556 to track this, and here I will restore the old behavior for now. That is also cleaner since this RFE should not have brought behavioral changes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v2]

Thomas Stuefe
In reply to this post by Thomas Stuefe
> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> There should be no behavioral changes in this patch.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  Restore old comparison behavior in MallocSiteTable

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2539/files
  - new: https://git.openjdk.java.net/jdk/pull/2539/files/892759ea..2b06cdf6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=00-01

  Stats: 11 lines in 2 files changed: 0 ins; 9 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2539.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2539/head:pull/2539

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v3]

Thomas Stuefe
In reply to this post by Thomas Stuefe
> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> There should be no behavioral changes in this patch.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  Constructor brackets

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2539/files
  - new: https://git.openjdk.java.net/jdk/pull/2539/files/2b06cdf6..74155d15

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2539.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2539/head:pull/2539

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v3]

Thomas Stuefe
In reply to this post by Coleen Phillimore-3
On Fri, 12 Feb 2021 22:21:35 GMT, Coleen Phillimore <[hidden email]> wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Constructor brackets
>
> src/hotspot/share/services/threadStackTracker.hpp line 44:
>
>> 42:     _base(base),
>> 43:     _size(size)
>> 44:   {}
>
> nit: can you put {} on the same line as _size ?

Done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v4]

Thomas Stuefe
In reply to this post by Thomas Stuefe
> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> There should be no behavioral changes in this patch.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  reduce diff

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2539/files
  - new: https://git.openjdk.java.net/jdk/pull/2539/files/74155d15..0e126c31

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2539&range=02-03

  Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2539.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2539/head:pull/2539

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v4]

Zhengyu Gu-3
On Sat, 13 Feb 2021 05:43:56 GMT, Thomas Stuefe <[hidden email]> wrote:

>> Hi,
>>
>> may I please have reviews for this RFE?
>>
>> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>>
>> There should be no behavioral changes in this patch.
>>
>> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>>
>> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>>
>> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>>
>> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>>
>> - made `AllocationSite` immutable
>>
>> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>>
>> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>>
>> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>>
>> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>>
>> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>>
>> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>>
>> ----
>> Tests:
>> - github GA
>> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
>> - Full nightlies at SAP are scheduled
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
>   reduce diff

Looks good to me

-------------

Marked as reviewed by zgu (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v4]

Coleen Phillimore-3
In reply to this post by Thomas Stuefe
On Sat, 13 Feb 2021 05:43:56 GMT, Thomas Stuefe <[hidden email]> wrote:

>> Hi,
>>
>> may I please have reviews for this RFE?
>>
>> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>>
>> There should be no behavioral changes in this patch.
>>
>> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>>
>> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>>
>> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>>
>> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>>
>> - made `AllocationSite` immutable
>>
>> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>>
>> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>>
>> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>>
>> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>>
>> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>>
>> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>>
>> ----
>> Tests:
>> - github GA
>> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
>> - Full nightlies at SAP are scheduled
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
>   reduce diff

Looks good!

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Re: RFR: JDK-8261644: NMT: Simplifications and cleanups [v4]

Thomas Stuefe
On Fri, 19 Feb 2021 02:43:14 GMT, Coleen Phillimore <[hidden email]> wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   reduce diff
>
> Looks good!

Thanks Coleen and Zhengyu!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539
Reply | Threaded
Open this post in threaded view
|

Integrated: JDK-8261644: NMT: Simplifications and cleanups

Thomas Stuefe
In reply to this post by Thomas Stuefe
On Fri, 12 Feb 2021 07:16:45 GMT, Thomas Stuefe <[hidden email]> wrote:

> Hi,
>
> may I please have reviews for this RFE?
>
> While working on NMT I found a number of possible cleanups and simplifications. I avoided mixing these cleanups with fixed and instead put them into this cleanup RFE.
>
> There should be no behavioral changes in this patch.
>
> - de-templatize `AllocationSite<E>` since E was used as simple data holder for child classes; the same effect can be had with traditional inheritance with less and clearer code (also IDEs get less confused)
>
> - `AllocationSite` child classes `SimpleThreadStackSite`, `VirtualMemoryAllocationSite`, `MallocSite` were simplified.
>
> - As for `SimpleThreadStackSite`, we can get rid of the separate data holder class `SimpleThreadStack` entirely by merging its members directly into `SimpleThreadStackSite`. In theory we could do the same for the data holder classes `MemoryCounter` and `VirtualMemory` for `MallocSite` and `VirtualMemoryAllocationSite` too but this would cause larger ripples so I stopped there.
>
> - removed the SimpleThreadStackSite(address base, size_t size) constructor (the one not taking a call stack) by slightly rewriting its sole user
>
> - made `AllocationSite` immutable
>
> - removed unused default constructors from `MallocSite` and `MallocSiteHashTableEntry` since they were not needed
>
> - removed unused methods `set_callsite()`, `hash()`, `equals()` from `MallocSiteHashTableEntry`
>
> - There was a subtle incorrectness where `AllocationSite::equals()` would only compare callstack and disregard the MEMFLAGS member. Theoretically, if two callstacks end with the same lowest frame, they should always reference the same single allocation, so that's okay. But if the call stack capturing was not precise enough (eg skipping too many low frames) we may accidentally lump several allocation sites together which could have different MEMFLAGS. I added an assert to check that. (_Update: seems this assert really fires on s390x, so this is a real problem. I opened [1] to track this and restored the old behavior._).
>
> - `NativeCallStack`: Removed the `fillStack` argument from the first constructor to avoid having to evaluate it in this hot constructor. Its true in almost all cases.
>
> - Also removed the `toSkip` default value. Instead, I added an explicit default constructor.
>
> - Moved the malloc site table tuning statistics printing from memtracker.cpp down into a new function `MallocSiteTable::print_tuning_statistics()`. When implemented inside `MallocSiteTable`, that coding does not need a walker object anymore and becomes a lot simpler. In particular, we don't have to rely on implicit knowledge about walking order, which made the code complex and was vulnerable against subtle errors. New code is more compact and simpler. Before removing the old implementation, I ran both statistics side by side for a couple of scenarios (eg really bad hash code implementations) and the output was identical.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8261556
>
> ----
> Tests:
> - github GA
> - manual NMT jtreg tests (including the currently disabled runtime/NMT/CheckForProperDetailStackTrace.java)
> - Full nightlies at SAP are scheduled

This pull request has now been integrated.

Changeset: 5caf686c
Author:    Thomas Stuefe <[hidden email]>
URL:       https://git.openjdk.java.net/jdk/commit/5caf686c
Stats:     312 lines in 10 files changed: 89 ins; 178 del; 45 mod

8261644: NMT: Simplifications and cleanups

Reviewed-by: coleenp, zgu

-------------

PR: https://git.openjdk.java.net/jdk/pull/2539