Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Gerard Ziemski-2
On Tue, 2 Feb 2021 11:59:08 GMT, Anton Kozlov <[hidden email]> wrote:

>> Please review the implementation of JEP 391: macOS/AArch64 Port.
>>
>> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64.
>>
>> Major changes are in:
>> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818)
>> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819)
>> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough.
>> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941)
>
> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   support macos_aarch64 in hsdis

Changes requested by gziemski (Committer).

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 390:

> 388:       store_and_inc(_to, from_obj, NativeStack::intSpace);
> 389:
> 390:       _num_int_args++;

`pass_byte()` and  `pass_short()` use only one `_num_int_args++;` after the `if else` but other methods use 2 of them inside `if else` branches.

We should be consistent.

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 404:

> 402:     } else {
> 403:       store_and_inc(_to, from_obj, NativeStack::longSpace);
> 404:       _num_int_args++;

`pass_byte()` and  `pass_short()` use only one `_num_int_args++;` after the `if else` but other methods use it 2 of them inside `if else`

We should be consistent.

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 418:

> 416:     } else {
> 417:       store_and_inc(_to, (*from_addr == 0) ? (intptr_t)NULL : (intptr_t) from_addr, wordSize);
> 418:       _num_int_args++;

`pass_byte()` and  `pass_short()` use only one `_num_int_args++;` after the `if else` but other methods use it 2 of them inside `if else`

We should be consistent.

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 433:

> 431:       store_and_inc(_to, from_obj, NativeStack::floatSpace);
> 432:
> 433:       _num_fp_args++;

`pass_byte()` and  `pass_short()` use only one `_num_int_args++;` after the `if else` but other methods use it 2 of them inside `if else`

We should be consistent.

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 448:

> 446:     } else {
> 447:       store_and_inc(_to, from_obj, NativeStack::doubleSpace);
> 448:       _num_fp_args++;

`pass_byte()` and  `pass_short()` use only one `_num_int_args++;` after the `if else` but other methods use it 2 of them inside `if else`

We should be consistent.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5271:

> 5269: //
> 5270: void MacroAssembler::get_thread(Register dst) {
> 5271:   RegSet saved_regs = RegSet::range(r0, r1) + BSD_ONLY(RegSet::range(r2, r17)) + lr - dst;

The comment needs to be updated, since on BSD we also seem to clobber r2,r17 ?

src/hotspot/os/posix/signals_posix.cpp line 1297:

> 1295:   kern_return_t kr;
> 1296:   kr = task_set_exception_ports(mach_task_self(),
> 1297:                                 EXC_MASK_BAD_ACCESS | EXC_MASK_BAD_INSTRUCTION | EXC_MASK_ARITHMETIC,

Could someone elaborate on why we need to add `EXC_MASK_BAD_INSTRUCTION` to the mask here?

src/hotspot/cpu/aarch64/vm_version_aarch64.hpp line 93:

> 91:     CPU_MARVELL   = 'V',
> 92:     CPU_INTEL     = 'i',
> 93:     CPU_APPLE     = 'a',

The `ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile` has 8538 pages, can we be more specific and point to the particular section of the document where the CPU codes are defined?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Daniel D.Daugherty
On Tue, 2 Feb 2021 11:59:08 GMT, Anton Kozlov <[hidden email]> wrote:

>> Please review the implementation of JEP 391: macOS/AArch64 Port.
>>
>> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64.
>>
>> Major changes are in:
>> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818)
>> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819)
>> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough.
>> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941)
>
> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   support macos_aarch64 in hsdis

For platform files that were copied from other ports to this port, if the file wasn't
changed I presume the copyright years are left alone. If the file required changes
for this port, I expect the year to be updated to 2021. How are you verifying that
these copyright years are being properly managed on the new files?

For the new W^X helpers, e.g., WXWriteFromExecSetter, a short comment
explaining why one was landed in a particular place would help reviewers.
Also see my comment about creating a new ThreadToNativeWithWXExecFromVM
helper.

I'm stopping my review with all the src/hotspot files done for now.

make/autoconf/flags.m4 line 140:

> 138:     else
> 139:       MACOSX_VERSION_MIN=10.12.0
> 140:     fi

Not something that needs to be addressed here, but these changes
illustrate that our collective use of macOSX/MACOSX/MacOSX names
are tied to the fact that the macOS major version number was at 10
for a very long time.

@magicus - Do we have an RFE to rename MACOSX or are we sticking
with it and evolving our interpretation of the 'X' from '10' to */splat/asterik?

make/common/NativeCompilation.gmk line 1178:

> 1176:                 endif
> 1177:                 # This only works if the openjdk_codesign identity is present on the system. Let
> 1178:                 # silently fail otherwise.

Might want to add a comment here:
    #  The '-f' option will replace an existing signature if one exists.

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 41:

> 39: #define __ _masm->
> 40:
> 41: //describe amount of space in bytes occupied by type on native stack

nit - needs a space after `//`

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp line 83:

> 81: // On macos/aarch64 native stack is packed, int/float are using only 4 bytes
> 82: // on stack. Natural alignment for types are still in place,
> 83: // for example double/long should be 8 bytes alligned

nit typo: s/alligned/aligned/
And sentence should end with a period.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 323:

> 321:   str(zr, Address(rthread, JavaThread::last_Java_pc_offset()));
> 322:
> 323:   str(zr, Address(rthread, JavaFrameAnchor::saved_fp_address_offset()));

I don't think this switch from `JavaThread::saved_fp_address_offset()`
to `JavaFrameAnchor::saved_fp_address_offset()` is correct since
`rthread` is still used and is a JavaThread*. The new code will give you:

    `rthread` + offset of the `saved_fp_address` field in a JavaFrameAnchor

The old code gave you:

    `rthread` + offset of the `saved_fp_address` field in the JavaFrameAnchor field in the JavaThread

Those are not the same things.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 31:

> 29: #include "asm/assembler.inline.hpp"
> 30: #include "oops/compressedOops.hpp"
> 31: #include "runtime/vm_version.hpp"

It's not clear why this include needed to be added.

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 810:

> 808: #ifdef __APPLE__
> 809:           // Less-than word types are stored one after another.
> 810:           // The code unable to handle this, bailout.

Perhaps:  // The code is unable to handle this so bailout.

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 837:

> 835: #ifdef __APPLE__
> 836:           // Less-than word types are stored one after another.
> 837:           // The code unable to handle this, bailout.

Perhaps: // The code is unable to handle this so bailout.

src/hotspot/os/aix/os_aix.cpp line 69:

> 67: #include "runtime/sharedRuntime.hpp"
> 68: #include "runtime/statSampler.hpp"
> 69: #include "runtime/stubRoutines.inline.hpp"

It is not clear why this include to inline-include change was made.

src/hotspot/os/linux/gc/z/zPhysicalMemoryBacking_linux.cpp line 37:

> 35: #include "runtime/init.hpp"
> 36: #include "runtime/os.hpp"
> 37: #include "runtime/stubRoutines.inline.hpp"

It is not clear why this include to inline-include change was made.

src/hotspot/os/windows/os_windows.cpp line 65:

> 63: #include "runtime/sharedRuntime.hpp"
> 64: #include "runtime/statSampler.hpp"
> 65: #include "runtime/stubRoutines.inline.hpp"

It is not clear why this include to inline-include change was made.

src/hotspot/os_cpu/bsd_aarch64/atomic_bsd_aarch64.hpp line 59:

> 57: }
> 58:
> 59: // __attribute__((unused)) on dest is to get rid of spurious GCC warnings.

Is the GCC comment appropriate? Does this file get built
with GCC or just Apple compilers?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 92:

> 90: # define DEFAULT_MAIN_THREAD_STACK_PAGES 2048
> 91: # define OS_X_10_9_0_KERNEL_MAJOR_VERSION 13
> 92: #endif

Does this work around for Maveriks (10.9.0) need to be here?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 103:

> 101:   // 10.5 UNIX03 member name prefixes
> 102:   #define DU3_PREFIX(s, m) __ ## s.__ ## m
> 103: # else

Does this work around for macOS 10.5 need to be here?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 195:

> 193: frame os::get_sender_for_C_frame(frame* fr) {
> 194:   return frame(fr->link(), fr->link(), fr->sender_pc());
> 195: }

Is this file going to be built by GCC or just macOS compilers?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 221:

> 219:     assert(sig == info->si_signo, "bad siginfo");
> 220:   }
> 221: */

Should this code be deleted? From here and from where it was copied
from if it is also commented out there...

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 363:

> 361:     address pc = os::Posix::ucontext_get_pc(uc);
> 362:
> 363:     if (pc != addr && uc->context_esr == 0x9200004F) { //TODO: figure out what this value means

Is this TODO going to be resolved by this port?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 374:

> 372:
> 373:       last_addr = (address) -1;
> 374:     } else if (pc == addr && uc->context_esr == 0x8200000f) { //TODO: figure out what this value means

Is this TODO going to be resolved by this port?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 435:

> 433: //    |                        |\  Java thread created by VM does not have glibc
> 434: //    |    glibc guard page    | - guard, attached Java thread usually has
> 435: //    |                        |/  1 glibc guard page.

Is this code going to be built by GCC (with glibc) or will only
macOS compilers and libraries be used?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 486:

> 484:         }
> 485:       }
> 486:     }

This appears to be a mix for Mavericks (10.9) and 10.12
work arounds. Is this code needed by this project?

src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 45:

> 43:   // Atomically copy 64 bits of data
> 44:   static void atomic_copy64(const volatile void *src, volatile void *dst) {
> 45:     *(jlong *) dst = *(const jlong *) src;

Is this construct actually atomic on aarch64?

src/hotspot/os_cpu/bsd_aarch64/thread_bsd_aarch64.cpp line 43:

> 41:   assert(Thread::current() == this, "caller must be current thread");
> 42:   return pd_get_top_frame(fr_addr, ucontext, isInJava);
> 43: }

Is AsyncGetCallTrace() being supported by this port?

src/hotspot/os_cpu/aix_ppc/os_aix_ppc.hpp line 43:

> 41:
> 42: private:
> 43:

'private' usually has once space in front of it.
Also why the blank line after it?

src/hotspot/os_cpu/aix_ppc/os_aix_ppc.hpp line 46:

> 44:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 45:
> 46: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/bsd_x86/os_bsd_x86.hpp line 41:

> 39:
> 40: private:
> 41:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/bsd_x86/os_bsd_x86.hpp line 44:

> 42:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 43:
> 44: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/bsd_zero/os_bsd_zero.hpp line 57:

> 55:
> 56: private:
> 57:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/bsd_zero/os_bsd_zero.hpp line 60:

> 58:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 59:
> 60: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.hpp line 43:

> 41:
> 42: private:
> 43:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.hpp line 46:

> 44:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 45:
> 46: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_arm/os_linux_arm.hpp line 74:

> 72:
> 73: private:
> 74:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_arm/os_linux_arm.hpp line 77:

> 75:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 76:
> 77: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_ppc/os_linux_ppc.hpp line 36:

> 34:
> 35: private:
> 36:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_ppc/os_linux_ppc.hpp line 39:

> 37:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 38:
> 39: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_s390/os_linux_s390.hpp line 35:

> 33:
> 34: private:
> 35:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_s390/os_linux_s390.hpp line 38:

> 36:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 37:
> 38: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_x86/os_linux_x86.hpp line 54:

> 52:
> 53: private:
> 54:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_x86/os_linux_x86.hpp line 57:

> 55:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 56:
> 57: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/linux_zero/os_linux_zero.hpp line 94:

> 92:
> 93: private:
> 94:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/linux_zero/os_linux_zero.hpp line 97:

> 95:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 96:
> 97: private:

I think this should 'public' (with one space in front of it).

src/hotspot/os_cpu/windows_aarch64/os_windows_aarch64.hpp line 37:

> 35:
> 36: private:
> 37:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/windows_aarch64/os_windows_aarch64.hpp line 40:

> 38:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 39:
> 40: public:

'public' usually has one space in front of it.

src/hotspot/os_cpu/windows_x86/os_windows_x86.hpp line 47:

> 45:
> 46: private:
> 47:

'private' usually has one space in front of it.
Also, why the blank line after it?

src/hotspot/os_cpu/windows_x86/os_windows_x86.hpp line 50:

> 48:   static void current_thread_enable_wx_impl(WXMode mode) { }
> 49:
> 50: public:

'public' usually has one space in front of it.

src/hotspot/share/gc/shared/oopStorage.cpp line 40:

> 38: #include "runtime/os.hpp"
> 39: #include "runtime/safepoint.hpp"
> 40: #include "runtime/stubRoutines.inline.hpp"

The reason for switching from include to inline-include is not clear.

src/hotspot/share/logging/logStream.hpp line 35:

> 33: class LogStream : public outputStream {
> 34:   // see test/hotspot/gtest/logging/test_logStream.cpp
> 35:   friend class LogStreamTest;

It's not clear why this change is made for this port.

src/hotspot/share/prims/jni.cpp line 3930:

> 3928:
> 3929:   // Go to the execute mode, the initial state of the thread on creation.
> 3930:   // Use os interface as the thread is not a java one anymore.

Perhaps: s/not a java one anymore./not a JavaThread anymore./

src/hotspot/share/prims/nativeEntryPoint.cpp line 45:

> 43:   guarantee(status == JNI_OK && !env->ExceptionOccurred(),
> 44:             "register jdk.internal.invoke.NativeEntryPoint natives");
> 45: JNI_END

I thought that jcheck caught a missing new-line?

src/hotspot/share/runtime/globals.hpp line 1855:

> 1853:                                                                             \
> 1854:   product(intx, UnguardOnExecutionViolation, 0,                             \
> 1855:           "Unguard page and retry on no-execute fault "                     \

Taking away the "(Win32 only)" and not replacing it with new text feels wrong.
I can't imagine that this flag works on any additional platforms except for
macos-aarch64.

src/hotspot/share/runtime/os.cpp line 58:

> 56: #include "runtime/osThread.hpp"
> 57: #include "runtime/sharedRuntime.hpp"
> 58: #include "runtime/stubRoutines.inline.hpp"

The reason for switching from include to inline include is not clear.

src/hotspot/share/runtime/objectMonitor.cpp line 52:

> 50: #include "runtime/safepointMechanism.inline.hpp"
> 51: #include "runtime/sharedRuntime.hpp"
> 52: #include "runtime/stubRoutines.inline.hpp"

The reason for switching from include to inline include is not clear.

src/hotspot/share/runtime/stubRoutines.cpp line 34:

> 32: #include "runtime/timerTrace.hpp"
> 33: #include "runtime/sharedRuntime.hpp"
> 34: #include "runtime/stubRoutines.inline.hpp"

The reason for switching from include to inline include is not clear.

src/hotspot/share/runtime/stubRoutines.inline.hpp line 1:

> 1: /*

NOW I understand the reason for switching from include to inline-include.
Is there a reason that this change is part of this project and not extracted
into a separate RFE. That would reduce the number of files touched by
this project.

src/hotspot/share/runtime/thread.cpp line 3991:

> 3989:       JavaThread* thread = JavaThread::current();
> 3990:       ThreadToNativeFromVM ttn(thread);
> 3991:       Thread::WXExecFromWriteSetter wx_exec;

I saw somewhere in this review a comment about why this new
WXExecFromWriteSetter helper isn't folded into ThreadToNativeFromVM
and I understand that not every current ThreadToNativeFromVM needs
the new helper. If the vast majority of the ThreadToNativeFromVM
locations need WXExecFromWriteSetter, then perhaps those locations
that currently have a ThreadToNativeFromVM followed by the new
WXExecFromWriteSetter should use a new helper:

    ThreadToNativeWithWXExecFromVM

so we'll see changes from:

    ThreadToNativeFromVM -> ThreadToNativeWithWXExecFromVM

where we need them and hopefully a short comment can be added
at the same time to explain the need for WXExec. This will allow us
to easily distinguish ThreadToNativeFromVM locations that DO NOT
need WXExec from those that DO need it.

src/java.base/macosx/native/libjli/java_md_macosx.m line 210:

> 208:     if (preferredJVM == NULL) {
> 209: #if defined(__i386__)
> 210:         preferredJVM = "client";

#if defined(__i386__)
        preferredJVM = "client";
Not your bug, but Oracle/OpenJDK never supported 32-bit __i386__ on macOS.

-------------

Changes requested by dcubed (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Gerard Ziemski-2
On Wed, 3 Feb 2021 09:14:24 GMT, Andrew Haley <[hidden email]> wrote:

>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5271:
>>
>>> 5269: //
>>> 5270: void MacroAssembler::get_thread(Register dst) {
>>> 5271:   RegSet saved_regs = RegSet::range(r0, r1) + BSD_ONLY(RegSet::range(r2, r17)) + lr - dst;
>>
>> The comment needs to be updated, since on BSD we also seem to clobber r2,r17 ?
>
> Surely this should be
>
>  saved_regs = RegSet::range(r0, r1) BSD_ONLY(+ RegSet::range(r2, r17)) + lr - dst;```
>
> Shouldn't it?

Interesting, I wonder why it has built successfully on Linux. I'm going to fix this under as JDK-8261072

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Gerard Ziemski-2
In reply to this post by Gerard Ziemski-2
On Wed, 3 Feb 2021 20:04:18 GMT, Gerard Ziemski <[hidden email]> wrote:

>> See comment above about `gdb`, the same applies to `lldb` today. The AArch64 backend uses `SIGILL` (~= `EXC_MASK_BAD_INSTRUCTION`) to initiate a deoptimization. Without this change you cannot continue debugging once you the debuggee receives `SIGILL`. This wasn't needed before as x86 doesn't use `SIGILL`.
>
> Part of the comment said `This work-around is not necessary for 10.5+, as CrashReporter no longer intercedes on caught fatal signals.` so I thought it was no longer needed, but it sounds like the part about `gdb` still applies then.
>
> We should update the comment to just say the `gdb` relevant part perhaps (and evaluate which of the EXC_MASK_BAD_ACCESS | EXC_MASK_BAD_INSTRUCTION | EXC_MASK_ARITHMETIC) we actually need for gdb:
>
>  `// gdb installs both standard BSD signal handlers, and mach exception`
>  `// handlers. By replacing the existing task exception handler, we disable gdb's mach`
>  `// exception handling, while leaving the standard BSD signal handlers functional.`
>
> Do you know if this also apply to `lldb` or is it `gdb` only specific? How do you run `gdb` on macOS nowadays anyhow?

To answer my own question, it seems that code is still needed on `x86_64` for `lldb` with `EXC_MASK_BAD_ACCESS` or we keep tripping over `EXC_BAD_ACCESS`

Remaining questions:

a) why we need `EXC_MASK_ARITHMETIC` ?
b) we hit `signal SIGSEGV` in debugger even with the code in place, any way to avoid that?
c) does `BSD aarch64` need only `EXC_MASK_BAD_INSTRUCTION` or does it need `EXC_MASK_BAD_ACCESS` as well?
d) can we `#ifdef` the `EXC_MASK_BAD_INSTRUCTION` part of the mask only to apply to `aarch64`?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Andrew Haley
In reply to this post by Daniel D.Daugherty
On 2/3/21 8:01 PM, Anton Kozlov wrote:
> The basic principle has not changed: when we execute JVM code (owned by libjvm.so, starting from JVM entry function), we switch to Write state. When we leave JVM to execute generated or JNI code, we switch to Executable state. I would like to highlight that JVM code does not mean the VM state of the java thread. After @stefank's suggestion, I could also drop a few W^X state switches, so now it should be more clear that switches are tied to JVM entry functions.

I haven't got a MacOS AArch64 system right now. Is it possible to
enable W^X in Linux in order to kick the tyres?

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Andrew Haley
In reply to this post by Daniel D.Daugherty
On 2/3/21 8:11 PM, Mikael Vidstedt wrote:
> Out of curiosity, have you looked at the performance of the W^X state transition? In particular I'm wondering if the cost is effectively negligible so doing it unconditionally on JVM entry is a no-brainer and just easier/cleaner than the alternatives, or if there are reasons to look at only doing the transition if/when needed (perhaps do it lazily and revert back to X when leaving the JVM?).

I has to change page permissions, doesn't it? That's never going to be
cheap, given that it requires a TLB teardown broadcast across all cores.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Daniel D.Daugherty
On Tue, 2 Feb 2021 22:56:55 GMT, Daniel D. Daugherty <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/share/runtime/stubRoutines.inline.hpp line 1:
>
>> 1: /*
>
> NOW I understand the reason for switching from include to inline-include.
> Is there a reason that this change is part of this project and not extracted
> into a separate RFE. That would reduce the number of files touched by
> this project.

Makes sense, thanks. I'll do this as JDK-8261075.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Vladimir Kempik-3
In reply to this post by Daniel D.Daugherty
On Thu, 4 Feb 2021 14:27:53 GMT, Andrew Haley <[hidden email]> wrote:

> > > You read my mind, Andrew. Unless, of course, it's optimized to leverage the fact that it's thread-specific..
> >
> >
> > it's thread-specific
> > https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon
> > > Because pthread_jit_write_protect_np changes only the current thread’s permissions, avoid accessing the same memory region from multiple threads. Giving multiple threads access to the same memory region opens up a potential attack vector, in which one thread has write access and another has executable access to the same region.
>
> Umm, so how does patching work? We don't even know if other threads are executing the code we need to patch.

I thought java can handle that scenario in usual (non W^X systems) just fine, so we just believe jvm did everything right and it's safe to rewrite some code at specific moment.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Daniel D.Daugherty
On Wed, 3 Feb 2021 20:08:28 GMT, Mikael Vidstedt <[hidden email]> wrote:

> Out of curiosity, have you looked at the performance of the W^X state transition?

Earlier it was possible to disable W^X protection (unfortunately, I don't know a way to do this now). We compared Renaissance results with W^X transitions like the proposed one vs. no transitions with the protection disabled once. Results were identical for a small and large number of iterations.

From the other hand, I've used https://github.com/AntonKozlov/macos-aarch64-transition-bench to estimate the cost of the transition.
I re-did measurements with the current implementation and on consumer hardware:

testJNI            thrpt   25   277997000.151 ±   4095685.956  ops/s
testJniNanoTime    thrpt   25    17851098.010 ±    119489.599  ops/s
testNanoTime       thrpt   25    78007491.762 ±    628455.971  ops/s
testNothing        thrpt   25  1724298829.088 ± 100537565.068  ops/s
testTwoStateAndWX  thrpt   25    21958839.057 ±    210490.755  ops/s
testWX             thrpt   25    23299813.266 ±    149837.302  ops/s

There is an overhead, but it does not look like blocking the first implementation. I'm not refusing future optimizations like enabling W only when necessary. But for now, I don't have a robust and maintainable solution for this, sorry.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Vladimir Kempik-3
On Thu, 4 Feb 2021 15:13:49 GMT, Anton Kozlov <[hidden email]> wrote:

>> Out of curiosity, have you looked at the performance of the W^X state transition? In particular I'm wondering if the cost is effectively negligible so doing it unconditionally on JVM entry is a no-brainer and just easier/cleaner than the alternatives, or if there are reasons to look at only doing the transition if/when needed (perhaps do it lazily and revert back to X when leaving the JVM?).
>
>> Out of curiosity, have you looked at the performance of the W^X state transition?
>
> Earlier it was possible to disable W^X protection (unfortunately, I don't know a way to do this now). We compared Renaissance results with W^X transitions like the proposed one vs. no transitions with the protection disabled once. Results were identical for a small and large number of iterations.
>
> From the other hand, I've used https://github.com/AntonKozlov/macos-aarch64-transition-bench to estimate the cost of the transition.
> I re-did measurements with the current implementation and on consumer hardware:
>
> testJNI            thrpt   25   277997000.151 ±   4095685.956  ops/s
> testJniNanoTime    thrpt   25    17851098.010 ±    119489.599  ops/s
> testNanoTime       thrpt   25    78007491.762 ±    628455.971  ops/s
> testNothing        thrpt   25  1724298829.088 ± 100537565.068  ops/s
> testTwoStateAndWX  thrpt   25    21958839.057 ±    210490.755  ops/s
> testWX             thrpt   25    23299813.266 ±    149837.302  ops/s
>
> There is an overhead, but it does not look like blocking the first implementation. I'm not refusing future optimizations like enabling W only when necessary. But for now, I don't have a robust and maintainable solution for this, sorry.

> _Mailing list message from [erik.joelsson at oracle.com](mailto:[hidden email]) on [2d-dev](mailto:[hidden email]):_
>
> On 2021-01-26 04:44, Magnus Ihse Bursie wrote:
>
> > On 2021-01-26 13:09, Vladimir Kempik wrote:
> > > On Tue, 26 Jan 2021 12:02:02 GMT, Alan Hayward
> > > <github.com+4146708+a74nh at openjdk.org> wrote:
> > > > AIUI, the configure line needs passing a prebuilt
> > > > JavaNativeFoundation framework
> > > > ie:
> > > > `--with-extra-ldflags='-F
> > > > /Applications/Xcode.app/Contents/SharedFrameworks/ContentDeliveryServices.framework/Versions/A/itms/java/Frameworks/'`
> > > > Otherwise there will be missing _JNFNative* functions.
> > > > Is this the long term plan? Or will eventually the required code be
> > > > moved into JDK and/or the xcode one automatically get picked up by
> > > > the configure scripts?
> > > > There is ongoing work by P. Race to eliminate dependence on JNF at all
> > > > How far has that work come? Otherwise the logic should be added to
> > > > configure to look for this framework automatically, and provide a way
> > > > to override it/set it if not found.
> >
> >
> > I don't think it's OK to publish a new port that cannot build
> > out-of-the-box without hacks like this.
>
> My understanding is that Apple chose to not provide JNF for aarch64, so
> if you want to build OpenJDK, you first need to build JNF yourself (it's
> available in github). Phil is working on removing this dependency
> completely, which will solve this issue [1].
>
> In the meantime, I don't think we should rely on finding JNF in
> unsupported locations inside Xcode. Could we wait with integrating this
> port until it can be built without such hacks? If not, then adding
> something in the documentation on how to get a working JNF would at
> least be needed.
>
> /Erik
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8257852

This doesn't seem to be an issue anymore, After P.Race have finished with JDK-8257852, Macarm port can be build without extra ld flags. J2Demo works fine as example.
This can be checked if one merges pull/2200 branch into his local copy of master branch.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Daniel D. Daugherty
In reply to this post by Daniel D.Daugherty
On 2/5/21 4:51 AM, Magnus Ihse Bursie wrote:

> On Tue, 2 Feb 2021 21:20:52 GMT, Daniel D. Daugherty <[hidden email]> wrote:
>
>>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>>
>>>    support macos_aarch64 in hsdis
>> make/autoconf/flags.m4 line 140:
>>
>>> 138:     else
>>> 139:       MACOSX_VERSION_MIN=10.12.0
>>> 140:     fi
>> Not something that needs to be addressed here, but these changes
>> illustrate that our collective use of macOSX/MACOSX/MacOSX names
>> are tied to the fact that the macOS major version number was at 10
>> for a very long time.
>>
>> @magicus - Do we have an RFE to rename MACOSX or are we sticking
>> with it and evolving our interpretation of the 'X' from '10' to */splat/asterik?
> @dcubed-ojdk There is no RFE to renaming "macosx" to "macos". I'm not sure it should be done. We can't follow all marketing trends (Apple recently renamed iOS to iPadOS for the iPad; we can't keep adapting to such schemes). Personally, I like the new name without the "x", but we had already spent some time trying to find and fix all (or at least, most) instances of "osx" in the code, that I don't really think it's worth the effort.
>
> If you can drill up enough enthusiasm for such a project, and get any objections down to minimum, I can help implementing it. But I won't be spearheading it.
>
>> make/common/NativeCompilation.gmk line 1178:
>>
>>> 1176:                 endif
>>> 1177:                 # This only works if the openjdk_codesign identity is present on the system. Let
>>> 1178:                 # silently fail otherwise.
>> Might want to add a comment here:
>>      #  The '-f' option will replace an existing signature if one exists.
> We're not really in the habit of adding comments for various command line options. Normally, you can check these with "man" if you are uncertain. If they do something surprising, sure, but here it's more of a "it's needed on aarch64 to work at all", so I don't think a comment will be anything but added clutter.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/2200

@magicus - I'm good with both of these answers. I personally like 'macosx'.

Dan
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Gerard Ziemski-2
On Wed, 3 Feb 2021 23:29:30 GMT, Gerard Ziemski <[hidden email]> wrote:

>> using ` ```c ` https://docs.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks
>>
>> I was wrong about `SIGFPE` / `EXC_MASK_ARITHMETIC`, it's used on i386, x86_64:
>> https://github.com/openjdk/jdk/blob/2be60e37e0e433141b2e3d3e32f8e638a4888e3a/src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp#L467-L524
>> and aarch64:
>> https://github.com/AntonKozlov/jdk/blob/80827176cbc5f0dd26003cf234a8076f3f557928/src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp#L309-L323
>> (What happened with the formatting here, ugh?)
>>
>> Your suggestion sounds good otherwise. @AntonKozlov, do you mind to integrate that?
>
> So it should be:
>
> #if defined(__APPLE__)
>   // lldb (gdb) installs both standard BSD signal handlers, and mach exception
>   // handlers. By replacing the existing task exception handler, we disable lldb's mach
>   // exception handling, while leaving the standard BSD signal handlers functional.
>   //
>   // EXC_MASK_BAD_ACCESS needed by all architectures for NULL ptr checking
>   // EXC_MASK_ARITHMETIC needed by all architectures for div by 0 checking
>   // EXC_MASK_BAD_INSTRUCTION needed by aarch64 to initiate deoptimization
>   kern_return_t kr;
>   kr = task_set_exception_ports(mach_task_self(),
>                                 EXC_MASK_BAD_ACCESS | EXC_MASK_ARITHMETIC
>                                 AARCH64_ONLY(| EXC_MASK_BAD_INSTRUCTION),
>                                 MACH_PORT_NULL,
>                                 EXCEPTION_STATE_IDENTITY,
>                                 MACHINE_THREAD_STATE);
>
>   assert(kr == KERN_SUCCESS, "could not set mach task signal handler");
> #endif

Thanks! I've updated the PR with this code, with extra indentation of `AARCH64_ONLY(...)` line, since this is continuation of the first parameter.

I'll fix the formatting in os_bsd_arch64.cpp along other changes to `bsd_aarch64` directory

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Gerard Ziemski-2
On Tue, 2 Feb 2021 18:35:51 GMT, Gerard Ziemski <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/cpu/aarch64/vm_version_aarch64.hpp line 93:
>
>> 91:     CPU_MARVELL   = 'V',
>> 92:     CPU_INTEL     = 'i',
>> 93:     CPU_APPLE     = 'a',
>
> The `ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile` has 8538 pages, can we be more specific and point to the particular section of the document where the CPU codes are defined?

They are defined in 13.2.95. MIDR_EL1, Main ID Register. Apple's code is not there, but "Arm can assign codes that are not published in this manual. All values not assigned by Arm are reserved and must not be used.". I assume the value was obtained by digging around https://github.com/apple/darwin-xnu/blob/main/osfmk/arm/cpuid.h#L62

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Vladimir Kempik-3
In reply to this post by Daniel D.Daugherty
On Tue, 2 Feb 2021 22:07:15 GMT, Daniel D. Daugherty <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 195:
>
>> 193: frame os::get_sender_for_C_frame(frame* fr) {
>> 194:   return frame(fr->link(), fr->link(), fr->sender_pc());
>> 195: }
>
> Is this file going to be built by GCC or just macOS compilers?

there is no support for compiling java with gcc on macos since about jdk11, only clang.
considering this and the absence of gcc for macos_m1, the answer is - just macOS compilers.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Florian Weimer-4
In reply to this post by Daniel D.Daugherty
On Fri, 12 Feb 2021 12:22:44 GMT, Vladimir Kempik <[hidden email]> wrote:

>> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 435:
>>
>>> 433: //    |                        |\  Java thread created by VM does not have glibc
>>> 434: //    |    glibc guard page    | - guard, attached Java thread usually has
>>> 435: //    |                        |/  1 glibc guard page.
>>
>> Is this code going to be built by GCC (with glibc) or will only
>> macOS compilers and libraries be used?
>
> only macos comiplers

The comment is also wrong for glibc: The AArch64 ABI requires a 64 KiB guard region independently of page size, otherwise `-fstack-clash-protection` is not reliable.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Vladimir Kempik-3
In reply to this post by Daniel D.Daugherty
On Tue, 2 Feb 2021 22:14:42 GMT, Daniel D. Daugherty <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 486:
>
>> 484:         }
>> 485:       }
>> 486:     }
>
> This appears to be a mix for Mavericks (10.9) and 10.12
> work arounds. Is this code needed by this project?

I wasn't able to replicate JDK-8020753 and JDK-8186286. So will remove these workaround
Gerard, 8020753 was originally your fix, do you know if it still needed on intel-mac ?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Daniel D.Daugherty
On Wed, 3 Feb 2021 09:11:50 GMT, Andrew Haley <[hidden email]> wrote:

>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 323:
>>
>>> 321:   str(zr, Address(rthread, JavaThread::last_Java_pc_offset()));
>>> 322:
>>> 323:   str(zr, Address(rthread, JavaFrameAnchor::saved_fp_address_offset()));
>>
>> I don't think this switch from `JavaThread::saved_fp_address_offset()`
>> to `JavaFrameAnchor::saved_fp_address_offset()` is correct since
>> `rthread` is still used and is a JavaThread*. The new code will give you:
>>
>>     `rthread` + offset of the `saved_fp_address` field in a JavaFrameAnchor
>>
>> The old code gave you:
>>
>>     `rthread` + offset of the `saved_fp_address` field in the JavaFrameAnchor field in the JavaThread
>>
>> Those are not the same things.
>
> I agree, I don't understand why this change was made.

Wow, this is scary. I don't understand how I've merged JDK-8257882 like this. I've reviewed cpu/aarch64 changes again, there is nothing suspicious besides this. Thank you very much for catching, fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Vladimir Kempik-3
In reply to this post by Daniel D.Daugherty
On Tue, 2 Feb 2021 22:08:14 GMT, Daniel D. Daugherty <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 221:
>
>> 219:     assert(sig == info->si_signo, "bad siginfo");
>> 220:   }
>> 221: */
>
> Should this code be deleted? From here and from where it was copied
> from if it is also commented out there...

Thanks, will fix in bsd_aarch64 soon, as for bsd_x86 I've filled new bug and pr - https://github.com/openjdk/jdk/pull/2547

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Bernhard Urban-Forster-2
In reply to this post by Daniel D.Daugherty
On Fri, 12 Feb 2021 12:22:09 GMT, Vladimir Kempik <[hidden email]> wrote:

>> Where did this come from - some snippet/example/tech note code? Maybe other people can help figure it out if we provide more info.
>
> This is the version of w^x on-demand switch implemented by microsoft guys. This is enabled only for debug builds.
> @lewurm could you comment here please

Those values can be observed in the debugger, but aren't documented or defined in header files.

This mode was useful for the initial bootstrap of the platform (it helped with missing W^X transitions), but shouldn't be required anymore today. I'm fine with removing it altogether.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
Reply | Threaded
Open this post in threaded view
|

Re: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]

Anton Kozlov-2
In reply to this post by Daniel D.Daugherty
On Tue, 2 Feb 2021 22:42:22 GMT, Daniel D. Daugherty <[hidden email]> wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   support macos_aarch64 in hsdis
>
> src/hotspot/share/logging/logStream.hpp line 35:
>
>> 33: class LogStream : public outputStream {
>> 34:   // see test/hotspot/gtest/logging/test_logStream.cpp
>> 35:   friend class LogStreamTest;
>
> It's not clear why this change is made for this port.

This was done for previous implementation of W^X, for gtests be able to access this test. This not required anymore, this hunk was reverted.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200
12