[RFC] Make "requires arm" also include AArch64 in module map files

Hi all,

While reviewing http://reviews.llvm.org/D10423, I ran into a problem with our _Builtin_Intrinsics module map file. It contains a submodule for arm intrinsics:

explicit module arm {
requires arm

This doesn’t work as intended, because today only 32bit arm targets publish the “arm” target feature, but this module is supposed to be used on AArch64 as well (e.g. for the arm_neon.h header). The target features that we can use in a module “requires” declaration are controlled by TargetInfo::hasFeature(). Only module map parsing currently checks for the arm feature.

I propose that we rename the current “arm” feature to “arm32” and take over the name “arm” to mean “arm32 || aarch64”.

Pros

  • Mirrors the x86/x86_32/x86_64 feature names and functionality
  • Anecdotally, this better matches user expectation of what “requires arm” means

Cons

  • This is a backwards-incompatible change if any modules rely on “requires arm” failing on aarch64 targets.

Would this break existing modules for anyone who cannot just add a “requires !aarch64” to their module map? Or does anyone feel strongly that this is the wrong solution?

The other possibilities I considered, but felt were worse solutions:
(a) Keep the meaning of “arm” and invent a new feature to mean “arm || aarch64”. Mostly I felt this left the existing name unnecessarily confusing.
(b) accepting the state of the compiler and duplicating the “arm” submodule. In theory, isBetterKnownHeader will allow this to “just work” for #includes, but users would have to spell their @imports differently for different targets, or
(c) adding some general feature to spell disjunctions in module requires decls (this didn’t seem worth it since I can’t think of any other users for such a feature).

Ben

Hi all,

While reviewing http://reviews.llvm.org/D10423, I ran into a problem with our _Builtin_Intrinsics module map file. It contains a submodule for arm intrinsics:

explicit module arm {
requires arm

This doesn’t work as intended, because today only 32bit arm targets publish the “arm” target feature, but this module is supposed to be used on AArch64 as well (e.g. for the arm_neon.h header). The target features that we can use in a module “requires” declaration are controlled by TargetInfo::hasFeature(). Only module map parsing currently checks for the arm feature.

I propose that we rename the current “arm” feature to “arm32” and take over the name “arm” to mean “arm32 || aarch64”.

Pros

  • Mirrors the x86/x86_32/x86_64 feature names and functionality
  • Anecdotally, this better matches user expectation of what “requires arm” means

Cons

  • This is a backwards-incompatible change if any modules rely on “requires arm” failing on aarch64 targets.

Would this break existing modules for anyone who cannot just add a “requires !aarch64” to their module map? Or does anyone feel strongly that this is the wrong solution?

I agree that this is the right solution, assuming it doesn’t break anyone.

>
> Hi all,
>
> While reviewing http://reviews.llvm.org/D10423, I ran into a problem
with our _Builtin_Intrinsics module map file. It contains a submodule for
arm intrinsics:
>
> explicit module arm {
> requires arm
>
> This doesn’t work as intended, because today only 32bit arm targets
publish the “arm” target feature, but this module is supposed to be used on
AArch64 as well (e.g. for the arm_neon.h header). The target features that
we can use in a module “requires” declaration are controlled by
TargetInfo::hasFeature(). Only module map parsing currently checks for the
arm feature.
>
> I propose that we rename the current “arm” feature to “arm32” and take
over the name “arm” to mean “arm32 || aarch64”.
>
> Pros
> * Mirrors the x86/x86_32/x86_64 feature names and functionality
> * Anecdotally, this better matches user expectation of what “requires
arm” means
>
> Cons
> * This is a backwards-incompatible change if any modules rely on
“requires arm” failing on aarch64 targets.
>
> Would this break existing modules for anyone who cannot just add a
“requires !aarch64” to their module map? Or does anyone feel strongly that
this is the wrong solution?

I agree that this is the right solution, assuming it doesn't break anyone.

Agreed. Realistically I don't expect anybody besides Apple to be shipping
modules in a situation where ambiguity with aarch64 could be an issue, so
the breakage (if any) should hopefully be limited to apple systems Ben can
test.

-- Sean Silva

I don't see anything wrong with this, and being a new thing, we have
some degrees of freedom. But would be good to match what other
toolchains are doing for the similar effect, or have a good reason not
to. Especially related to names and what they mean. We don't want name
nightmares in the future.

cheers,
--renato

I propose that we rename the current “arm” feature to “arm32” and take over
the name “arm” to mean “arm32 || aarch64”.

I don't see anything wrong with this, and being a new thing, we have
some degrees of freedom. But would be good to match what other
toolchains are doing for the similar effect, or have a good reason not
to. Especially related to names and what they mean. We don't want name
nightmares in the future.

I agree. I looked around for some precedent here but didn’t find much. Maybe someone who’s more familiar with this space knows of some?

For “arm”:
* The closest I found is that GCC has a function attribute "target(arm)” that means to use the “A32” ISA, but this is to distinguish it from thumb (T32), not from aarch64/A64 it looks like.
* I don’t think this should stop us from using “arm” to mean arm32 || arm64.

For “arm32”:
* I made this name up, and haven’t found any useful precedent for this or any other name. The ISA is called A32, but I think the T32 ISA would also count as “arm32” in my scheme…
* Since no is actually asking for this feature, I suppose we could just not add it. Users who wanted arm32 could say “requires arm, !aarch64” for now and we could reconsider this once there is motivation

Ben

Hi all,

While reviewing http://reviews.llvm.org/D10423, I ran into a problem with our _Builtin_Intrinsics module map file. It contains a submodule for arm intrinsics:

explicit module arm {
requires arm

This doesn’t work as intended, because today only 32bit arm targets publish the “arm” target feature, but this module is supposed to be used on AArch64 as well (e.g. for the arm_neon.h header). The target features that we can use in a module “requires” declaration are controlled by TargetInfo::hasFeature(). Only module map parsing currently checks for the arm feature.

I propose that we rename the current “arm” feature to “arm32” and take over the name “arm” to mean “arm32 || aarch64”.

Pros

  • Mirrors the x86/x86_32/x86_64 feature names and functionality
  • Anecdotally, this better matches user expectation of what “requires arm” means

Cons

  • This is a backwards-incompatible change if any modules rely on “requires arm” failing on aarch64 targets.

Would this break existing modules for anyone who cannot just add a “requires !aarch64” to their module map? Or does anyone feel strongly that this is the wrong solution?

I agree that this is the right solution, assuming it doesn’t break anyone.

Agreed. Realistically I don’t expect anybody besides Apple to be shipping modules in a situation where ambiguity with aarch64 could be an issue, so the breakage (if any) should hopefully be limited to apple systems Ben can test.

The Darwin SDKs don’t use the “arm” requirement anywhere. The only case I’ve come across at all is the one in the clang builtin modules.

>
> Hi all,
>
> While reviewing http://reviews.llvm.org/D10423, I ran into a problem
with our _Builtin_Intrinsics module map file. It contains a submodule for
arm intrinsics:
>
> explicit module arm {
> requires arm
>
> This doesn’t work as intended, because today only 32bit arm targets
publish the “arm” target feature, but this module is supposed to be used on
AArch64 as well (e.g. for the arm_neon.h header). The target features that
we can use in a module “requires” declaration are controlled by
TargetInfo::hasFeature(). Only module map parsing currently checks for the
arm feature.
>
> I propose that we rename the current “arm” feature to “arm32” and take
over the name “arm” to mean “arm32 || aarch64”.
>
> Pros
> * Mirrors the x86/x86_32/x86_64 feature names and functionality
> * Anecdotally, this better matches user expectation of what “requires
arm” means
>
> Cons
> * This is a backwards-incompatible change if any modules rely on
“requires arm” failing on aarch64 targets.
>
> Would this break existing modules for anyone who cannot just add a
“requires !aarch64” to their module map? Or does anyone feel strongly that
this is the wrong solution?

I agree that this is the right solution, assuming it doesn't break anyone.

Agreed. Realistically I don't expect anybody besides Apple to be shipping
modules in a situation where ambiguity with aarch64 could be an issue, so
the breakage (if any) should hopefully be limited to apple systems Ben can
test.

The Darwin SDKs don’t use the “arm” requirement anywhere. The only case
I’ve come across at all is the one in the clang builtin modules.

Ok, then since the builtin module will be in sync with its corresponding
clang, we should not have any compatibility concern here.

-- Sean Silva

For “arm”:
* The closest I found is that GCC has a function attribute "target(arm)” that means to use the “A32” ISA, but this is to distinguish it from thumb (T32), not from aarch64/A64 it looks like.
* I don’t think this should stop us from using “arm” to mean arm32 || arm64.

"arm" has been used to mean the architecture, as well as the
instruction set. Since Thumb operates on the same registers, flags and
can inter-operate with ARM, including Thumb in "arm" is reasonable.
Even v8 AArch32, which has some small differences, is an acceptable
match for "arm". So much so that they're all in the same back-end on
all compilers.

But AArch64 is quite a different beast. They're in completely separate
worlds. Even though AArch64 has a compatibility mode (AArch32), it may
not be usable if the system is not setup properly (4K pages instead of
64K pages). So, if joining both in the same modules means sometimes
AArch32 code will be selected during 64-bit execution, this may break
many things. The other way around will *never* work.

I don't think we should get "arm" to mean AArch64.

For “arm32”:
* I made this name up, and haven’t found any useful precedent for this or any other name. The ISA is called A32, but I think the T32 ISA would also count as “arm32” in my scheme…

I'm not a big fan of "arm32" either, probably because I don't think
"arm" should ever mean AArch64. I'd recommend you have a good talk
with the ARM guys before any naming decision.

* Since no is actually asking for this feature, I suppose we could just not add it. Users who wanted arm32 could say “requires arm, !aarch64” for now and we could reconsider this once there is motivation

Expectation vs reality. When I "import arm", I expect to get ARM,
maybe even Thumb, but never AArch64.

cheers,
-renato

For “arm”:
* The closest I found is that GCC has a function attribute "target(arm)” that means to use the “A32” ISA, but this is to distinguish it from thumb (T32), not from aarch64/A64 it looks like.
* I don’t think this should stop us from using “arm” to mean arm32 || arm64.

"arm" has been used to mean the architecture, as well as the
instruction set. Since Thumb operates on the same registers, flags and
can inter-operate with ARM, including Thumb in "arm" is reasonable.
Even v8 AArch32, which has some small differences, is an acceptable
match for "arm". So much so that they're all in the same back-end on
all compilers.

But AArch64 is quite a different beast. They're in completely separate
worlds. Even though AArch64 has a compatibility mode (AArch32), it may
not be usable if the system is not setup properly (4K pages instead of
64K pages). So, if joining both in the same modules means sometimes
AArch32 code will be selected during 64-bit execution, this may break
many things. The other way around will *never* work.

I don't think we should get "arm" to mean AArch64.

For “arm32”:
* I made this name up, and haven’t found any useful precedent for this or any other name. The ISA is called A32, but I think the T32 ISA would also count as “arm32” in my scheme…

I'm not a big fan of "arm32" either, probably because I don't think
"arm" should ever mean AArch64. I'd recommend you have a good talk
with the ARM guys before any naming decision.

+ Tim

Do you have any thoughts on the specific target feature names for module maps I’m proposing (changing “arm” to include AArch64 and adding “arm32”)? As I’ve said, my own (naive) expectation was that “arm” would include “AArch64”, and I was surprised when that wasn’t the case.

I agree, "arm" is the one I was most happy with. What we're doing is based on the ARM ACLE document, which covers both AArch32 and AArch64. The underlying architectures are different, but the whole point of this union (and the modules it's being used to implement) is that the programmer shouldn't have to care. It's also the preferred catch-all term by ARM the company. There's just no viable alternative that I can see for the union.

For me, it gets murky when we think about the 32-bit and 64-bit only variants:

  + arm/arm64 and arm/aarch64 are out because of the top-level choice.
  + aarch32/aarch64 would probably be ideal for me, but aarch32 hasn't really caught on elsewhere
  + arm32/arm64 is also possible, especially as Apple are going to be the main consumers.

Anyway, that's my preferred shade of fuschia. Orange spots are also an option.

Cheers.

Tim.

Do you have any thoughts on the specific target feature names for module maps I’m proposing (changing “arm” to include AArch64 and adding “arm32”)? As I’ve said, my own (naive) expectation was that “arm” would include “AArch64”, and I was surprised when that wasn’t the case.

I agree, "arm" is the one I was most happy with. What we're doing is based on the ARM ACLE document, which covers both AArch32 and AArch64. The underlying architectures are different, but the whole point of this union (and the modules it's being used to implement) is that the programmer shouldn't have to care. It's also the preferred catch-all term by ARM the company. There's just no viable alternative that I can see for the union.

For me, it gets murky when we think about the 32-bit and 64-bit only variants:

+ arm/arm64 and arm/aarch64 are out because of the top-level choice.
+ aarch32/aarch64 would probably be ideal for me, but aarch32 hasn't really caught on elsewhere
+ arm32/arm64 is also possible, especially as Apple are going to be the main consumers.

FWIW, we currently have both “arm64” and “aarch64” for the 64 bit variant, so either or both of “arm32”, “aarch32” would fit. Maybe we’d be better off with just “aarch32” so that we avoid introducing a name that isn’t really used anywhere else (arm32).

Ben

Hi Ben,

I prefer aarch32 to refer to the 32-bit state, as that is it's defined
meaning in the ARM glossary, see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.aeg0014g/ABCDEFGH.html:

"""
AArch32 state:

The ARM 32-bit Execution state that uses 32-bit general purpose registers, and a 32-bit program counter (PC), stack pointer (SP), and link register (LR). AArch32 Execution state provides a choice of two instruction sets, A32 and T32.
In implementations of versions of the ARM architecture before ARMv8, and in the ARM R and M architecture profiles, execution is always in AArch32 state.
"""

In contrast, arm32 doesn't have a definition like that.

Thanks,

Kristof

Thanks for all the feedback everyone. It sounds like we should go with “aarch32” for the 32-bit arm variants and “arm” = aarch32 || aarch64. I’ll probably land this in the next couple of days.

Ben