[RFC] Baremetal target triple normalization

A recent effort [0] to improve the behaviour of a toolchain configured with a non-normalized target triple has exposed that the normalized triple for baremetal targets is in the wrong order. This breaks various build scripts and assumptions in the wild.

For example, normalizing aarch64-none-elf gives the triple aarch64-none-unknown-elf. However, given the form <arch>-<vendor>-<os>-<env>, it’s the OS which is known to be ‘none’ in this case, not the vendor.

Normalized triples of the incorrect form have been emitted from clang for a while so they’ve found their way into various places, for example tests in llvm-project, and ultimately build scripts in the wild.

There is a proposed fix at [1] which reinstates the correct order if ‘none’ is detected. This means that a toolchain built with the triple specified as aarch64-none-elf will do the right thing, and this suits Arm’s uses of "OS = none" in baremetal targets. There may be further fallout to build scripts whilst this lands.

We don’t know how this may be used in the wild by other targets though. It’s expected that targets such as riscv-unknown-elf would retain their normalization as riscv-unknown-unknown-elf after these changes. Only targets with “none” in the name are affected.
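
To make the difference concrete, here is a toy Python model of the two behaviours for the three-component case described above. It is only a sketch under stated assumptions: the function names are made up for illustration, this is not how Triple::normalize is implemented, and it only covers three-component triples whose middle component is not a recognised OS name (the case under discussion here).

```python
def normalize_old(triple: str) -> str:
    """Toy model of the current behaviour for <arch>-<x>-<env> triples.

    Assumption: the middle component is kept in the vendor slot and the
    OS slot is filled with 'unknown'.
    """
    parts = triple.split("-")
    if len(parts) != 3:
        return triple  # other shapes are out of scope for this sketch
    arch, middle, env = parts
    return f"{arch}-{middle}-unknown-{env}"


def normalize_proposed(triple: str) -> str:
    """Toy model of the behaviour after [1]: 'none' is treated as the OS."""
    parts = triple.split("-")
    if len(parts) != 3:
        return triple  # other shapes are out of scope for this sketch
    arch, middle, env = parts
    if middle == "none":
        # 'none' means "no operating system", so the vendor is the unknown part.
        return f"{arch}-unknown-none-{env}"
    return f"{arch}-{middle}-unknown-{env}"
```

In this model only the triple with "none" in the middle changes between the two functions; a triple such as riscv-unknown-elf gives the same result from both.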

Downstream at Arm we’ve had to make a handful of changes to build scripts and cmake files to account for the triple normalization introduced in [0], and we’d have to do the same again for [1].

In light of this, I propose three options.

  1. Land [1] and update triple normalization to flip -none-unknown- to -unknown-none-
    • Consequences: not fully known until this reaches the builders and users.
  2. A) Revert [0] B) Attempt to fix triple normalization C) Once normalization has been established, reland [0].
  3. Limit normalization (edit: of -none-unknown- to -unknown-none-) to Arm architectures.
    • Consequences: The normalization of triples would be different across architectures. This may be the safest thing with respect to avoiding breakage of other architectures, at the expense of having more circumstance-specific quirks.

I’m currently leaning towards (2). Option (3) seems to lead to cross architecture quirks which would be good to avoid.

Comments welcomed.

[0] CompilerRT: Normalize COMPILER_RT_DEFAULT_TARGET_TRIPLE by wzssyqa · Pull Request #89234 · llvm/llvm-project · GitHub
[1] Triple::normalize: Set OS for 3-component triple with none as middle by wzssyqa · Pull Request #89638 · llvm/llvm-project · GitHub

I don’t think reverting [0] is a good idea, since it fixes a bug in configurations that are even more widely used. Without [0], clang will fail to find libclang_rt if LLVM is configured with a non-normalized triple.

And in fact, it is quite easy to fix Triple::normalize.
If you decide that aarch64-none-elf should normalize to itself, that won’t be difficult either.

Your efforts to improve the build process are appreciated; it’s understood that this fixes a long-standing issue that is desirable to fix. Reverting in reaction to discovering fallout (and relanding once accounted for) is an option.

Yes, fixing normalize is easy in isolation. The issue is whether there is a knock on impact for other users (and targets) in the wild.

If the issue has been present for a while, is it necessary to rush a fix once brokenness is revealed?

Changing the build process in this manner caused several independent build scripts to fail; I understand this to be a common way for these toolchains to be configured. At this point I understand we’d be happy with fixing it, and would even accept one more round of breakage to make it correct.

However, the concern is one of doing this without first asking the community and giving them a chance to react. It’s possible that fixing the triple will break other builders or users and we hope to at least give some warning and the chance to course correct if necessary.


@wzssyqa could you summarise the high level goal of your changes? Perhaps you can link to an earlier PR or comment you already wrote or link to a build that’s been fixed by your change, because Peter and myself are coming into this mid-stream so it’s hard to get the full picture.

From what I guess, this change might actually make our toolchains simpler in the long term by removing the need to symlink various paths. So if that’s the case I’m quite excited to see it done.

I started working on this because Debian uses 3-component triples, which have no vendor component.
So, if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu, the libraries will be installed into lib/aarch64-linux-gnu, but clang searches for them in lib/aarch64-unknown-linux-gnu.

Another goal of the new style is that it will be useful if we want to do something like Debian’s multiarch.
For example, Debian’s armel port uses arm-linux-gnueabi, and armhf uses arm-linux-gnueabihf. If we use the old style, we will have file conflicts.

Yes, it may break some users in the wild, but if we just revert the patch, we will never find them.

If we do break them, we can help fix it when they report it in the future.
And in fact, if the user is using -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR, we won’t break them.

Of course, we have another choice: make clang always look for libraries in lib/<LLVM_DEFAULT_TARGET_TRIPLE>.
But that would make things more complex.

So, the final question is: should we try to standardize the triples,
or just leave them in the current chaos?
GNU has a tool called config.sub; should we have a similar one?

To rephrase, I think what @wzssyqa tries to say here, is that when doing builds with LLVM_ENABLE_PER_TARGET_RUNTIME_DIR enabled, a longstanding issue has been that unless triples are specified in the exact right format, Clang won’t find the libraries, as Clang looks in a directory with normalized triples, like lib/aarch64-unknown-linux-gnu.

Therefore, patches have been merged to normalize triples in the places where they are used for the install directory, when used for LLVM_ENABLE_PER_TARGET_RUNTIME_DIR. (In practice, the normalization has probably extended a bit further than that, but as far as I know, the actual intended effect has been primarily to normalize the triples as they are used for the install directory together with LLVM_ENABLE_PER_TARGET_RUNTIME_DIR.)

In general, yes, using LLVM_ENABLE_PER_TARGET_RUNTIME_DIR can help for that, but if you’re speaking about the compiler-rt libraries with arch suffix, the example you bring up is not actually an issue there. For the old style compiler-rt library names, it actually uses different suffixes based on the float ABI: llvm-project/clang/lib/Driver/ToolChain.cpp at llvmorg-18.1.4 · llvm/llvm-project · GitHub

Nevertheless, I’m sure the LLVM_ENABLE_PER_TARGET_RUNTIME_DIR layout can have many benefits in various situations, but this specific example does already work before that layout.

My understanding of baremetal development might be limited, but my preference is to land [1].
If normalizing aarch64-none-elf to aarch64-unknown-none-elf instead of aarch64-none-unknown-elf moves in the right direction,
let’s do it. Accruing special cases/workarounds might lead to more pain in the long term.

The number of none-unknown test changes actually looks manageable.
The worst case is that we notice some backward compatibility story that we don’t account for today,
and we add temporary workarounds for clang --target=aarch64-none-elf to probe lib/aarch64-none-unknown-elf (old LLVM_ENABLE_PER_TARGET_RUNTIME_DIR hierarchy)
beside lib/aarch64-unknown-none-elf.
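
As an illustration of that fallback idea, here is a rough Python sketch of probing the new directory first and the old spelling second. This is not clang driver code; the helper names (legacy_spelling, candidate_runtime_dirs, find_runtime_dir) are hypothetical and only model the lookup order for the *-unknown-none-* case.

```python
from pathlib import Path
from typing import List, Optional


def legacy_spelling(normalized: str) -> Optional[str]:
    """Return the old <arch>-none-unknown-<env> spelling, if one applies."""
    parts = normalized.split("-")
    if len(parts) == 4 and parts[1] == "unknown" and parts[2] == "none":
        return "-".join([parts[0], "none", "unknown", parts[3]])
    return None  # triple is not affected by the normalization change


def candidate_runtime_dirs(lib_root: str, normalized: str) -> List[Path]:
    """List directories to probe, preferring the new normalized spelling."""
    candidates = [Path(lib_root) / normalized]
    legacy = legacy_spelling(normalized)
    if legacy is not None:
        candidates.append(Path(lib_root) / legacy)
    return candidates


def find_runtime_dir(lib_root: str, normalized: str) -> Optional[Path]:
    """Return the first candidate directory that exists, or None."""
    for candidate in candidate_runtime_dirs(lib_root, normalized):
        if candidate.is_dir():
            return candidate
    return None
```

Putting the new spelling first means an installation that has both directories picks the new one, and the legacy entry can be dropped later without changing the result for up-to-date installs.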


The description of [1] Triple::normalize: Set OS for 3-component triple with none as middle is difficult to parse.
But from looking at the test changes, the change looks desired.

This may have to be incremental, and if there are backward compatibility concerns we need to account for them.


I have thought about how to support Debian-style multiarch <arch>-<os>-<env> (no <vendor>):
https://reviews.llvm.org/D110663
My conclusion is that we should use normalized target triples for runtime libraries (e.g. /tmp/Rel/lib/clang/19/lib/x86_64-unknown-linux-gnu/ /tmp/Rel/lib/x86_64-unknown-linux-gnu/) but have some probing code to support GCC installations with un-normalized triples (e.g. /usr/lib/gcc-cross/aarch64-linux-gnu/).

Thanks for the input. I’m now leaning towards landing [1] (option 1) as well. I propose we wait until Wed 1st May (one week from RFC) to land the patch to give a chance for any other comments to come in.


[0] has now landed and has a release note in [1]:

The normalization of 3 element target triples where -none- is the middle element has changed. For example, armv7m-none-eabi previously normalized to armv7m-none-unknown-eabi, with none for the vendor and unknown for the operating system. It now normalizes to armv7m-unknown-none-eabi, which has unknown vendor and none operating system.

The affected triples are primarily for bare metal Arm where it is intended that none means that there is no operating system. As opposed to an unknown type of operating system.

This change may cause clang to not find libraries, or libraries to be built at different file system locations. This can be fixed by changing your builds to use the new normalized triple. However, we recommend instead getting the normalized triple from clang itself, as this will make your builds more robust in case of future changes:

$ clang --target=<your target triple> -print-target-triple
<the normalized target triple>
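
For build scripts written in Python, the same query could be wrapped roughly as follows. This is a minimal sketch: the function name normalized_triple and its clang and run parameters are my own choices for illustration, and it assumes a clang binary is reachable under the given name.

```python
import subprocess


def normalized_triple(target: str, clang: str = "clang", run=subprocess.run) -> str:
    """Ask clang for the normalized form of `target`.

    `run` is injectable so the function can be exercised without clang
    installed; by default it is subprocess.run.
    """
    result = run(
        [clang, f"--target={target}", "-print-target-triple"],
        check=True,           # raise if clang exits with an error
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout as str
    )
    return result.stdout.strip()
```

Calling normalized_triple("armv7m-none-eabi") then returns whatever the installed clang considers the normalized triple, so the script does not need to hard-code either spelling.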

Thanks for the input everyone.

[0] 3-component triple *-none-* is incorrectly normalized to *-none-unknown-* instead of *-unknown-none-* · Issue #89582 · llvm/llvm-project · GitHub
[1] [clang][Docs] Add release note for {target}-none-{environment} triple normalization changes by DavidSpickett · Pull Request #90734 · llvm/llvm-project · GitHub.
