Triple quirks for finding runtimes - how to improve this?

Currently, the builtins build will emit static libs into per-target directories that are spelled exactly as they are passed to the build (i.e. no canonicalization):

❯ find . -iname '*builtins*a'
./lib/clang/13.0.1/lib/armv7a-unknown-linux-androideabi21/libclang_rt.builtins.a
./lib/clang/13.0.1/lib/x86_64-plex-linux-musl/libclang_rt.builtins.a
./lib/clang/13.0.1/lib/aarch64-plex-linux-musl/libclang_rt.builtins.a
./lib/clang/13.0.1/lib/aarch64-unknown-linux-android21/libclang_rt.builtins.a

However, the runtimes build relies on the frontend to locate these libraries, and the frontend fails to find some of them, for example:

❯ ./bin/clang++ --rtlib=compiler-rt -print-libgcc-file-name --target=armv7a-unknown-linux-androideabi21
(..snip..)/lib/clang/13.0.1/lib/linux/libclang_rt.builtins-arm-android.a

This is because the above will use the canonicalized form of the target triple and that eventually hits this code:

  // Special case logic goes here.  At this point Arch, Vendor and OS have the
  // correct values for the computed components.
  std::string NormalizedEnvironment;
  if (Environment == Triple::Android && Components[3].startswith("androideabi")) {
    StringRef AndroidVersion = Components[3].drop_front(strlen("androideabi"));
    if (AndroidVersion.empty()) {
      Components[3] = "android";
    } else {
      NormalizedEnvironment = Twine("android", AndroidVersion).str();
      Components[3] = NormalizedEnvironment;
    }
  }

Because of this, the driver looks for the builtins lib under the canonicalized triple and won’t find it, so it falls back to the old layout/name for the lib. That path doesn’t exist either, because the libs were actually created at paths containing the complete triples as they were passed to the build.
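To make the mismatch concrete, here is a minimal standalone sketch of just the androideabi rewrite quoted above, using only the standard library. The helper names `splitTriple` and `dropAndroidEabi` are made up for illustration and are not part of LLVM. The real code also checks the parsed `Environment` and performs the rest of the triple normalization, which this sketch leaves out.

```cpp
#include <string>
#include <vector>

// Split a triple such as "armv7a-unknown-linux-androideabi21" on '-'.
std::vector<std::string> splitTriple(const std::string &triple) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type dash = triple.find('-', start);
    if (dash == std::string::npos) {
      parts.push_back(triple.substr(start));
      return parts;
    }
    parts.push_back(triple.substr(start, dash - start));
    start = dash + 1;
  }
}

// Hypothetical helper: applies only the "androideabi<ver>" -> "android<ver>"
// rewrite to the fourth (environment) component; everything else is untouched.
std::string dropAndroidEabi(const std::string &triple) {
  std::vector<std::string> parts = splitTriple(triple);
  const std::string prefix = "androideabi";
  if (parts.size() == 4 && parts[3].compare(0, prefix.size(), prefix) == 0)
    parts[3] = "android" + parts[3].substr(prefix.size());

  // Re-join the components with '-'.
  std::string out;
  for (std::string::size_type i = 0; i < parts.size(); ++i) {
    if (i != 0)
      out += '-';
    out += parts[i];
  }
  return out;
}
```

With this, the directory name the build wrote (armv7a-unknown-linux-androideabi21) and the name derived from the rewritten triple (armv7a-unknown-linux-android21) differ, while the other three triples from the `find` output pass through unchanged.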

Dropping eabi seems very intentional, and I guess the idea here is that armv7 on Android only exists with eabi, so spelling it out is redundant. But that breaks the runtimes build for android because of the above difference between actual and computed paths. FWIW, symlinking armv7a-unknown-linux-androideabi21 to armv7a-unknown-linux-android21 makes it output the correct path.

This situation is a bit unfortunate because the error is very confusing (users will only see that the build is looking for lib/clang/13.0.1/lib/linux/libclang_rt.builtins-arm-android.a instead of the right path without any explanation).

I see a couple ways to improve this:

  • provide a way to trace candidate paths, similarly to search paths
  • try the exact triple (without canonicalization) first
  • add a --runtime-target flag that provides the same output, but without canonicalizing the triple (i.e. the old invocation doesn’t change its behavior). This is mentioned in this topic (addressing an adjacent use-case).
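The first two ideas can be sketched together in standalone C++. This is not the driver's actual lookup code: the function names, the candidate order, and the `exists` callback are assumptions chosen so that the logic can be exercised without touching the filesystem.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Candidate locations for the builtins archive, in the order they would be
// tried: exact triple first, then the canonicalized triple, then the legacy
// layout. The ordering is the proposal from the list above, not what Clang
// does today.
std::vector<std::string> builtinsCandidates(const std::string &resourceDir,
                                            const std::string &exactTriple,
                                            const std::string &normalizedTriple,
                                            const std::string &legacyRelPath) {
  const std::string lib = "/libclang_rt.builtins.a";
  std::vector<std::string> candidates;
  candidates.push_back(resourceDir + "/lib/" + exactTriple + lib);
  if (normalizedTriple != exactTriple)
    candidates.push_back(resourceDir + "/lib/" + normalizedTriple + lib);
  candidates.push_back(resourceDir + "/lib/" + legacyRelPath);
  return candidates;
}

// Returns the first candidate for which `exists` is true. If none exists, the
// last (legacy) candidate is returned anyway, which mirrors the confusing
// output described above. Every path that was probed is appended to `tried`
// so that a caller can print a trace.
std::string findBuiltins(const std::vector<std::string> &candidates,
                         const std::function<bool(const std::string &)> &exists,
                         std::vector<std::string> *tried = nullptr) {
  assert(!candidates.empty() && "expected at least one candidate path");
  for (const std::string &path : candidates) {
    if (tried)
      tried->push_back(path);
    if (exists(path))
      return path;
  }
  return candidates.back();
}
```

Printing the contents of `tried` when the lookup fails would already tell users which per-target directories were considered before the legacy path was reported.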

Sorry, I’ve been meaning to reply to this for a bit.

One other option I didn’t see in your post would be to make the LLVM build install the runtimes to the normalized triple path instead of the triple as it was passed. That wouldn’t solve all issues (e.g. the version number problems from Handling version numbers in per-target runtime directories), but it would at least make the install path logic match Clang’s search logic and avoid e.g. the eabi issue you brought up.

Thanks for the context. I’m wondering if it would be helpful to add a utility function to the driver to compute the normalized triple from a given one and rely on that on the CMake side in all cases? i.e.

❯ clang -print-normalized-triple --target=armv7a-unknown-linux-androideabi21
armv7a-unknown-linux-android21

As for the complexity of tracing, it’s probably not too terrible if it’s OK to stop at the first path that is found (I stepped through the code: there are a few paths it tries, and it would be trivial to print each one as it is tried). It might require some refactoring if it were expected to keep printing all possible paths after one was found.

@phosek mentioned something similar about using the normalized triples in https://discourse.llvm.org/t/rfc-time-to-drop-legacy-runtime-paths/64628/12:

Teach the runtimes build to query the paths from Clang rather than constructing them manually. This would ensure that paths match and would let us remove some of the complexity from our build where we duplicate the path construction logic both in CMake and Clang.

I’m not sure what implementation he had in mind for the Clang side.


Actually… there is already -print-target-triple, which does exactly what I suggested above.


This won’t do without some sort of canonicalization:

$ clang -print-target-triple -target amd64-pc-solaris2.11
amd64-pc-solaris2.11
$ clang -print-target-triple -target x86_64-pc-solaris2.11

These are just two different names for the same target, and clang works fine with both. However, you can’t expect compiler-rt to add symlinks for every possible form if you use the output to create the runtime dirs. Perhaps one can use D133407 as a basis for canonicalization?

However, I fully agree that the only viable option is for clang to provide the runtime dir names and compiler-rt to use that, rather than both constantly trying to second-guess each other (and failing).
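As a rough illustration of what such a canonicalization step would have to do for the amd64/x86_64 case, here is a minimal standalone sketch. The function names and the alias table are hypothetical, and the only entry is the one from the Solaris example. It is not what D133407 or `Triple::normalize` implements.

```cpp
#include <map>
#include <string>

// Hypothetical alias table mapping alternative arch spellings to one
// canonical spelling. Only the amd64 entry is taken from the example above.
std::string canonicalArch(const std::string &arch) {
  static const std::map<std::string, std::string> aliases = {
      {"amd64", "x86_64"},
  };
  auto it = aliases.find(arch);
  return it == aliases.end() ? arch : it->second;
}

// Rewrites only the first (arch) component of a triple; the remaining
// components are passed through verbatim.
std::string canonicalizeArchInTriple(const std::string &triple) {
  std::string::size_type dash = triple.find('-');
  std::string arch = triple.substr(0, dash);
  std::string rest =
      dash == std::string::npos ? std::string() : triple.substr(dash);
  return canonicalArch(arch) + rest;
}
```

If the runtime directory names were derived from output that had gone through a step like this, both spellings of the Solaris target would map to a single directory, and no symlinks would be needed.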


Hmm, that’s odd. Is there a difference between normalization and canonicalization? The help text for the flag says:

-print-target-triple    Print the normalized target triple

whereas the documentation for Triple::normalize seems to imply that it’s the same:

  /// Turn an arbitrary machine specification into the canonical triple form (or
  /// something sensible that the Triple class understands if nothing better can
  /// reasonably be done).  In particular, it handles the common case in which
  /// otherwise valid components are in the wrong order.
  static std::string normalize(StringRef Str);

So I wonder if the above behavior is a bug, or if there should be a clearer definition for normalized and canonical triples.