Changing the way Clang's driver computes the library search paths on Linux

Hello folks,

I’d like to change the way that Clang’s driver computes the library search paths on Linux. The current system is very ad hoc and does not match what GCC actually does. As Clang’s driver strives for GCC compatibility, I’d like to address this by modeling the GCC behavior as closely as possible. I have attached a patch which does exactly this. To the best of my ability to test a trunk-built GCC by manipulating a fake filesystem’s directory structure, this patch will make Clang search the same set of directories as GCC would. It also simplifies the logic significantly, and makes it less brittle in its assumptions about the underlying filesystem layout.

Unless I hear objections, I plan to commit this: in every way I can test, it makes Clang’s driver strictly more compatible with GCC, and it makes it possible to fix several bugs when using Clang as part of a multilib cross-compiling toolchain. If there are specific distros which need special behavior, we should add that predicated on the distribution.

Rafael, I CC-ed you because I know you’ve worked hard on this before, and may have a large number of distributions installed on VMs. If you can test Clang with this patch (or after I commit it) and report places where the behavior doesn’t match GCC’s, that would be very helpful.

fix-lib-paths.patch (5.22 KB)

I’ll be thrilled if this resolves issues on Debian.

  • Marc

Marc J. Driftmeyer wrote:

I'll be thrilled if this resolves issues on Debian.

For the time being tho, the clang trunk seems to work fine on debian
if you apply the debian clang patch "11-searchMultiArchLibDir.patch"
("apt-cache source clang" and look for it).

-miles

Can you post this patch here? It would save me some time as I’m not a debian user.

Anyways I’m planning to submit this after testing on a bunch of VMs, so I’ll make sure debian is in working order.

It is, at least, no more broken than it was before. However, there still appears to be no way of specifying the location of crt*.o, and since reverting my patch it no longer respects -B, so cross-compilation toolchains that target Linux are back in the not-working category.

David

If you have a system root with a ‘/lib/crt*.o’ (or a /usr/lib/crt*.o) in it, then I believe --sysroot will work. If it doesn’t, I will fix it until it does, because this is the cross-compiling environment I’m working with. I’m not sure it’s there yet, but I do plan to make sysroot work.

Why sysroot rather than -B? Because sysroot seems much more principled in its behavior, and because the previous toolchain I worked with heavily was a GCC one that relied on sysroot. Its only use of -B was to locate auxiliary binaries to run during the build, which I believe Clang supports.

I’m still happy for you to propose patches using -B, but they need to not break existing use cases.

It is, at least, no more broken than it was before. However, there still appears to be no way of specifying the location of crt*.o, and since reverting my patch it no longer respects -B, so cross-compilation toolchains that target Linux are back in the not-working category.

If you have a system root with a '/lib/crt*.o' (or a /usr/lib/crt*.o) in it, then I believe --sysroot will work. If it doesn't, I will fix it until it does, because this is the cross-compiling environment I'm working with. I'm not sure it's there yet, but I do plan to make sysroot work.

Why sysroot rather than -B? Because sysroot seems much more principled in its behavior, and because the previous toolchain I worked with heavily was a GCC one that relied on sysroot. Its only use of -B was to locate auxiliary *binaries* to run during the build, which I believe Clang supports.

The Palm PDK, which I am working with, provides a sysroot environment, but it only includes platform libraries, not compiler libraries. The crt*.o files are installed in /opt/PalmPDK/arm-gcc/lib/gcc/arm-none-linux-gnueabi/4.3.3/, while the sysroot is /opt/PalmPDK/arm-gcc/sysroot.

--sysroot and -B must therefore both be used: the former to provide the location of the sysroot, the latter twice, to provide the location of the different parts of the GNU toolchain (the assembler / linker in /opt/PalmPDK/arm-gcc/bin and crt*.o in /opt/PalmPDK/arm-gcc/lib/gcc/arm-none-linux-gnueabi/4.3.3/).

Since clang does not accept multiple --sysroot arguments, I can't use --sysroot for this.
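
As an illustration of the invocation described above, here is a minimal sketch that assembles the argument list: one --sysroot plus one -B per toolchain directory. The paths are the ones quoted in the message; the helper name `cross_compile_args`, the driver name and the file names are placeholders. It only shows the shape of the command line and assumes a driver that honors -B for both the assembler/linker lookup and the crt*.o lookup, which is exactly what is under discussion in this thread.

```python
import shlex

# Paths quoted in the message above.
PDK = "/opt/PalmPDK/arm-gcc"
SYSROOT = PDK + "/sysroot"
TOOL_DIRS = [
    PDK + "/bin",                                    # assembler / linker
    PDK + "/lib/gcc/arm-none-linux-gnueabi/4.3.3/",  # crt*.o
]


def cross_compile_args(driver, source, output):
    """Build an argv list with a single --sysroot and one -B per tool dir.

    Hypothetical helper for illustration; not part of any toolchain.
    """
    args = [driver, "--sysroot=" + SYSROOT]
    args += ["-B" + d for d in TOOL_DIRS]
    args += ["-o", output, source]
    return args


if __name__ == "__main__":
    # Print the command line in a form that can be pasted into a shell.
    print(shlex.join(cross_compile_args("clang", "hw.c", "hw")))
```

Keeping the two -B directories in a list makes it obvious why a second --sysroot cannot stand in for them: the sysroot is a single prefix, while -B is additive.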

I'm still happy for you to propose patches using -B, but they need to not break existing use cases.

It would help if you told me WHAT didn't work with my last patch. I added = to the start of a couple of paths. If -B is not specified, then this should have no effect. If -B is specified and the searched-for files do not exist in that location, then it should still have no effect...

David

Chandler Carruth wrote:

Can you post this patch here? It would save me some time as I'm not a
debian user.

I've added it as an attachment. From the headers, it looks like it's
already in llvm's bugzilla.

Thanks,

-Miles

11-searchMultiArchLibDir.patch (1.1 KB)

FYI, I’ve got VMs set up for most of the relevant distros at this point, and I think the patch with one slight fix should be good to go. I’m planning on committing it, along with improved sysroot support shortly. Once that’s in, I’ll be testing it out on lots of different distros and fixing any fallout.

I’ll try to get to Debian and get the substance of the posted patch rolled in as well.

Thanks,
-Chandler

Chandler Carruth <chandlerc@google.com>
writes:

FYI, I've got VMs set up for most of the relevant distros at this point, and
I think the patch with one slight fix should be good to go. I'm planning on
committing it, along with improved sysroot support shortly. Once that's in,
I'll be testing it out on lots of different distros and fixing any fallout.

I'll try to get to Debian and get the substance of the posted patch rolled
in as well.

Be sure to use a very recent development release ("unstable") of
Debian, because the "multiarch" changes (which tend to break things
like compilers...) are not in the latest stable release.

Thanks,

-Miles

Should not this change include a fs check for /usr/lib/x86_64-linux-gnu/crt1.o as one of the else if options, not to mention the rest of the architectures Debian supports and builds clang against?

Debian clearly has moved to the /usr/lib/architecture-linux-gnu/ approach for i386 and amd64/x86_64.

ARM seems uniquely distinct with their gnu reference:

Armel for Debian is /usr/lib/arm-linux-gnueabi/crt1.o
Armhf for Debian is /usr/lib/arm-linux-gnueabihf/crt1.o

The rest from Alpha to Sparc are as follows

/usr/lib/architecture-linux-gnu/crt1.o… with the exception of FreeBSD on Debian:

/usr/lib/architecture-kfreebsd-gnu/crt1.o

Source: http://packages.debian.org/search?suite=sid&arch=any&searchon=contents&keywords=crt1.o
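
The naming pattern listed above can be written down as a small sketch. The function name `multiarch_crt1_path` and its parameters are made up for illustration; only the path shapes come from the list above, and this is not how Clang's driver or Debian's tooling computes them.

```python
def multiarch_crt1_path(arch, kernel="linux", abi="gnu"):
    """Return the crt1.o location following the /usr/lib/<arch>-<kernel>-<abi>/ pattern.

    Hypothetical helper: it just formats the pattern described in the
    message above and does not check that the file exists.
    """
    return "/usr/lib/{}-{}-{}/crt1.o".format(arch, kernel, abi)


if __name__ == "__main__":
    print(multiarch_crt1_path("x86_64"))                     # amd64
    print(multiarch_crt1_path("arm", abi="gnueabi"))         # armel
    print(multiarch_crt1_path("arm", abi="gnueabihf"))       # armhf
    print(multiarch_crt1_path("x86_64", kernel="kfreebsd"))  # kFreeBSD variant
```

A driver check along the lines suggested above would test each such path (or its directory) in turn as one of the `else if` alternatives.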

  • Marc

Currently, I’m trying to get Clang to match the GCC mainline behavior as closely as I can, without breaking existing distros.

I’m hoping to have time to extend this logic to deal with most distros, if not all.

Currently, Clang doesn’t handle Debian’s multiarch setup correctly at all. I’m going to try to get to this, but it’s at the bottom of my immediate list because we don’t have any reasonable existing support.

For reference, I specifically checked mainline GCC to see whether it cared about crt1.o or not, and it didn’t. It only cared whether the directory existed. I’ve faithfully reproduced that here for better or worse.
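
To make the "directory exists" rule concrete, here is a minimal sketch that filters a list of candidate directories by existence under a system root, using a throwaway fake directory tree. The candidate list and the names `existing_lib_dirs` and `demo` are illustrative assumptions; this is neither GCC's nor Clang's actual list or code.

```python
import os
import tempfile


def existing_lib_dirs(sysroot, candidates):
    """Keep the candidates whose directory exists under sysroot, in order.

    Only the directory is checked; whether it contains crt1.o is
    deliberately ignored, mirroring the behavior described above.
    """
    found = []
    for cand in candidates:
        path = os.path.join(sysroot, cand.lstrip("/"))
        if os.path.isdir(path):
            found.append(cand)
    return found


def demo():
    """Build a fake root with empty /lib and /usr/lib and query it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr", "lib"))
        os.makedirs(os.path.join(root, "lib"))
        candidates = ["/lib64", "/lib", "/usr/lib64", "/usr/lib"]
        return existing_lib_dirs(root, candidates)


if __name__ == "__main__":
    print(demo())
```

Both directories in the fake tree are empty, yet both are kept, which is the point: an existence test on crt1.o would have rejected them.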

Now, keep in mind, I’m currently only poking at the library search paths. I’m also going to look at the code that actually locates crt1.o etc.; that’s still to come.

Most of this is committed as of r141000. Both detection of GCC installations and the basic system library paths respect system roots, and the pattern matches that of GCC as closely as I could, and appears to be compatible with the layout of most of the distributions I tested it on.

There is more work to be done to make the triple-detection and installation detection more sysroot aware, and more intelligent. There is also work to be done to remove several of the distribution-specific kludges that are no longer needed, and to properly support the new Debian multiarch structure. I’m also seeing oddities in Ubuntu multilib paths that I’m hoping to correct, but these were pre-existing.

If folks see issues, please let me know and I’ll try to respond promptly.

I’m still happy for you to propose patches using -B, but they need to not break existing use cases.

It would help if you told me WHAT didn’t work with my last patch. I added = to the start of a couple of paths. If -B is not specified, then this should have no effect. If -B is specified and the searched-for files do not exist in that location, then it should still have no effect…

When -B was not specified, there was an ‘=’ prefixing those paths as handed to the linker. You can see this by running clang with ‘-###’. That’s why I checked in the test case I did afterward. Also, when -B is not specified, GCC does the exact same thing, passing the ‘=’ prefixed paths down to the linker.

The manual pages for ‘ld’ on Linux mention translating a ‘=’ at the
beginning of the path into a configure-time sysroot prefix (this is,
I believe, distinct from the --sysroot flag which ‘ld’ also can
support). I tested this with a normal binutils ‘ld’, a binutils ‘ld’
with the sysroot flag enabled, and gold with the sysroot flag enabled,
and all of them try to open the path ‘=/lib/…/lib32’. No translation
occurs.

I think at the very least inserting an ‘=’ needs to be conditioned on
some indication that it is supported and desired. I’m also curious to
see what toolchain and what environment cause it to actually make
a difference.

As you can see, just inserting a ‘=’ prefix isn’t viable. There needs to be some conditioning or some translation.
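
A small sketch of the two options mentioned here, translation and conditioning. `translate_search_dir` models the ‘=’ expansion as the ‘ld’ manual describes it, and `emit_search_dir` models a driver that only adds the prefix when told the linker will expand it. Both names and the boolean parameter are assumptions for illustration; neither is code from ld, gold, or Clang.

```python
def translate_search_dir(search_dir, sysroot_prefix=None):
    """Expand a leading '=' into sysroot_prefix.

    With no prefix available the path is returned untouched, which
    models the literal '=/lib/...' lookups reported above.
    """
    if not search_dir.startswith("="):
        return search_dir
    if sysroot_prefix is None:
        return search_dir
    return sysroot_prefix.rstrip("/") + "/" + search_dir[1:].lstrip("/")


def emit_search_dir(path, linker_translates_equals):
    """Driver side: add the '=' prefix only when the linker is known to expand it."""
    return "=" + path if linker_translates_equals else path


if __name__ == "__main__":
    print(translate_search_dir("=/usr/lib32", "/opt/sysroot"))
    print(translate_search_dir("=/usr/lib32"))
    print(emit_search_dir("/usr/lib32", False))
```

How a driver would learn whether the linker expands ‘=’ is left open here, as it is in the thread.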

Correction to my previous message: GCC does not do the same thing, in that it does not pass ‘=’ prefixed paths down to the linker.

Need caffeine, sorry…

Rafael, I CC-ed you because I know you've worked hard on this before,
and may have a large number of distributions installed on VMs. If you
can test Clang with this patch (or after I commit it) and report places
where the behavior doesn't match GCC's, that would be very helpful.

Thanks a lot for working on this. This code suffered a lot from being patched up as I got more and more distros going.

I don't have them all set up any more, but the "interesting" ones are

* OpenSUSE, which has the linker installed with the compiler
* Fedora/RH, which use lib64 in both the 32 and 64 bit distros
* Debian, which uses lib & lib64 on 32 bit systems and lib32 & lib on 64 bit systems.
* newer Debian systems that use lib/<triple>
* A crosstool-like setup with gcc, libc and binutils installed in one directory.

The way I tested it was by running a small set of linking tasks with -fPIC, -m32, -static etc and checking that the linker command line was the "same" short of reordering of options.
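
The comparison described above, "the same short of reordering of options", could be approximated like this. The function name `same_modulo_order` and the sample command lines are invented for illustration; the original checks were presumably done by eye or with other tooling.

```python
import shlex
from collections import Counter


def same_modulo_order(cmd_a, cmd_b):
    """True if both command lines contain the same tokens, ignoring order.

    Tokens are compared as a multiset, so a duplicated or missing
    option is still reported as a difference.
    """
    return Counter(shlex.split(cmd_a)) == Counter(shlex.split(cmd_b))


if __name__ == "__main__":
    gcc_ld = "ld -static -L/usr/lib32 -m elf_i386 crt1.o hw.o"
    clang_ld = "ld -m elf_i386 -static crt1.o -L/usr/lib32 hw.o"
    print(same_modulo_order(gcc_ld, clang_ld))
```

This is deliberately loose: it treats every token independently, so it does not keep an option and its separate argument together, and it ignores the relative order of -L directories even though that order affects which library the linker finds first. A stricter test would compare the -L sequence separately.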

I really like the idea of creating a fake dir structure so that we can add tests for this!

Cheers,
Rafael

Thanks for all your work. It’s very much appreciated.

  • Marc

Chandler Carruth <chandlerc@google.com>
writes:

Most of this is committed as of r141000. Both detection of GCC installations
and the basic system library paths respect system roots, and the pattern
matches that of GCC as closely as I could, and appears to be compatible with
the layout of most of the distributions I tested it on.

There is more work to be done to make the triple-detection and installation
detection more sysroot aware, and more intelligent. There is also work to be
done to remove several of the distribution-specific kludges that are no
longer needed, and to properly support the new Debian multiarch structure.
I'm also seeing oddities in Ubuntu multilib paths that I'm hoping to
correct, but these were pre-existing.

If folks see issues, please let me know and I'll try to respond promptly.

It certainly doesn't work on debian yet (sorry if you already knew
that; it's not so clear from what you wrote above what "properly
support" means).

   $ clang-trunk -o hw ~/src/hw.c
   /usr/bin/ld: error: cannot open crt1.o: No such file or directory
   /usr/bin/ld: error: cannot open crti.o: No such file or directory
   /usr/bin/ld: error: cannot open crtn.o: No such file or directory
   clang: error: linker command failed with exit code 1 (use -v to see invocation)

   $ clang-trunk --version
   clang version 3.0 (http://llvm.org/git/clang.git f4e541ccecfc06e9818fba83e7ac3a072b1d849e)
   Target: x86_64-unknown-linux-gnu
   Thread model: posix

The above "crt*.o" files seem to exist in two places on this system:

   /usr/lib/x86_64-linux-gnu/
   /usr/lib32/

[presumably for use with 64-bit or 32-bit compiles]

-Miles

Sorry I wasn’t clearer. I’m saving the new Debian multiarch support for ‘future work’ (i.e., later this week). It’s substantially different, and deviates from the upstream GCC layout, and I’m focused on getting upstream-like distros and distros that were previously working into good shape first. I’ll extend it for Debian once it’s well factored.