Sanitizers libs in Compiler-RT

That is, I still don't see what the problem is - it's relatively easy to
enable building just the compiler-rt library on ARM and not enable building
sanitizers on ARM.

For some reason, when I added the compiler-rt directory (even before my
CMake changes), the Clang tests *required* the Asan libraries in
lib/clang/3.5/linux, which is why I added it in the first place. I think
this is wrong and should be fixed (though I have no idea how) in the Clang
CMake files.

Huh? What tests are you talking about? I thought that tests for the Clang
driver verify that the command "clang++ -fsanitize=address ...." produces a
correct linker invocation (with path to
/lib/clang/3.5/linux/libclang_rt.asan-arm.a), but don't actually require
the libraries to be built there. If it's not the case, we should fix that.

In CMake build system you can just run "make compiler-rt".
Not sure how to do this in configure+make w/o "make clean"...

"make compiler-rt" doesn't re-make it on changes. Nor does "make check-asan"
or "make check-all".

Right now, in my CMake build tree I've run "make compiler-rt", then changed
compiler-rt/lib/absvsi2.c, then ran "make compiler-rt" again, and saw
that lib/clang/3.5/lib/linux/libclang_rt.x86_64.a was indeed rebuilt.

I was changing the lit.config files, probably they're not in the dependency
graph?

--renato

I got linker errors, since /lib/clang/3.5/linux/libclang_rt.asan-armv7l.a
(or arm for that matter) wasn't there, since "arm" is not a recognized
architecture.

My point is that Clang sanitizer tests should *only* run IFF:

1. Compiler-rt is present AND
2. The architecture is recognized AND
3. The Asan/Lsan/UBsan etc libraries were enabled via some build option
(individually checked).

I may be wrong, but right now, it seems that they all run once compiler-rt
is available.
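
The three conditions above can be sketched as a single predicate. This is only
an illustration in Python (the language lit configs are written in); the
BuildConfig class, its field names, and the default architecture list are
hypothetical and do not correspond to any existing Clang or compiler-rt option.

```python
from dataclasses import dataclass


@dataclass
class BuildConfig:
    # All fields are hypothetical stand-ins for real build settings.
    compiler_rt_present: bool
    target_arch: str
    supported_archs: frozenset = frozenset({"x86_64", "i386"})
    enabled_sanitizers: frozenset = frozenset()


def should_run_sanitizer_tests(cfg, sanitizer):
    """Return True only if all three conditions from the list hold."""
    return (
        cfg.compiler_rt_present                    # 1. compiler-rt is present
        and cfg.target_arch in cfg.supported_archs  # 2. architecture is recognized
        and sanitizer in cfg.enabled_sanitizers     # 3. this sanitizer was enabled
    )
```

With such a check, a tree configured for "armv7l" would skip the Asan tests
instead of failing at link time, assuming "armv7l" is absent from the
supported list.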

cheers,
--renato

Correction, not only the lit.config but also the unit-tests within asan
(source files). They didn't trigger their own build either.

cheers,
--renato

Huh? What tests are you talking about? I thought that tests for the Clang
driver verify that the command "clang++ -fsanitize=address ...." produces a
correct linker invocation (with path to
/lib/clang/3.5/linux/libclang_rt.asan-arm.a), but don't actually require
the libraries to be built there. If it's not the case, we should fix that.

I got linker errors, since /lib/clang/3.5/linux/libclang_rt.asan-armv7l.a
(or arm for that matter) wasn't there, since "arm" is not a recognized
architecture.

What tests under tools/clang/test were failing and with what errors? I
thought that we usually test Clang with the -### option, that is, only look
at the args we're going to pass to the linker, not actually invoke it.
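
To illustrate what "only look at the args" means, here is a rough Python
sketch that inspects -### style output without running the linker. The SAMPLE
text is a simplified, made-up approximation of what the driver prints (quoted
arguments, one command per line), and the helper names are hypothetical; the
real Clang tests use lit and FileCheck rather than code like this.

```python
import shlex

# Hypothetical, heavily shortened driver output; real output has many more args.
SAMPLE = '''clang version 3.5
 "/usr/bin/clang" "-cc1" "-fsanitize=address" "t.cpp"
 "/usr/bin/ld" "-o" "a.out" "/lib/clang/3.5/linux/libclang_rt.asan-arm.a"
'''


def linker_args(driver_output):
    """Return the tokens of the last quoted command line in the output."""
    commands = [line.strip() for line in driver_output.splitlines()
                if line.strip().startswith('"')]
    return shlex.split(commands[-1]) if commands else []


def links_runtime(driver_output, lib_name):
    """Check that some linker argument is a path ending in lib_name."""
    return any(arg.endswith("/" + lib_name)
               for arg in linker_args(driver_output))
```

The check passes whether or not the library file exists on disk, which is the
property being discussed: the test verifies the path the driver would pass,
not that the library was built.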

My point is that Clang sanitizer tests should *only* run IFF:

1. Compiler-rt is present AND
2. The architecture is recognized AND
3. The Asan/Lsan/UBsan etc libraries were enabled via some build option
(individually checked).

I may be wrong, but right now, it seems that they all run once compiler-rt
is available.

I think that Clang tests just shouldn't depend on the presence of
compiler-rt and should always run.

Right now, in my CMake build tree I've run "make compiler-rt", then
changed compiler-rt/lib/absvsi2.c, then ran "make compiler-rt" again, and
saw that lib/clang/3.5/lib/linux/libclang_rt.x86_64.a was indeed rebuilt.

I was changing the lit.config files, probably they're not in the
dependency graph?

Correction, not only the lit.config but also the unit-tests within asan
(source files). They didn't trigger their own build either.

How can I reproduce it?

Do you have an ARM box? ;) I think any non-supported platform (mips?) would
do, but now I'm not sure any more. It takes so long to build things that I
get lost in the process.

I'm posting the Clang patch soon, which should make my compiler-rt changes
harmless when they land. If they work, I think we should just get something
in and trim the problems as they appear, one by one, rather than chase
ghosts in this thread (which is now just us).

Let's close this here, and follow up on the two phab reviews.

cheers,
--renato

I thought some of it still needs libunwind (whichever of the various versions you like)? It would be good to get a nice, clean implementation of that functionality (whether based on one with an MIT license if there is such or not) if we don't already have it.

The unwinder in libcxxabi implements both the _Unwind_* functions needed by libcxxabi and the unw_* functions that are the “libunwind” API. There is nothing more needed.

- There is the core runtime library. Historically this was called 'compiler-rt' informally, but perhaps better called 'libclang_rt', which provides the core necessary runtime library facilities to compile C or C++ applications. It's analogous to libgcc but without some of the unwinding code (as I understand it, there may be details I'm wrong about here or glossing over, but it's not relevant to the organization of things).

For some reason, the (generic, language-agnostic) unwind code is in libcxxabi. There was some discussion about moving it into the compiler-rt repository, where it would make sense. No one objected, but I'd rather not move it without a 'yes' from someone who is actually working on the code currently (I'd like to start factoring out the #ifdefs into a cleaner platform / architecture layer).

The logic was that libcxxabi is the biggest client of the unwinder, and well, compiler-rt was already complicated enough ;) That said, if we had a nice clean, scalable model for organizing all the runtime support libraries, I’d be happy to migrate the unwinder there.

Also, to help explain my bias, at Apple, the unwinder (and libc++abi) are dylibs that ship as part of the OS. They have nothing to do with the compiler.

A couple of other random thoughts about compiler-rt:

* One of the makefile dimensions of complexity is the ability to build optimized, profile, and debug copies of everything. This was once needed at Apple, but no longer is necessary.

* One of the interesting things about compiler-rt is the static library to dynamic library migration (e.g. libgcc.a vs libgcc_s.so, or on Darwin libclang_*.a vs libSystem.dylib). If the shared library ships independently from the compiler, then the compiler may need a .a file that can ship with it that contains any support functions not available in a shared library on the target. Currently, it is a very manual process to figure out which functions are needed where.

* It would be nice if the clang build system could output a list of all possible support functions it might need for the compiler being built. That list could drive what parts of compiler-rt need to be built.

So, to me an ideal build system for compiler-rt would not just compile the snippets of code, it would figure out which snippets to build based on what the compiler needs and what the OS needs/provides.
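
As a rough sketch of that idea: if the compiler could list the support
functions it may emit calls to, and the OS could list what its shared library
exports, the contents of the static archive would be a set difference. The
function below is hypothetical Python, and the split between "needed" and
"provided" in the usage note is invented for illustration.

```python
def functions_for_static_archive(needed_by_compiler, provided_by_os_dylib):
    """Support functions the .a must carry: needed, but not in the shared lib."""
    return sorted(set(needed_by_compiler) - set(provided_by_os_dylib))
```

For example, if the compiler might need __absvsi2, __divdi3 and __udivmoddi4
and the target's shared library only exports __divdi3, the archive would need
the other two.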

-Nick

Hi Nick,

These features would be great, but I think it's too complex to migrate from
what we have today to that scenario and people won't buy-in that easily.

I think the fact that all sanitizers, profilers and possibly unwinders have
their libraries in there is a good reason for us to have separate builds of
each library and treat "Compiler-RT" as a repository for run-time
libraries, rather than a library in itself.

I'm in favour of renaming what we call today "compiler-rt" to "libclang" and
moving libcxxabi into it as "libunwind" or anything relevant, but to softly
migrate the interactions between RT and Clang over time.

A few steps like:

1. Move libraries in, rename, compile everything every time
2. Separate libraries' build systems, disable via flags / arch support (on
both clang and rt)
3. Get Clang to list *libraries* needed, and make RT's make system only
build those
4. Get Clang to list *functions* needed, and update RT's make system
accordingly

I think having step 3 would be amazing, but 2 is already good for all
practical purposes. Step 4 is way past *my* needs, but I don't think it's a
bad thing to have.
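
Steps 2 and 3 could look roughly like the following Python sketch. The
RUNTIME_LIBS table, its architecture sets and enable flags are made up for
illustration and do not reflect what compiler-rt actually supports; the
optional requested_by_clang argument stands in for the list Clang would
produce in step 3.

```python
# Hypothetical table: which runtime libraries exist, where they are
# supported, and whether a build flag enabled them (step 2).
RUNTIME_LIBS = {
    "builtins": {"archs": {"x86_64", "arm"}, "enabled": True},
    "asan": {"archs": {"x86_64"}, "enabled": True},
    "ubsan": {"archs": {"x86_64", "arm"}, "enabled": False},
}


def libs_to_build(arch, requested_by_clang=None):
    """Select libraries by flag and arch, then narrow to Clang's list if given."""
    selected = [name for name, info in RUNTIME_LIBS.items()
                if info["enabled"] and arch in info["archs"]]
    if requested_by_clang is not None:  # step 3: only what Clang asks for
        selected = [name for name in selected if name in requested_by_clang]
    return sorted(selected)
```

Under these made-up settings an "arm" build would get only the builtins,
which matches the point that step 2 alone is already enough to keep
unsupported sanitizers out of the build.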

cheers,
--renato

Hi Nick,

A few of my colleagues and I have been giving these general issues some serious thought - with a view to floating an abstract for a BOF proposal for Edinburgh - seems like a topic with a fair few interests to be resolved.

I think it is probably clear that we can't achieve everything that everyone wants in a single step - but also that having an agreed "end goal" and a possible route to get there is essential if we want to avoid chaos.

I thought some of it still needs libunwind (whichever of the various versions you like)? It would be good to get a nice, clean implementation of that functionality (whether based on one with an MIT license if there is such or not) if we don't already have it.

The unwinder in libcxxabi implements both the _Unwind_* functions needed by libcxxabi and the unw_* functions that are the “libunwind” API. There is nothing more needed.

So that component *could* comprise a stand-alone unwind dylib/so/a?

One organizational thought to add:

For us, the sanitizers contain behaviors/syscalls that are not valid on consumer hardware. They perform “debug” functionality and thus can only operate on special developer kits. Additionally, we need to put this type of code into a dynamic library instead of a static library to avoid versioning problems.

I agree that this is all part of compiler_rt as a project/repo, but the sanitizers should remain segregated into their own runtime library.

Alex

I think many, many folks need the sanitizers to be segregated into their
own runtime libraries.

In fact, from a functional perspective, many of the sanitizers' runtimes
are really mutually exclusive and should never even be in the same runtime
library.

- There is the core runtime library. Historically this was called 'compiler-rt' informally, but perhaps better called 'libclang_rt', which provides the core necessary runtime library facilities to compile C or C++ applications. It's analogous to libgcc but without some of the unwinding code (as I understand it, there may be details I'm wrong about here or glossing over, but it's not relevant to the organization of things).

For some reason, the (generic, language-agnostic) unwind code is in libcxxabi. There was some discussion about moving it into the compiler-rt repository, where it would make sense. No one objected, but I'd rather not move it without a 'yes' from someone who is actually working on the code currently (I'd like to start factoring out the #ifdefs into a cleaner platform / architecture layer).

The logic was that libcxxabi is the biggest client of the unwinder, and well, compiler-rt was already complicated enough ;) That said, if we had a nice clean, scalable model for organizing all the runtime support libraries, I’d be happy to migrate the unwinder there.

On FreeBSD, the main consumers of the unwind library are:

- libgcc_s / compiler-rt (our libgcc_s.so actually uses compiler-rt code here) for the C personality function.
- libcxxrt for the C++ personality function
- libobjc for the Objective-C and Objective-C++ personality functions (libobjcxx on older versions where the C++ ABI library is not dynamically linked).

Thus we'd like to import and contribute to a cleaner unwind library, but for us the code will end up in a libgcc_s replacement, and so having it in the same place as the rest of the libgcc_s replacement code (compiler-rt) seems more obvious.

Having a library as a child of one of its consumers is a bit odd.

Also, to help explain my bias, at Apple, the unwinder (and libc++abi) are dylibs that ship as part of the OS. They have nothing to do with the compiler.

We also ship compiler-rt as part of the OS, not specifically as part of the compiler (the same is true of libgcc* and crt*), so I don't understand this argument.

David

* One of the interesting things about compiler-rt is the static library
to dynamic library migration (e.g. libgcc.a vs libgcc_s.so, or on Darwin
libclang_*.a vs libSystem.dylib). If the shared library ships
independently from the compiler, then the compiler may need a .a file that
can ship with it that contains any support functions not available in a
shared library on the target. Currently, it is a very manual process to
figure out which functions are needed where.

* It would be nice if the clang build system could output a list of all
possible support functions it might need for the compiler being built. That
list could drive what parts of compiler-rt need to be built.

So, to me an ideal build system for compiler-rt would not just compile
the snippets of code, it would figure out which snippets to build based on
what the compiler needs and what the OS needs/provides.

Hi Nick,

These features would be great, but I think it's too complex to migrate
from what we have today to that scenario and people won't buy-in that
easily.

I think the fact that all sanitizers, profilers and possibly unwinders
have their libraries in there is a good reason for us to have separate
builds of each library and treat "Compiler-RT" as a repository for run-time
libraries, rather than a library in itself.

I'm in favour of renaming what we call today "compiler-rt" to "libclang" and
moving libcxxabi into it as "libunwind" or anything relevant, but to softly
migrate the interactions between RT and Clang over time.

FYI the name "libclang" is already in use (see "Choosing the Right
Interface for Your Application" in the Clang documentation), so a different
name would need to be chosen.

-- Sean Silva

Yes, my laziness is immense. I think the proposed name is libclang_rt etc.

--renato

Hi,

For us, the sanitizers contain behaviors/syscalls that are not valid on
consumer hardware. They perform "debug" functionality and thus can only
operate on special developer kits. Additionally, we need to put this type of
code into a dynamic library instead of a static library to avoid versioning
problems.

Could you elaborate on this? I can't imagine how making sanitizer
runtime a shared library would help with the versioning problems,
other than create them. As a static library it is always in sync with
the compiler (unless someone wants to distribute them with an OS, of
course).

There are two issues with compatibility in the sanitiser libraries:

- Compiler-Runtime interfaces
- Runtime-OS interfaces

The first is made easier with a static library, the latter is made much harder because modifying the OS (including libc and system calls) can break the static library.

Both are made easy with a shared library. The first can be addressed using symbol versioning, the second by updating the shared library whenever the underlying interfaces change.

David