Stripped-down Clang/LLVM

Hi all,
I'm modifying Clang and LLVM to build a compiler for a C-like
language. I would like to remove as much unused code from the codebase
as I can to bring down size and debugging time. I can do without
all of the Objective-C and C++ specific elements of Clang and much of
the target-specific code in both Clang and LLVM. Before I spend an
entire day hacking away at the Makefiles, has anyone done this
already? Are there modules that I shouldn't waste my time trying to
extricate from the source tree?

Thanks in advance,
Dani

Unfortunately, all of them. The C, C++ and Objective-C frontends are
implemented in Clang as a single frontend that is controlled by a
"language options" configuration struct.

Dmitri

Yes, we're not really designed to be pared down like this — not by targets,
not by language modes, not by dialect features, etc.

That said, if you're really focused on code size, especially "live" code size,
there are some simple source hacks you could make that would make a lot
of code trivially dead. That might even be enough to let a linker strip a lot
of the language-specific functions away.

For example, you could hack LangOptions to define things like this:
  static const bool CPlusPlus = false;
  static const bool ObjC1 = false;
etc.
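
To illustrate why this makes code trivially dead, here is a minimal sketch. It does not use Clang's real headers: ToyLangOptions and classifyToken are hypothetical stand-ins, and only the flag names CPlusPlus and ObjC1 come from the suggestion above. The point is that call sites written as Opts.CPlusPlus keep compiling unchanged, but the guarded branches become compile-time false, so an optimizer can drop them.

```cpp
#include <string>

// Hypothetical stand-in for a language-options struct. The flags are
// compile-time constants instead of per-invocation settings.
struct ToyLangOptions {
  static const bool CPlusPlus = false;
  static const bool ObjC1 = false;
};

// The constants can be checked at compile time, which is what lets the
// optimizer fold the branches below.
static_assert(!ToyLangOptions::CPlusPlus, "C++ support is compiled out");
static_assert(!ToyLangOptions::ObjC1, "Objective-C support is compiled out");

// Hypothetical shared-frontend routine. The member-access syntax is the
// same as it would be for ordinary (runtime) flags.
std::string classifyToken(const ToyLangOptions &Opts, const std::string &Tok) {
  if (Opts.CPlusPlus && Tok == "class")
    return "c++-keyword";   // unreachable: CPlusPlus is always false
  if (Opts.ObjC1 && Tok == "@interface")
    return "objc-keyword";  // unreachable: ObjC1 is always false
  return "identifier";
}
```

With both flags fixed to false, every token falls through to the last return. Whether a linker can then strip the functions reachable only from the dead branches depends on the toolchain and build flags.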

And an easy next step would be to define some explicit specializations
for getAs, e.g.
  template <> const ObjCPointerType *Type::getAs<ObjCPointerType>() const {
    return 0;
  }
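
As a rough, self-contained sketch of that specialization trick: the miniature hierarchy below (ToyType, ToyBuiltinType, ToyObjCPointerType, and the classof-based check) is hypothetical and only mimics the general shape of a checked downcast, not Clang's actual Type class. It shows a generic getAs plus one explicit specialization that unconditionally returns null, so callers asking for the unwanted type always take their "not this type" path.

```cpp
// Hypothetical miniature type hierarchy with a kind tag.
struct ToyType {
  enum Kind { Builtin, ObjCPointer };
  explicit ToyType(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

  // Generic checked downcast: null when the dynamic kind does not match.
  template <typename T> const T *getAs() const {
    return T::classof(this) ? static_cast<const T *>(this) : nullptr;
  }

private:
  Kind TheKind;
};

struct ToyBuiltinType : ToyType {
  ToyBuiltinType() : ToyType(Builtin) {}
  static bool classof(const ToyType *T) { return T->getKind() == Builtin; }
};

struct ToyObjCPointerType : ToyType {
  ToyObjCPointerType() : ToyType(ObjCPointer) {}
  static bool classof(const ToyType *T) { return T->getKind() == ObjCPointer; }
};

// Explicit specialization: asking for the unwanted type always fails,
// regardless of the object's real kind. It must be declared before any
// use that would otherwise instantiate the generic version.
template <>
inline const ToyObjCPointerType *ToyType::getAs<ToyObjCPointerType>() const {
  return nullptr;
}
```

Code guarded by a test such as "if (const ToyObjCPointerType *P = T->getAs<ToyObjCPointerType>())" then becomes dead once the specialization is inlined, which is the effect described above.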

John.

In case you missed it, you can at least disable some Clang components at configure time using the following options:

--disable-clang-arcmt --disable-clang-rewriter --disable-clang-static-analyzer

-- Jean-Daniel

Thanks Dmitri and John!
You confirm what I suspected after a few weeks of reading through
the codebase. My primary concern at the moment is the time it takes to
debug, so I will begin by removing targets in the Makefile. I think my
next step is going to be removing target-specific code that I don't
need (e.g. Hexagon, PPC, etc.). If I am able to do this, I will post my
process/results to this thread.

Thanks again,
Dani

Removing targets from the LLVM build, at least, is easy: just pass -DLLVM_TARGETS_TO_BUILD="X86" (or whatever target(s) you need) to CMake. (I don't know the equivalent configure flag off the top of my head, but there is one.) I'm not sure whether the LLVM targets-to-build setting propagates automatically to remove code from Clang, though.

-Joe

It does not. This is intentional: it makes it easier to test frontend behavior, and we've been assuming that the performance impact of the extra code size is marginal (because it's all dead code and therefore out of i-cache in practice). I'm not going to vouch for that decision, though.

John.