alternate clang driver

We have developed an alternate clang driver and released it as open source on Google Code.
http://code.google.com/p/alternate-clang-driver/

Most people are using cross compilers for MIPS. We needed something different than what the clang driver provides us at this time.

This driver behaves enough like the gcc cross drivers that we can run makefiles and other things meant for gcc right out of the box. For example, we have run the gcc DejaGnu tests with it with no problems.

It's still in development but it works very well for us right now.

I intended to write more documentation for it before announcing it, but our schedules are overloaded and I can't assign anyone to do that for the next 2 months, so I decided to just announce it now.
We can put resources into bug fixing and extensions, but we already know how it works, so it's not essential for us to do more docs at this time. :)

It's a simple program written in Python, and I think it's pretty easy to understand just from looking at the code.

We welcome other people to help make it work better for more platforms than just MIPS.

Can you tell us a little bit more about what makes this different? I use clang for cross-compiling for ARM, and invoking clang as arm-none-linux-gnueabi-clang, or as clang -ccc-host-triple arm-none-linux-gnueabi enables it to find all of the other bits of the toolchain fine - I'm using it as an almost drop-in replacement for gcc in the WebOS PDK, which shipped with gcc. The only difficulty it had was finding the various tools, since the Linux toolchain stuff ignores -B, but this was solved with a few symlinks into the sysroot...

David
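To illustrate the invocation-name convention mentioned above (calling the compiler as arm-none-linux-gnueabi-clang), here is a minimal Python sketch of how a wrapper script could split such a name into a target triple and a frontend. It is not how Clang's own driver implements this; the function name and the list of frontend names are assumptions made for the example.

```python
import os

# Hypothetical list of frontend names the wrapper might be installed under.
# "clang++" comes first so that it is matched before the shorter "clang".
KNOWN_FRONTENDS = ("clang++", "clang")


def triple_from_program_name(argv0):
    """Split a name like 'arm-none-linux-gnueabi-clang' into (triple, frontend).

    Returns (None, frontend) for a bare frontend name such as 'clang',
    and (None, None) if the name does not end in a known frontend.
    """
    name = os.path.basename(argv0)
    for frontend in KNOWN_FRONTENDS:
        if name == frontend:
            return None, frontend
        suffix = "-" + frontend
        if name.endswith(suffix):
            # Everything before the '-<frontend>' suffix is taken as the triple.
            return name[: -len(suffix)], frontend
    return None, None
```

A wrapper could call this on sys.argv[0] and, when a triple is found, pass it on with -ccc-host-triple as in the message above.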

Hi,

Like David, I use the current Clang driver, and while it's certainly not fantastic, I wouldn't have thought that a brand new standalone tool in Python would be the best solution?

Cheers,

James

Hi David,

I'm not really trying to sell people on another tool.

It's there if someone else wants to use it.

I found the clang driver to be overly complex, and it was slowing down my development. To me, setting up a new port should have taken about 10 minutes; instead it involved a lot of study and a lot of hardcoding of specific directories to search, and many other such things in the compiler, which I find fundamentally wrong.

With my driver, I could do new ones in no time and make simple install scripts so it can be configured for a specific user's machine.

I have to fix the builtin clang driver for MIPS too and am working on that now.
But I find that driver to be fundamentally problematic. It's just my opinion.

Differences of opinion are what make horse races.

Reed

It often looks easier to hack up your own code for a specific case rather than generalizing existing code to incorporate the case you care about. But in my experience, it's rarely the right course of action. Inevitably, you end up solving the same problems that the existing code already handled, and spend more time re-inventing wheels than you would have spent generalizing the wheels that work.

It would be awesome if it were easier to incorporate new ports into the Clang driver. We won't get there if everyone hacks up their own Python script instead.

  - Doug

Well, it's my opinion that a lot of the driver functionality does not belong in the clang front end.
It should be taken out, and a tool like the one I wrote should be used for that functionality. Many compilers do just that, especially ones that work with many targets/hosts, like the MetaWare compiler for example.

I did not launch on my effort without first thinking about this a lot.

The root of the problem is that clang does not use configure to set its host/target and other things in stone.

I think that is a good idea, but then how do you configure things?

Right now it's done by hardcoding the world in C++ code in the front end, including things like all the specific version numbers of gcc.

Gag me with a spoon!

It can't possibly work because many things can be installed in places that the front end could never know about.

So it works for Apple, where they can control where Xcode and other things get installed, and it will work for vanilla machines of other flavors, where they can guess where things are.

Reed

The root of the problem is that clang does not use configure to set its
host/target and other things in stone.

That’s the core difference between Clang and, say, GCC: one Clang binary can compile for all its targets (once it learns how to find headers and libraries). It wouldn’t make sense to hard-code one target for a full-blown cross-compiler. I do agree that some kind of spec-file-based approach with system-specific directories is the way to go, so every client can easily point Clang to its system includes.

I think that is a good idea, but then how do you configure things?

Right now it’s done by hardcoding the world in C++ code in the front
end, including things like all the specific version numbers of gcc.

Gag me with a spoon!

It can’t possibly work because many things can be installed in places
that the front end could never know about.

It’s not hard to teach it your setup; it takes about five minutes to cook up a system-specific patch, which you could use locally for the time being… Until you or someone else bites the bullet and writes something better.

Ruben

I'm not sure writing a new driver from scratch is better than trying to externalize the configuration in the current driver.

Is there anybody currently working on the universal driver (http://clang.llvm.org/UniversalDriver.html)?

-- Jean-Daniel

No one's currently working on it (or, wasn't last week when I asked). It's on my to-do list, but keeps getting pushed lower down by stuff I actually get paid for...

David

-- Sent from my brain

You will see that no matter how you do this, you will ultimately end up with an isomorphic solution to what I did.

You could try and put all the configuration variables in an XML file.

That will be like the data structures in my program, but harder to understand when you want to configure things. You can't factor things then, because it's just a big data file. And if you do manage a lot of factoring, you won't be able to understand the file after a while without building some tool.

There are often some tricky things for a given installation, target, etc. and it's easier to fix this in the driver script than rebuilding the front end.

Dynamic scripting is more natural for handling installation issues than hard coding them in the compiler, even if you add support for reading some kind of external file.

Right now lots of people have to touch code in the same files for the driver, which is always a bad omen: an indicator of design flaws and a source of bugs.

What will happen is that over time, people will chip away at this problem, and in the end you will have some half-baked scripting language inside of the driver that does exactly the subset of Python needed for my driver.

Clang should be a C++/C front end and that's it.

Let some natural scripting language worry about gluing the other pieces together.

My 2c.

Reed
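As a rough illustration of the "factoring" argument above, here is a minimal Python sketch in which per-target settings are derived by small helper functions instead of being listed path by path. It is not taken from the alternate-clang-driver source; every name, path, and version number below is hypothetical.

```python
def gcc_lib_dirs(sysroot, triple, gcc_version):
    """Derive library search directories from three values.

    The directory layout is an assumption for this example; a real
    installation may differ, which is exactly the case a script can adapt to.
    """
    base = f"{sysroot}/usr/lib/gcc/{triple}/{gcc_version}"
    return [base, f"{sysroot}/usr/lib", f"{sysroot}/lib"]


def make_target(triple, sysroot, gcc_version, extra_flags=()):
    """Build one target description from the shared helpers above."""
    return {
        "triple": triple,
        "include_dirs": [f"{sysroot}/usr/include"],
        "lib_dirs": gcc_lib_dirs(sysroot, triple, gcc_version),
        "flags": list(extra_flags),
    }


# Hypothetical targets: two MIPS variants that differ only in triple and
# endianness flag, so each needs a single line rather than a copied block.
TARGETS = {
    "mips-le": make_target("mipsel-linux-gnu", "/opt/mips/sysroot", "4.4.6", ["-EL"]),
    "mips-be": make_target("mips-linux-gnu", "/opt/mips/sysroot", "4.4.6", ["-EB"]),
}
```

The same information in a flat data file would repeat the sysroot and GCC version in every path, which is the maintenance cost the message above is pointing at.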

Hi Reed,

I am fully behind your line of thinking, but why not be even one step more "radical" than your first step:

XML (or JSON or YAML - some variant of a tree with properties as text - never mind which, as long as it is processable) + a scripting language which transforms the registry into tool invocation parameters.

This means just a few lines of important scripting code which everyone can modify (in specific cases) and an instant understanding of the required data model (because one look at the bundled XML/JSON variants per distribution will be enough for the (even average) developer to realize the actual requirements).

I am also a long-time Linux/Python user, but my feeling is that the best choice for Clang at the moment (because there are other aspects too) is LuaJIT.

Kind Regards,
Alek
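The "registry plus a few lines of script" idea above could look roughly like the following Python sketch. The JSON schema, paths, and function names are all assumptions made for illustration; only the -ccc-host-triple spelling comes from earlier in this thread.

```python
import json

# Hypothetical registry: one entry per target triple.
REGISTRY_JSON = """
{
  "arm-none-linux-gnueabi": {
    "include_dirs": ["/opt/arm/sysroot/usr/include"],
    "lib_dirs": ["/opt/arm/sysroot/usr/lib"],
    "linker": "/opt/arm/bin/arm-none-linux-gnueabi-ld"
  }
}
"""


def compile_args(registry, triple, source, output):
    """Turn a registry entry into an argument list for a compile step."""
    entry = registry[triple]
    args = ["clang", "-ccc-host-triple", triple, "-c", source, "-o", output]
    for directory in entry["include_dirs"]:
        args += ["-isystem", directory]
    return args


def link_args(registry, triple, objects, output):
    """Turn a registry entry into an argument list for a link step."""
    entry = registry[triple]
    args = [entry["linker"], "-o", output]
    for directory in entry["lib_dirs"]:
        args.append("-L" + directory)
    return args + list(objects)


registry = json.loads(REGISTRY_JSON)
```

The resulting lists could be handed to subprocess.run; a distribution would only need to ship a different JSON file.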

Hi,

Seeing as everyone's putting in their 2 cents, here's mine.

The problem is that the current Clang driver is not extensible enough, or not easily extensible enough. One can argue that a driver doesn't belong in Clang, but that's really arguing semantics, because the Driver, while living under the clang tree, is detached from the rest of Clang and invokes it as subprocesses.

My opinion is that the driver should be either:

  * Pure C++/TableGen with pretty much everything declaratively defined and just some C++ glue. OR
  * Pure C++, reads some sort of configuration file.

The latter allows for distros to more easily adapt Clang without (a) rebuilding it and (b) shoving patches to support their weird directory structure on us.

I do *not* think that launching an external scripting language is best for two reasons.

Firstly it is slower than pure C++. ("Oh but LuaJIT is fast!", "Oh but Python is fast!", "The driver time doesn't matter!" - it does. Clang has been built around build speed and to clobber all that effort because of laziness in the driver isn't an option IMHO. And interpreters, even LuaJIT, aren't that fast to boot).

Secondly, they create an extra dependency, which is bad in and of itself IMHO but, worse, causes real difficulty in the bringup of new, native toolchains. You'd have to somehow cross-compile LuaJIT for your new architecture before you could run a hosted compiler. This is a terrible idea.

Reed, to argue that all solutions would be isomorphic to yours is the same as arguing that Python and C are both Turing-complete and so there is no difference in using one over the other.

Cheers,

James

I don't buy the "C++ is faster than Python" argument. It's just a driver for a compiler! You could write it in Turing machine primitives and it would be super fast on a modern computer. It's not computing the strongly connected components of a terabyte-sized graph.

I think you will have some kind of scripting component; whether it's Lua or Python or some hand-brewed language format that is read by the clang driver and then interpreted, that is what you will have when you finish solving this problem. That is what I meant by all solutions being isomorphic.

Reed

I don't buy the "C++ is faster than Python" argument.

Well it is, and with such a small script you pay the large fixed cost of loading the CPython interpreter for almost no runtime.

The fact that the Python runtime can boot quicker than you can notice on a modern computer doesn't make it fast; it makes it not *noticeably slow* by the purist definition. It would slow down large build systems, and for lots of small compiles it may significantly increase the build time and load.

I don't buy the "C++ is faster than Python" argument. It's just a driver
for a compiler! You could write it in Turing machine primitives and it
would be super fast on a modern computer. It's not computing the
strongly connected components of a terabyte-sized graph.

The issue is start-up time. It takes longer to launch the Python process than it does for the entire compilation and code generation process to happen on small C files at -O0.

I think you will have some kind of scripting component; whether it's Lua
or Python or some hand-brewed language format that is read by the clang
driver and then interpreted, that is what you will have when you finish
solving this problem. That is what I meant by all solutions being
isomorphic.

I disagree. The number of things that different targets need is relatively limited. The vast majority can get away with specifying default include paths, crt*.o locations, ld / as locations, and target triple. If that solves 99% of cases, then it's worth doing that and leaving some external driver for the more complex weird cases.

Adding a dependency on Python (or Lua, or whatever other buzzword scripting language you favour this week) for invoking an [Objective-]C[++] compiler seems to redefine overkill.

David
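The start-up cost discussed above is easy to measure on your own machine rather than argue about. The following is a minimal Python sketch, with a hypothetical helper name, that reports the median wall-clock time to launch a command and wait for it to exit; it makes no claim about what numbers you will see.

```python
import subprocess
import sys
import time


def median_startup_seconds(command, runs=5):
    """Median wall-clock time to start `command` and wait for it to exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


if __name__ == "__main__":
    # An interpreter that does nothing: this is the fixed cost a Python
    # driver would pay on every compiler invocation.
    startup = median_startup_seconds([sys.executable, "-c", "pass"])
    print(f"python startup: {startup * 1000:.1f} ms")
```

Passing a command such as a clang invocation on a small C file at -O0 to the same helper gives the figure to compare against. Repeated runs also include the effect of OS caching, since only the first launch is cold.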

It's going to be cached by the OS after the first time it's loaded.

I completely agree. The startup time for Python was a huge problem for the *first* clang driver, which happened to be written in Python.

FWIW, Reed contacted me about this work back in May. I encouraged him to contribute to improving the main Clang driver, and he wasn't interested. It's perfectly fine for him to go off and do something different, but this work clearly isn't interesting to mainline clang development. I don't see why we're still discussing it :)

-Chris

Sorry, I thought that (at least) this thread:

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-June/015563.html

implies that an embedded scripting language will be included in clang in the future ...

Ok, but what does that have to do with Reed's driver?

-Chris

I didn't say that I'm not interested in fixing this in mainline clang.

I needed something right away and it was not practical to try and sort out all of this in mainline clang in a timely fashion.

It's not a small patch.

We need some way to handle the scripting issues that come with installation.

Clang and LLVM have replaced the historical practice of building a different gcc compiler for each configuration and produce a single compiler, but they have not replaced much of what "configure" does in gcc.

There does not seem to be any consensus among the people on this list, even as to the approach.

So I'm not interested in spending a lot of time making a 5k line or so patch that is likely to be rejected.

I did my driver and it solved all my problems and is very easy to maintain and understand.
As I refine this code I continue to understand the problems associated with this better.

I think all the same ideas could be implemented in the mainline clang.

This scripting language needs to be able to do many of the same things that Perl and Python do.

Yes, we could build our own and I could design something lightweight that fits in clang.

But before I would do that I would need to know that my hard work will not get tossed out.

At this point I think it's better for me to continue my prototype in Python.

Reed