RFC: Another go at a cross compiler config file.

A while back (2012) there were a few messages related to using YAML config files to set up how clang would build stuff, especially for cross compilers. My ELLCC project is entirely cross compilation focused, so today I decided to play around with the config file idea. Right now it only handles replacing a “-target foo” option with the options defined in the file foo in the resource/config directory, but I think it has potential for doing quite a bit more.
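
To make the idea concrete, here is a minimal sketch of the substitution in Python. It is not the ELLCC code. The names `load_config` and `expand_target` are made up for illustration, and the config file is assumed to hold plain command-line options rather than YAML so that the sketch needs nothing beyond the standard library.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def load_config(config_dir: Path, name: str) -> Optional[List[str]]:
    """Return the options stored in config_dir/name, or None if no such file.

    Assumption: the file holds plain command-line options. The prototype
    uses YAML; a flat option list keeps this sketch dependency-free.
    """
    path = config_dir / name
    if not path.is_file():
        return None
    # comments=True lets the file carry '#' comment lines.
    return shlex.split(path.read_text(), comments=True)


def expand_target(args: List[str], config_dir: Path) -> List[str]:
    """Replace each '-target NAME' pair with the options from NAME's config."""
    out: List[str] = []
    i = 0
    while i < len(args):
        if args[i] == "-target" and i + 1 < len(args):
            options = load_config(config_dir, args[i + 1])
            # No config file: keep the original pair so the driver's
            # built-in target handling still sees it.
            out.extend(options if options is not None else args[i:i + 2])
            i += 2
        else:
            out.append(args[i])
            i += 1
    return out
```

With a file named foo in the config directory, `expand_target(["-target", "foo", "-c", "a.c"], config_dir)` returns the options from foo followed by `-c a.c`. A target with no matching file is left alone.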

I describe it a little more on my blog at http://ellcc.org/blog/?p=11877

Any comments/criticisms would be greatly appreciated.


There aren’t a lot of details on your blog, but the basic idea sounds great. I’m looking forward to seeing patches for this!

Please don't overload the -target option like that; make it a
separate option instead.


I disagree. One of the problems with clang’s driver now is that we have such a long list of built-in targets. From the user’s point of view, whether the target settings come from a YAML file or from hardcoded logic in the driver should be an implementation detail. It would be great if we could default to specifying _all_ targets like this, and then choose which targets to build into the driver based solely on performance. (It will be a while before any mechanism like this can support all the features for some of the more complicated targets, so I’m not proposing that we completely switch to that approach now, but I would like to see us move in that direction.)

On that note, Richard, do you have any thoughts on how we could support building these YAML specifications into the driver? It seems like many of the targets that are currently hardcoded could be handled with this approach, but the compile-time cost of reading a separate YAML file would be unacceptable in some situations.


Whether such a transition is reasonably possible remains to be seen.
If you look at the target logic in the driver, you will see that an
estimated 70% of the complexity is related to Linux, maybe 10% each to
Darwinish systems and Windows, and the rest to all other targets
together. So while this can greatly reduce the maintenance cost and
overall code size for Linux, it creates a complexity regression for
other systems. That's a good enough reason for my request to keep it
optional. I'm not against this feature -- if it is well done, it can
improve the status quo. But I am against forced overhead for systems
where we don't need it.


Right. I was just about to mention this. Historically, it has been a
design requirement of the driver that it not pay the load time of a
separate file. A separate option for reading in a file with config data
might be useful in some cases, but I think we should see what the actual
use cases are here first :)


I agree that all drivers should not have to pay the overhead of reading a file for each compilation. I’ve updated my blog post with a bit more information. Here’s the gist of it:

I use the -target argument, or the prefix on the compiler name, to try to open the config file. If a file of that name isn’t found, the -target argument passes through unscathed and works as it does now. I use YAMLIO to read the configuration into a structure… …After some good discussion on the LLVM mailing list, I realized that with a simple registration process, statically initialized info structures could be registered by pre-existing drivers. This would obviously eliminate the need to read the config file for every compilation.

Thanks for the input!
-Rich
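
The lookup order described above could be sketched roughly as follows. This is an illustration in Python, not the prototype's code. The registry, the function names, and the `NAME-clang` naming pattern for the compiler-name prefix are all assumptions, and a configuration is modelled as a plain list of options rather than the YAMLIO structure.

```python
import os
from typing import Callable, Dict, List, Optional

# Hypothetical registry of configurations compiled into the driver.
_BUILTIN_CONFIGS: Dict[str, List[str]] = {}


def register_config(name: str, options: List[str]) -> None:
    """Register a statically known configuration under a target name."""
    _BUILTIN_CONFIGS[name] = list(options)


def target_from_program_name(program: str, base: str = "clang") -> Optional[str]:
    """Extract NAME from a compiler invoked as NAME-clang (assumed pattern)."""
    stem = os.path.basename(program)
    suffix = "-" + base
    if stem.endswith(suffix) and len(stem) > len(suffix):
        return stem[: -len(suffix)]
    return None


def resolve_config(
    name: str, read_file: Callable[[str], Optional[List[str]]]
) -> Optional[List[str]]:
    """Look up a configuration: registered entries first, then a file.

    Returns None when neither source knows the name, in which case the
    caller would let -target pass through to the existing driver logic.
    """
    if name in _BUILTIN_CONFIGS:
        # Registered configs never touch the file system.
        return list(_BUILTIN_CONFIGS[name])
    return read_file(name)
```

The point of checking the registry first is that a driver with its targets registered at start-up never pays for a file read, while an unregistered name can still be served from a config file.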

I've updated my blog with the current state of my compilation configuration prototype. http://ellcc.org/blog/?p=13246
Again, any comments/criticisms greatly appreciated.



Did this make it into review / commit? It's probably too late for 3.6 if it didn't, but I'd love to see it in 3.7. It would also be nice if this support could be something a bit more general in LLVM, rather than specifically in clang, so that other LLVM tools (e.g. llvm-objdump) could take advantage of a similar format for target description files if they need something more than just the triple argument.


Hi David,

When I posted that, there wasn't a lot of enthusiasm for the approach. I am using it, but people thought it would complicate the driver and didn't like the fact that a file might need to be read each time the compiler is invoked (it doesn't really require that a file be read, since the YAML can be compiled in, but anyway).

I still think the approach is useful, so after I saw your post I made another example that makes clang a Windows cross compiler using the MinGW linker and libraries. I've posted it here: http://ellcc.org/blog/?p=23077

I like the idea of making it more general and allowing it to be used for other LLVM utilities.


I probably wouldn't want it to be required, but reading one extra file is not likely to be a major performance issue when compiling C code, because (at least until modules are ubiquitous) you're likely to be reading a load of C files anyway. It seems a shame currently that we build clang, which is intrinsically a cross-compiler, but actually using it as such with existing build systems is often painful because certain CFLAGS must be inserted into every call at a specific place to make sure that it picks up the right things.

Once you have something that's ready for general review, I'd be happy to take a look.


David Chisnall wrote:

> but actually using it as such with existing build systems is often
> painful because certain CFLAGS must be inserted into every call at a
> specific place to make sure that it picks up the right things.

Note that CMake has very good support for using clang as a cross compiler
and will add the required flags.



Note also that any time I've tried using the Ubuntu-packaged clang as a
cross-compiler it has not worked. I talked to the package maintainer about
that at one point, and using clang as a cross compiler was out of scope for
his packages, so you have to avoid those packages.