Automake Notes (Long)

Folks,

I have completed the addition of automake makefiles to LLVM. All
libraries, tools, and runtime libs now build with automake. Note that
the automake support is still missing many things; right now it just
builds the basic software.

However, before I invest more time in it, I thought some comparison
would help us decide whether or not to adopt automake as the LLVM
standard. There are costs and benefits on both sides.

BUILD/CONFIGURE TIMES

After puzzling about the size of the executables and the build times,
I discovered (thanks Chris!) that I had compiled everything without
debug symbols in the automake version. So, here are some revisions to
the first version of this email.

The build times didn't change much (I guess I/O is cheap on my machine).
The new "Build With Automake" times are 20m28.672s (elapsed), 18m1.900s
(user) and 1m38.540s (system).

The real change is in the size of the executables. The new values,
while still smaller, are much more reasonable. Previously the automake
build and the existing build were using different compiler flags. The
results below are with the same flags (I double checked).

Sizes are in bytes; Pct is the automake size as a percentage of the
existing size.

Automake Existing Pct Name
16903982 46046545 37% analyze (see correction below)
73084123 77679274 94% bugpoint
17638401 19137945 92% extract
37945217 47578060 80% gccas
31870129 34163210 93% gccld
56967280 60263187 95% llc
48570878 52162647 93% lli
15040029 16435732 92% llvm-as
50580919 54185542 93% llvm-db
14306895 15667554 91% llvm-dis
69413397 73995210 94% opt

Sorry for the confusion.

Reid.

One more update. The Makefile.am for analyze was wrong: it wasn't
linking in some of the passes. The new size is 56951088, which is in
line with the other executables.

Also, I have now completed a run of projects/llvm-test/MultiSource with
the tools generated by automake. The only errors were for:

TEST (llc) 'sgefa' FAILED!
TEST (jit) 'sgefa' FAILED!
TEST (jit) 'make_dparser' FAILED!
TEST (llc) 'kc' FAILED!
TEST (jit) 'anagram' FAILED!
TEST (llc) 'mason' FAILED!
TEST (cbe) 'mason' FAILED!
TEST (jit) 'mason' FAILED!
TEST (jit) 'pcompress2' FAILED!
TEST (jit) 'make' FAILED!
TEST (cbe) 'timberwolfmc' FAILED!
TEST (jit) 'agrep' FAILED!

That isn't far off from what the nightly test produces these days.

Reid.

I'm re-thinking my penchant for automake. automake is great for many
standard applications that just need to get portable makefiles up and
running quickly. However, it turns out that LLVM is "different enough"
from a standard application that automake might not be the best choice.

Here are some of the problems I've run into:

1. There's no way to tell automake to build bytecode. Without a lot of
customization of automake itself, it just can't grok the fact that there
might be multiple ways to compile a C/C++ program, producing different
results and requiring different tools.

2. The entire llvm-test project would have to either be completely
rethought or stay the way it is; it just can't be easily automake'd
because it doesn't follow the automake pattern.

3. The llvm/test directory would require significant rework to get it to
automake correctly (via dejagnu).

4. There's no way to avoid listing all the sources for every library
(I've tried several alternatives with no good results).

5. There's no way to install bytecode libraries somewhere separately
from other libraries.

6. Creating a distribution tarball would require additional Makefile.am
files to be inserted in all the directories under llvm/include, the
traversal of which would cause additional overhead for every non-dist
target. That's 17 gratuitous "make" processes per build. There doesn't
seem to be a way around this.

7. I can't get automake to stop doing a double configure. That is, when
you configure, all the Makefiles are updated. You then go do a build,
and it thinks it needs to reconfigure again, so it does it twice. This
is just a waste of time (40 seconds on my machine).

8. automake's notion of building a library is very fixed. It basically
supports two things: libtool-built shared libraries (linking) and
ar-built libraries that are also run through ranlib. While it's possible
to override AR and LINK, there isn't a way to override RANLIB on a
per-target basis, so you can't build both a regular library (requiring
RANLIB=ranlib) and a bytecode library (requiring something like
RANLIB=true) in the same directory. We have in LLVM at least 4 ways to
build libraries: regular .a, pre-linked .o (a combination of .o files),
shared library, and bytecode library (see the sketch after this list).
While I figured out a hack to do pre-linked .o, it basically uses GNU
Make, not automake, to make it work and therefore breaks automake's
"make" portability.

9. There's probably a bunch of things I haven't run into yet.
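
To make point 8 concrete, here is a minimal GNU Make sketch of the kind
of per-target control we need. It is only an illustration: the variable
names (NativeObjs, BytecodeObjs), the target names, and the use of an ar
archive for the bytecode library are assumptions for this sketch, not
our actual rules (and recipe lines must start with a tab).

   # 1. Regular archive: ar followed by ranlib.
   libfoo.a: $(NativeObjs)
           $(AR) cru $@ $(NativeObjs)
           $(RANLIB) $@

   # 2. Pre-linked object: one relocatable .o combining all the objects.
   foo_combined.o: $(NativeObjs)
           $(LD) -r -o $@ $(NativeObjs)

   # 3. Shared library.
   libfoo.so: $(NativeObjs)
           $(CXX) -shared -o $@ $(NativeObjs)

   # 4. Bytecode archive: the same ar step, but ranlib must not run.
   libfoo.bca: $(BytecodeObjs)
           $(AR) cru $@ $(BytecodeObjs)

automake can express the first and third forms natively; it is mixing
the first and fourth in one directory that runs into the RANLIB problem
described above.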

SOOOOOOO ....

Instead of spending a bunch more time on trying to get automake to work,
I suggest we just fix the current makefile system to do what automake
can do. Specifically we need to:

1. Get dependency generation in a single pass, like automake. This will
give us about a 30% speedup on builds, and it takes away one of the main
reasons for moving to automake (a sketch follows this list). Estimate:
1-2 hours.

2. Get targets for building distributions and checking them. For this we
can look at what automake generates and just mimic it. Estimate: 1 day.

3. Get targets for install and uninstall working correctly and make
proper use of the install command (instead of relying on libtool) so
that installation goes faster. Estimate: 1 day.

4. Get "make check" to work by allowing any directory to have a set of
programs to be run that "check" that directory. These could be shell
scripts or whatever. Estimate: 1-2 days.

5. Integrate dejagnu/expect/other into "make check" so that our make
system can run tests that parse program output to categorize the result
as PASS/FAIL/XFAIL/XPASS. This would eliminate our dependence on qmtest
and make all input/output for the tests be plain text that is easily
editable (instead of in a database). Estimate: 1-2 weeks.
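
For item 1, the usual single-pass approach is to have the compiler write
a .d file as a side effect of each compile and then include those files,
so no separate "make depend" pass is needed. A minimal sketch, assuming
a GCC-style compiler that supports -MMD/-MP and a hypothetical Objects
variable listing the .o files (recipe lines must start with a tab):

   %.o: %.cpp
           $(CXX) $(CPPFLAGS) $(CXXFLAGS) -MMD -MP -c $< -o $@

   # Pull in whatever .d files already exist; the leading "-" makes GNU
   # Make ignore the ones that haven't been generated yet.
   -include $(Objects:.o=.d)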

I am, of course, soliciting feedback on this whole idea.

Reid.

Instead of spending a bunch more time on trying to get automake to work,
I suggest we just fix the current makefile system to do what automake
can do. Specifically we need to:

[snip]

I am, of course, soliciting feedback on this whole idea.

I would agree that, given the differences, it is better to improve the
current system to do what automake can do than to switch to automake and
teach automake how to do things that our current build system already
does (which in some cases, as you mention, may not be reasonable).

I am not sufficiently familiar with dependency generation, et al, to
comment on it in detail, but I would *love* a "make check" facility with
results listed in plain-text files rather than a database that required
running qmtest, logging into it via a web browser, and updating the
binary DB that way.

And if there are "make dist" and/or "make rpm" targets, so much
the better.

I'm re-thinking my penchant for automake. automake is great for many
standard applications that just need to get portable makefiles up and
running quickly. However, it turns out that LLVM is "different enough"
from a standard application that automake might not be the best choice.

I might jump in here to suggest that Boost.Build (http://boost.org/boost-build2)
might be a good fit.

Here are some of the problems I've run into:

1. There's no way to tell automake to build bytecode. Without a lot of
customization of automake itself, it just can't grok the fact that there
might be multiple ways to compile a C/C++ program, producing different
results and requiring different tools.

Boost.Build supports this very well. In particular, on a project at work I
can run:

   bjam toolset=gcc
   bjam toolset=nm
   bjam toolset=nmm

and the first will do a regular build, the second will compile for some
embedded processor, and the third will pass the sources via some annotation
tool and produce an annotated x86 binary.

Besides, "toolset=gcc" can become "toolset=msvc" on windows, and it will have
high chances of working.

2. The entire llvm-test project would have to either be completely
rethought or stay the way it is; it just can't be easily automake'd
because it doesn't follow the automake pattern.

3. The llvm/test directory would require significant rework to get it to
automake correctly (via dejagnu).

No comments on the two points above yet.

4. There's no way to avoid listing all the sources for every library
(I've tried several alternatives with no good results).

Easy: I already have a half-working setup (only the targets I use are
converted), and, for example,

   lib/Analysis/DataStructure/Jamfile

has nothing but:

   llvm-lib datastructure ;

5. There's no way to install bytecode libraries somewhere separately
from other libraries.

Should be possible.

6. Creating a distribution tarball would require additional Makefile.am
files to be inserted in all the directories under llvm/include, the
traversal of which would cause additional overhead for every non-dist
target. That's 17 gratuitous "make" processes per build. There doesn't
seem to be a way around this.

Why does a distribution tarball have anything to do with the build system?

7. I can't get automake to stop doing a double configure. That is, when
you configure, all the Makefiles are updated. You then go do a build,
and it thinks it needs to reconfigure again, so it does it twice. This
is just a waste of time (40 seconds on my machine).

Ehm.... I'd say that configure should be orthogonal to building. No?

8. automake's notion of building a library is very fixed. It basically
supports two things: libtool-built shared libraries (linking) and
ar-built libraries that are also run through ranlib. While it's possible
to override AR and LINK, there isn't a way to override RANLIB on a
per-target basis, so you can't build both a regular library (requiring
RANLIB=ranlib) and a bytecode library (requiring something like
RANLIB=true) in the same directory. We have in LLVM at least 4 ways to
build libraries: regular .a, pre-linked .o (a combination of .o files),
shared library, and bytecode library. While I figured out a hack to do
pre-linked .o, it basically uses GNU Make, not automake, to make it work
and therefore breaks automake's "make" portability.

I think given Boost.Build's "properties" (like "toolset" above), this should
be doable.

1. Get dependency generation in a single pass, like automake. This will
give us about a 30% speedup on builds, and it takes away one of the main
reasons for moving to automake. Estimate: 1-2 hours.

Well, Boost.Build is single pass already. Need to compare performance on LLVM,
though.

Basically, I can try to finish my attempt to add Boost.Build files to LLVM.
But I wonder if that makes sense? Is it already decided to just improve the
current system? Will anything with "Boost" in the name be rejected right away?

- Volodya

> I'm re-thinking my penchant for automake. automake is great for many
> standard applications that just need to get portable makefiles up and
> running quickly. However, it turns out that LLVM is "different enough"
> from a standard application that automake might not be the best choice.

I might jump in here to suggest that Boost.Build
(http://boost.org/boost-build2) might be a good fit.

> Here are some of the problems I've run into:
>
> 1. There's no way to tell automake to build bytecode. Without a lot of
> customization of automake itself, it just can't grok the fact that there
> might be multiple ways to compile a c/c++ program producing different
> results and requiring different tools.

Boost.Build supports this very well. In particular, on a project at work I
can run:

   bjam toolset=gcc
   bjam toolset=nm
   bjam toolset=nmm

and the first will do a regular build, the second will compile for some
embedded processor, and the third will pass the sources via some annotation
tool and produce an annotated x86 binary.

Besides, "toolset=gcc" can become "toolset=msvc" on windows, and it will
have high chances of working.

I don't think Reid was referring to the use of different compilers on
(possibly) different platforms. If you look at runtime/GC/SemiSpace:
those files are not compiled with the compiler used to build the rest of
the project (llc, lli, and so on). They use the llvm-gcc frontend to
compile and link those libraries into LLVM bytecode. The resulting files
end up in runtime/GC/SemiSpace/BytecodeObj.
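
For readers who haven't looked at that directory, the rules involved are
roughly like the following GNU Make sketch. LLVMGCC, BytecodeObjects,
the flags, and the library name are assumptions for illustration, not
the real runtime makefiles (recipe lines must start with a tab):

   # Compile with the LLVM front-end; its output is LLVM bytecode rather
   # than native object code.
   BytecodeObj/%.o: %.c
           $(LLVMGCC) -c $< -o $@

   # Collect the bytecode objects into the bytecode library.
   BytecodeObj/libgcsemispace.a: $(BytecodeObjects)
           $(AR) cru $@ $(BytecodeObjects)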

> 2. The entire llvm-test project would have to either be completely
> rethought or stay the way it is; it just can't be easily automake'd
> because it doesn't follow the automake pattern.
>
> 3. The llvm/test directory would require significant rework to get it to
> automake correctly (via dejagnu).

No comments on the two points above yet.

Any idea whether the Boost testing framework can apply to these two points?

> 4. There's no way to avoid listing all the sources for every library
> (I've tried several alternatives with no good results).

Easy: I already have a half-working setup (only the targets I use are
converted), and, for example,

   lib/Analysis/DataStructure/Jamfile

has nothing but:

   llvm-lib datastructure ;

> 5. There's no way to install bytecode libraries somewhere separately
> from other libraries.

Should be possible.

> 6. Creating a distribution tarball would require additional Makefile.am
> files to be inserted in all the directories under llvm/include, the
> traversal of which would cause additional overhead for every non-dist
> target. That's 17 gratuitous "make" processes per build. There doesn't
> seem to be a way around this.

Why does a distribution tarball have anything to do with the build system?

This is done in automake. Because the distribution consists of files processed
by automake, the "distcheck" target makes sure that those files are in the
distribution and the distribution tarball can build properly without needing
automake/autoconf.

> 7. I can't get automake to stop doing a double configure. That is, when
> you configure all the Makefiles are updated. You then go do a build and
> it thinks it needs to reconfigure again so it does it twice. This is
> just a waste of time (40 seconds on my machine).

Ehm.... I'd say that configure should be orthogonal to building. No?

I totally agree with this as well.

> 8. automake's notion of building a library is very fixed. It basically
> supports two things: libtool-built shared libraries (linking) and
> ar-built libraries that are also run through ranlib. While it's possible
> to override AR and LINK, there isn't a way to override RANLIB on a
> per-target basis, so you can't build both a regular library (requiring
> RANLIB=ranlib) and a bytecode library (requiring something like
> RANLIB=true) in the same directory. We have in LLVM at least 4 ways to
> build libraries: regular .a, pre-linked .o (a combination of .o files),
> shared library, and bytecode library. While I figured out a hack to do
> pre-linked .o, it basically uses GNU Make, not automake, to make it work
> and therefore breaks automake's "make" portability.

I think given Boost.Build's "properties" (like "toolset" above), this
should be doable.

> 1. Get dependency generation in a single pass, like automake. This will
> give us about a 30% speedup on builds, and it takes away one of the main
> reasons for moving to automake. Estimate: 1-2 hours.

Well, Boost.Build is single pass already. Need to compare performance on
LLVM, though.

Basically, I can try to finish my attempt to add Boost.Build files to LLVM.
But I wonder if that makes sense? Is it already decided to just improve the
current system?

I don't think it is set in stone to keep the current system. We are always
open to new ideas, and if they improve things we will most likely adopt them.
I would say you should proceed on finishing what you have, and we can test it
and see how it compares with the current system.

Just for clarification: the only requirement for Boost.Build is bjam itself,
right?

Is anything with "Boost" in name will be rejected right away?

Absolutely not! We were using one of the Boost libraries before, and we
replaced it because it was easy to do and it removed one external dependency
for us, not because it had "Boost" in the name :)

Alkis Evlogimenos wrote:
[snip]

Is anything with "Boost" in name will be rejected right away?

Absolutely not! We were using one of the Boost libraries before, and we replaced it because it was easy to do and it removed one external dependency for us, not because it had "Boost" in the name :)

Another consideration is that we try to limit the number of external tools needed to build LLVM. One difficulty with using a non-make build system is that users would need to download another build tool before building LLVM, and for some, that is too much work.

Either that, or we have to include the tool with the LLVM distribution and build it with gmake.

So whatever benefits we get from using another build system have to outweigh the inconvenience of an additional external dependency.

-- John T.

I agree with John, but assuming that boost.build was small enough to
include with the distro, I think that it might be a big win. Just being
able to support win32 and unix targets with a single build would be a VERY
nice thing...

-Chris

> Boost.Build supports this very well. In particular, on a project at work
> I can run:
>
> bjam toolset=gcc
> bjam toolset=nm
> bjam toolset=nmm
>
> and the first will do a regular build, the second will compile for some
> embedded processor, and the third will pass the sources via some
> annotation tool and produce an annotated x86 binary.
>
> Besides, "toolset=gcc" can become "toolset=msvc" on windows, and it will
> have high chances of working.

I don't think Reid was referring to the use of different compilers on
(possibly) different platforms. If you look at runtime/GC/SemiSpace:
those files are not compiled with the compiler used to build the rest of
the project (llc, lli, and so on). They use the llvm-gcc frontend to
compile and link those libraries into LLVM bytecode. The resulting files
end up in runtime/GC/SemiSpace/BytecodeObj.

Yes, that should work. That's exactly a different compiler: the sources must
be compiled with llvm-gcc, not with the regular gcc.

> > 2. The entire llvm-test project would have to either be completely
> > rethought or stay the way it is; it just can't be easily automake'd
> > because it doesn't follow the automake pattern.
> >
> > 3. The llvm/test directory would require significant rework to get it
> > to automake correctly (via dejagnu).
>
> No comments on the two points above yet.

Any idea whether the Boost testing framework can apply to these two points?

Not yet. The Boost framework allows you to run an executable, check
whether it failed, and gather the results in a nice table. It's hard to
say what hidden problems there are.

> > 6. Creating a distribution tarball would require additional Makefile.am
> > files to be inserted in all the directories under llvm/include, the
> > traversal of which would cause additional overhead for every non-dist
> > target. That's 17 gratuitous "make" processes per build. There doesn't
> > seem to be a way around this.
>
> Why does a distribution tarball have anything to do with the build system?

This is done in automake. Because the distribution consists of files
processed by automake, the "distcheck" target makes sure that those files
are in the distribution and the distribution tarball can build properly
without needing automake/autoconf.

Ah... should not be necessary with Boost.Build.

> Basically, I can try to finish my attempt to add Boost.Build files to
> LLVM. But I wonder if that makes sense? Is it already decided to just
> improve the current system?

I don't think it is set in stone to keep the current system. We are always
open to new ideas, and if they improve things we will most likely adopt
them. I would say you should proceed on finishing what you have, and we can
test it and see how it compares with the current system.

Thanks. I'll proceed then.

Just for clarification: the only requirement for Boost.Build is bjam
itself, right?

Right, plus the Boost.Build implementation files. But since they are in
an interpreted language, they pose far fewer problems. They could even
be added to the distribution.

> Will anything with "Boost" in the name be rejected right away?

Absolutely not! We were using one of the Boost libraries before, and we
replaced it because it was easy to do and it removed one external
dependency for us, not because it had "Boost" in the name :)

Ok, I was just making sure ;)

- Volodya

Ok then; again, I'll try to finish my attempt. If the results are good, we
can talk about handling the dependency.

- Volodya