Headers & Libraries

LLVMers,

I’m running into a fair bit of confusion as I start to use LLVM to build my own compiler. The issues relate to what is in a given .a or .o file, why linking takes so long, getting LLVM header files to include correctly, and the lack of a viable “install” target. I’ll deal with each of these in turn:

For my own project, I’ve added an AC_CHECK_LIB line to check for libipo.a as a test that the LLVM libraries are available. To get this to work, I needed to use the last argument of AC_CHECK_LIB to specify the dependent libraries: I had to pass vmcore.o and -lsupport to the linker. Now, the question is, why isn’t vmcore.o a LIBRARY? I also note that among the objects generated by an LLVM compilation, there are numerous .o files in the $OBJDIR/lib/Debug directory. Is that by design? It certainly isn’t very friendly for users of LLVM.

After I got my configure script to detect the LLVM libraries correctly, I noted that the AC_CHECK_LIB test for libipo.a took longer than the entire rest of my configuration script (and I have a big one). It sat on the “checking…” line for about a minute. This is happening while configure is linking a one-line program against three LLVM libraries (-lipo vmcore.o -lsupport). What gives? This shouldn’t take so long! Am I linking this incorrectly?

Finally, I started to compile against LLVM source. I adjusted my build system to include -I$SRCDIR/include on the g++ command line. But, when I compile, I get:

In file included from /proj/work/llvm/include/Support/ilist:41,
/proj/work/llvm/include/Support/iterator:29:27: Config/config.h: No such file or directory
/proj/work/llvm/include/Support/iterator:47:2: #error "Need to have standard iterator to define bidirectional iterator!"
/proj/work/llvm/include/Support/iterator:68:2: #error "Need to have standard iterator to define forward iterator!"

And, from there the compile goes to hell. Note that “Support/iterator” and “Support/ilist” are included just fine from the -I$SRCDIR/include command line argument but “Config/config.h” isn’t found even though the Config and Support directories are in the same location! When Support/ilist includes Support/iterator, it uses “#include <Support/ilist>”. I believe the problem is that iterator:29 has:

#include "Support/support.h"

instead of:

#include <Support/support.h>

Furthermore, this raises another huge issue, which is segregation of the LLVM header files. The practice for many open source projects today is to place all the header files in a directory that identifies the project. For example, when you include a Xerces header file, you do so with #include <xercesc/XYZ/File.h>. Similarly for ICU, we use #include <unicode/ucstring.h>. The same is true of many other packages, mine included. Unfortunately, it is not true of LLVM. Every #include in LLVM should look like #include <llvm/Module/Header.h>, as in #include <llvm/Support/support.h>. This clearly identifies it for end users and ensures that support.h won’t get confused with some other “Support/support.h”, which would be disastrous.

On a related question, the current “install” target in the Makefile system is a no-op. It would be very useful if we had an install target that finalized libraries and then copied them and the headers to an installation location. In such a scenario, the user of LLVM only needs to worry about one directory for LLVM: the install directory. Right now, I need to know about the $SRCDIR to get headers and the $OBJDIR to get libraries. As I’ve said before, the install functionality comes for free with automake.

So, my question is, are these things by design? If so, what is the rationale and how would I avoid the compilation problem?

If these things aren’t by design, I’d like to open a bug against them to get it fixed. Before I can do that, however, we need to have a plan for the way things SHOULD build. I’m willing to go along with whatever scheme is comfortable for you as long as it ends up being possible for LLVM users to utilize LLVM correctly and easily.

I know you guys are all busy this week. Feel free to delay your answer. Funding’s more important than answering my questions :)

Reid.

Oops, I was a little hasty. The #include problem results from the need to specify both -I$OBJDIR/include and -I$SRCDIR/include on the compiler command line. This isn’t particularly friendly for users of LLVM working in separated directories. Although I have a workaround for this particular problem, the larger issues of installing headers/libraries and what goes in what library remain.

Reid
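
For readers hitting the same error, here is a minimal sketch of why both include roots are needed. It models a first-match search over the -I directories in C++. It assumes (the thread does not say so explicitly) that Config/config.h is a generated file living under $OBJDIR/include while Support/iterator lives under $SRCDIR/include. The function name resolveHeader, the in-memory file set, and the /src and /obj paths are hypothetical stand-ins, not part of LLVM.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a first-match header search.
// includeRoots:  directories passed with -I, in command-line order.
// existingFiles: stand-in for the file system (full paths that "exist").
// Returns the first root/header path that exists, or an empty string
// when no root contains the header.
std::string resolveHeader(const std::vector<std::string>& includeRoots,
                          const std::set<std::string>& existingFiles,
                          const std::string& header) {
    assert(!header.empty());
    for (const std::string& root : includeRoots) {
        const std::string candidate = root + "/" + header;
        if (existingFiles.count(candidate) != 0) {
            return candidate;  // first match wins, as with -I ordering
        }
    }
    return std::string();  // not found under any root
}
```

Given a file set containing "/src/include/Support/iterator" and "/obj/include/Config/config.h", a search with only "/src/include" as a root comes back empty for "Config/config.h". Adding "/obj/include" as a second root makes it resolve, which mirrors passing both -I flags.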

Just a comment on llvm headers. We currently use:

#include "llvm/Codegen/LiveVariables.h"

which causes an extra, unnecessary lookup compared to:

#include <llvm/Codegen/LiveVariables.h>

because the quoted form looks for the header file in the directory of the
including source file first, before looking at the rest of the include path.
Of course the header will never be there since the full path is specified.

#include "" should only be used when headers are specified using relative
paths. In our case the majority of header inclusions (if not all) use
relative paths so we may want to consider either converting all our #include
"" to #include <> or changing header file inclusions to use relative paths. I
don't see any advantages of one over the other but what we have today is not
strictly correct :)
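
The search-order difference described above can be sketched in C++. This is a simplified model of GCC-style behavior: a quoted include tries the including file's directory first and then the -I directories, while an angle include goes straight to the -I directories. The model ignores -iquote and the built-in system directories. candidatePaths, IncludeForm, and any paths passed to it are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Which spelling of the #include directive is being modelled.
enum class IncludeForm { Quoted, Angled };

// Returns the paths a preprocessor would try, in order, under the
// simplified model described in the lead-in.
// includerDir: directory of the file that contains the directive.
// includeDirs: directories given with -I, in command-line order.
std::vector<std::string> candidatePaths(IncludeForm form,
                                        const std::string& includerDir,
                                        const std::vector<std::string>& includeDirs,
                                        const std::string& header) {
    assert(!header.empty());
    std::vector<std::string> candidates;
    if (form == IncludeForm::Quoted) {
        // The extra lookup: relative to the including file's directory.
        candidates.push_back(includerDir + "/" + header);
    }
    for (const std::string& dir : includeDirs) {
        candidates.push_back(dir + "/" + header);
    }
    return candidates;
}
```

For a source file in a hypothetical /proj/llvm/lib/CodeGen with a single -I/proj/llvm/include, the quoted form yields two candidates for llvm/Codegen/LiveVariables.h and the angled form yields one. The first quoted candidate is the lookup that can never succeed.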

#include <> is for system headers:

While LLVM is slowly moving in that direction, we are not yet considered to
be required by the "system".

-Chris

Yes, they define "system" headers as files that declare interfaces to parts of
the OS (cpp info, section Header Files). Of course what is defined as OS is
not mentioned anywhere so you can define that as you like. Personally I
believe it is not just the kernel but also all packages installed in standard
directories. So what they call "system" headers are basically installed
headers and user headers are internal ones. In the context of llvm every
header that is under include is a "system" header (because when we write an
install target it will end up in /usr/include/llvm); otherwise it is a user
header.

I'm sorry, but at this point I don't see the value of making this change.
It would be a lot of work for no clear benefit. Even if/when llvm headers
get installed into /usr/include/llvm, you can still use "" to get them.
For most practical purposes, all <> vs "" do is change the order of the
search path. Am I missing something here, or is this just a matter of
personal preference?

-Chris

I couldn’t agree more :)

> So what they call "system" headers are basically installed
> headers and user headers are internal ones. In the context of llvm every
> header that is under include is a "system" header (because when we write an
> install target it will end up in /usr/include/llvm) otherwise it is a user
> header.

We don't want a user to compile a new copy of LLVM using the headers of
a previously-installed version of LLVM.

No, you’re not missing anything but it isn’t just a matter of taste. Installed headers should be included
with <> because it is “slightly” more efficient (the local compilation directory is not searched first). Both
will work. However, I think Alkis’ point is exactly right. LLVM headers (most of them) should be considered
system headers and #included with <>. LLVM also has some non-public headers. Those should be
#included with “”.

Let’s not make the amount of work involved here the issue. I’d do the whole thing myself because the
value I see in doing it is that I can clearly identify from a #include line whether it is a public or private
header file that is being included (based on whether <> or “” is used).

At the risk of being controversial, there is another topic in this area that I would bring up:

I find the separation of header files into the $SRCDIR/include directory bothersome. It is a real
pain to switch between $SRCDIR/include/llvm/… and $SRCDIR/lib/… What was the rationale in
doing this? Shouldn’t the header files be located in the same directory as the .cpp files? The
install target can sort out where they go when the package is installed.

Reid

Including with "" will not accomplish that any better than including with
<>, though. In both cases you will need to specify the correct -I flags to the
compiler.

I plead for both of you to hold this conversation until AFTER Saturday! :) :)

-Chris

Let's be clear about something here. There are two cases for #including
LLVM header files. The first case is when compiling LLVM itself. The
second case is when you're using a compiled/installed version of
LLVM. Life gets simpler when #including for both cases is identical.

This is what I did in my XPS project. All the source code is in a
directory structure like this:

xps/<module>/Code.h
xps/<module>/Code.cpp

There is no separation of headers and cpp files. When I specify my
makefiles, I indicate which headers are public (installable) and which
are private (not installed). When I compile XPS, I ALWAYS use:

#include <xps/<module>/Code.h>

The same form of #include is used when compiling something else that
uses XPS because the headers get installed in:

<installdir>/include/xps/<module>/Code.h.

For compiling XPS, all I need is -I$SRCDIR -I$OBJDIR

For using XPS, all I need is -I$INSTALLDIR/include

Note that if you are installing to INSTALLDIR=/usr or /usr/local, then
the single -I$INSTALLDIR/include works for EVERY installed package you
might want to use.

This arrangement has worked well and I'd advise LLVM to follow a similar
pattern to avoid end-user confusion (like mine) down the road.

> There is no separation of headers and cpp files. When I specify my
> makefiles, I indicate which headers are public (installable) and which
> are private (not installed).

IMO it is better to use "" to include internal header files that are never
installed, as is the case with all headers under lib. It makes a file
self-documenting in terms of headers being public or private, which in turn
helps when browsing the code as you know where to look for an included
header.

> This arrangement has worked well and I'd advise LLVM to follow a similar
> pattern to avoid end-user confusion (like mine) down the road.

As Chris said this is not of high priority right now. I think it would be nice
to have but I am not sure if everyone is convinced (or even anyone else in
the group). If/when we decide to go for it, it may be best if we open a bug
for it and let some contributor handle it, as there are much more important/
interesting issues we need to address and the cost/benefit ratio of this
issue is very low.

In any case, we will get more discussion on this on Sunday, so let's hold our
breath until then...

I agree wholeheartedly. When I said "no separation of headers and cpp",
I meant that XYZ.h and XYZ.cpp live in the same directory; the
XYZ.h isn't placed somewhere else in the directory structure.

Agreed. I'll volunteer to do the work but I won't bother if the result
is likely to be met with hostility :). I'd rather we all agree on a
plan, which might include doing nothing. If the plan is acceptable then
I don't mind "making it so".

Reid

I don't think any contributions to LLVM would be met with hostility. Unless
of course you start putting all curly braces on the next line or adopt
Hungarian notation ;)

But of course, I wouldn't spend your time implementing this until it's been
thoroughly discussed (after important paper deadlines of course...).

Thanks for your contributions Reid!

-Tanya Brethour