linux build fix

Hi everyone!

Decided to give hacking on clang a try, as I'm very much interested in
compilers and languages and so forth. I started out by trying to
compile a small C program of mine that exercises a good chunk of the
language.

Various GNU/Linux headers include the stddef.h header, which is
in /usr/include/linux, which isn't in the system header search path.
So, as a "hello everyone" gift, here's my first patch for the project.
Clearly it's the most awesome patch ever, what with it being a whole
single line in a part of a file marked as "need's to be replaced." :slight_smile:

Next up is to figure out why both time_t and size_t result in a variety
of comical errors such as:

///usr/include/linux/time.h:10:2: error: type name requires a specifier
or qualifier
        time_t tv_sec; /* seconds */
        ^

Out of the pages and pages of errors, almost all of them are either
errors from time_t or size_t, so I figure fixing whatever that is will
get me pretty close to compiling this little program.

I'm also rather tempted to go fix whatever is causing that extra // at
the start of the file path in the errors. :slight_smile:

- Sean

linux.diff (481 Bytes)

Hi everyone!

Decided to give hacking on clang a try, as I'm very much interested in
compilers and languages and so forth. I started out by trying to
compile a small C program of mine that exercises a good chunk of the
language.

Cool!

Various GNU/Linux headers include the stddef.h header, which is
in /usr/include/linux, which isn't in the system header search path.
So, as a "hello everyone" gift, here's my first patch for the project.
Clearly it's the most awesome patch ever, what with it being a whole
single line in a part of a file marked as "need's to be replaced." :slight_smile:

Are you sure that this is the problem? Can you please run "gcc -v test.c -c" on something? on my linux box, I get:

#include "..." search starts here:
#include <...> search starts here:
  /usr/local/include
  /usr/lib/gcc/i586-mandriva-linux-gnu/4.0.1/include
  /usr/include
End of search list.

do you see /usr/include/linux?

Next up is to figure out why both time_t and size_t result in a variety
of comical errors such as:

///usr/include/linux/time.h:10:2: error: type name requires a specifier
or qualifier
       time_t tv_sec; /* seconds */
       ^

Out of the pages and pages of errors, almost all of them are either
errors from time_t or size_t, so I figure fixing whatever that is will
get me pretty close to compiling this little program.

If you send a .i file (produced with clang -E test.c > test.i), I can take a look.

I'm also rather tempted to go fix whatever is causing that extra // at
the start of the file path in the errors. :slight_smile:

:slight_smile: Go for it!

-Chris

Indeed, it would seem so. I have a simple patch I was just about to
send to the list. Maybe you've already tackled it, but in case not,
here you go. :slight_smile:

pathfix.diff (629 Bytes)

> Various GNU/Linux headers include the stddef.h header, which is
> in /usr/include/linux, which isn't in the system header search path.
> So, as a "hello everyone" gift, here's my first patch for the project.
> Clearly it's the most awesome patch ever, what with it being a whole
> single line in a part of a file marked as "need's to be replaced." :slight_smile:

Are you sure that this is the problem? Can you please run "gcc -v
test.c -c" on something? on my linux box, I get:

#include "..." search starts here:
#include <...> search starts here:
  /usr/local/include
  /usr/lib/gcc/i586-mandriva-linux-gnu/4.0.1/include
  /usr/include
End of search list.

do you see /usr/include/linux?

I have close to the same. Looking around, I found that all three of the
headers that are missing are in the /usr/lib/gcc/... path. I'm not
entirely clear on what that means, but I've got a pretty strong
suspicion. I'm pretty sure clang isn't meant to depend on gcc, so are
those headers that need to be supplied with clang?

If so, I'm assuming new ones have to be written from scratch due to the
license these files carry in GCC. (GPL v2+, with exceptions for
binaries compiled by GCC.) Correct?

> Next up is to figure out why both time_t and size_t result in a
> variety
> of comical errors such as:
>
> ///usr/include/linux/time.h:10:2: error: type name requires a
> specifier
> or qualifier
> time_t tv_sec; /* seconds */
> ^
>
> Out of the pages and pages of errors, almost all of them are either
> errors from time_t or size_t, so I figure fixing whatever that is will
> get me pretty close to compiling this little program.

If you send a .i file (produced with clang -E test.c > test.i), I can
take a look.

Well that would be no fun for me, now, would it? :wink: Oh well, attached.

Here are the few errors it prints out when I run the command, as well:

elanthis@stargrazer:~/Source/clc$ ../llvm/Debug/bin/clang -E clc.c >
clc.i
In file included from clc.c:20:
In file included from /usr/include/ncurses.h:140:
In file included from /usr/include/stdio.h:75:
/usr/include/libio.h:53:11: error: 'stdarg.h' file not found
# include <stdarg.h>
          ^
In file included from clc.c:20:
/usr/include/ncurses.h:142:10: error: 'stdarg.h' file not found
#include <stdarg.h> /* we need va_list */
         ^
/usr/include/ncurses.h:175:10: error: 'stdbool.h' file not found
#include <stdbool.h>
         ^
3 diagnostics generated.

Just the include headers issue I mentioned above, as stdarg.h and
stdbool.h don't have copies in /usr/include/linux like stddef.h does.

clc.i (99.5 KB)

Ah, never mind - that /usr/include/linux patch of mine broke it. :slight_smile: It
was no longer pulling in the correct time.h header, among others. I
fixed the problems with stddef.h and friends by adding
the /usr/lib/gcc... path to the include path for now. Now I'm getting a
fun-looking segfault on compilation, which is a vastly more interesting
problem to tackle. :slight_smile:
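
For readers following along, here is a minimal sketch of why the extra search directory changed which time.h was picked up. It assumes the usual rule that an `#include <...>` lookup walks the search directories in order and takes the first match. The function name `find_header` and its signature are made up for illustration and are not clang's actual lookup code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not clang code): returns the index of the first
 * directory in `dirs` that contains a readable file called `name`, or -1
 * if no directory does. On success the full path is left in `out`. */
int find_header(const char *const *dirs, int ndirs, const char *name,
                char *out, size_t outsz)
{
    assert(dirs != NULL && name != NULL && out != NULL && outsz > 0);

    for (int i = 0; i < ndirs; ++i) {
        int n = snprintf(out, outsz, "%s/%s", dirs[i], name);
        if (n < 0 || (size_t)n >= outsz)
            continue;           /* path does not fit in the buffer; skip it */

        FILE *f = fopen(out, "r");
        if (f != NULL) {
            fclose(f);
            return i;           /* first match wins; later dirs are ignored */
        }
    }

    out[0] = '\0';
    return -1;
}
```

With a list such as { "/usr/include/linux", "/usr/include" }, a lookup of "time.h" would stop at /usr/include/linux/time.h and never reach /usr/include/time.h, which matches the path in the error output earlier in the thread.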

Still getting the hang of this codebase and the finer details of C
compilers, sorry.

- Sean

If you send a .i file (produced with clang -E test.c > test.i), I can
take a look.

Ah, never mind - that /usr/include/linux patch of mine broke it. :slight_smile: It
was no longer pulling in the correct time.h header, among others. I
fixed the problems with stddef.h and friends by adding
the /usr/lib/gcc... path to the include path for now.

Yep, you cannot add the /usr/include/linux include path to clang or it won't work. I made exactly the same mistake when I started using clang (because I didn't read the install instructions) :stuck_out_tongue:

Now I'm getting a
fun-looking segfault on compilation, which is a vastly more interesting
problem to tackle. :slight_smile:

clang automatically prints a stack trace, which should be enough to debug the problem in most cases. If you don't find the bug (that's normal, as you don't know the code yet), send the code that triggers the segfault here.

Still getting the hang of this codebase and the finer details of C
compilers, sorry.

From my experience, clang is very easy to work with and has the gentlest learning curve I've
ever seen (not that I have worked with many other compilers, though).

Nuno

Indeed, it would seem so. I have a simple patch I was just about to
send to the list. Maybe you've already tackled it, but in case not,
here you go. :slight_smile:

Applied, thanks!
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20071203/003224.html

-Chris

#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/i586-mandriva-linux-gnu/4.0.1/include
/usr/include
End of search list.

do you see /usr/include/linux?

I have close to the same. Looking around, I found that all three of the
headers that are missing are in the /usr/lib/gcc/... path. I'm not
entirely clear on what that means, but I've got a pretty strong
suspicion. I'm pretty sure clang isn't meant to depend on gcc, so are
those headers that need to be supplied with clang?

Yep, clang currently depends on the GCC headers.

If so, I'm assuming new ones have to be written from scratch due to the
license these files carry in GCC. (GPL v2+, with exceptions for
binaries compiled by GCC.) Correct?

Yep, in the fullness of time, we will tackle that. For now, we're just assuming everyone has GCC, and we're glomming onto the headers they already provide.

-Chris

> If so, I'm assuming new ones have to be written from scratch due to
> the
> license these files carry in GCC. (GPL v2+, with exceptions for
> binaries compiled by GCC.) Correct?

Yep, in the fullness of time, we will tackle that. For now, we're
just assuming everyone has GCC, and we're glomming onto the headers
they already provide.

Are replacement headers something the project would like at this time?
If nothing else, it'll give me an excuse to dig into the C standard to a
far greater extent than I ever have. :wink:

I shudder to think of having to get the standard C++ headers in place...

If so, I'm assuming new ones have to be written from scratch due to
the
license these files carry in GCC. (GPL v2+, with exceptions for
binaries compiled by GCC.) Correct?

Yep, in the fullness of time, we will tackle that. For now, we're
just assuming everyone has GCC, and we're glomming onto the headers
they already provide.

Are replacement headers something the project would like at this time?
If nothing else, it'll give me an excuse to dig into the C standard to a
far greater extent than I ever have. :wink:

Yes, that would be very useful. The GCC includes have a whole lot of "stuff", much of which is target specific (e.g. SSE/altivec headers). In order to incrementally deploy this, we can set clang up to search its header directory before the GCC header directory (which is hacked into clang right now). For example, we could replace *just* iso646.h (which is trivial) while leaving xmmintrin.h alone.
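
As a rough idea of how small a target-independent header like iso646.h is, here is a sketch of what a replacement could contain. The eleven macro names and their spellings come from the C standard. The guard name `SKETCH_ISO646_H` and the helper functions are made up for illustration, and this is not the header clang ships.

```c
#include <assert.h>

/* Sketch of a target-independent iso646.h replacement. */
#ifndef SKETCH_ISO646_H
#define SKETCH_ISO646_H

#ifndef __cplusplus   /* in C++ these spellings are already operators */
#define and    &&
#define and_eq &=
#define bitand &
#define bitor  |
#define compl  ~
#define not    !
#define not_eq !=
#define or     ||
#define or_eq  |=
#define xor    ^
#define xor_eq ^=
#endif

#endif /* SKETCH_ISO646_H */

/* Hypothetical helpers that exercise a few of the macros. */
int in_range(int x, int lo, int hi)
{
    return x >= lo and x <= hi;
}

unsigned clear_bits(unsigned v, unsigned mask)
{
    return v bitand compl mask;
}

unsigned flip_bits(unsigned v, unsigned mask)
{
    v xor_eq mask;
    return v;
}

void iso646_self_check(void)
{
    assert(in_range(5, 1, 10));
    assert(not in_range(11, 1, 10));
    assert(clear_bits(0xFFu, 0x0Fu) == 0xF0u);
}
```

Nothing in it depends on the target, which is why it is a natural first header to replace while leaving files like xmmintrin.h to the GCC directory.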

One big thing I really dislike about the GCC headers is that they are target-specific. I think clang is a great opportunity to finally get some of this stuff right vs GCC. Some headers (such as iso646.h) are target independent, and simple enough to do. In other stuff like limits.h GCC has mostly the right idea. They basically have it boil down to stuff like:

#define CHAR_BIT __CHAR_BIT__
#define SCHAR_MAX __SCHAR_MAX__
#define SCHAR_MIN (-SCHAR_MAX - 1)

#if __SCHAR_MAX__ == __INT_MAX__
# define UCHAR_MAX (SCHAR_MAX * 2U + 1U)
#else
# define UCHAR_MAX (SCHAR_MAX * 2 + 1)
#endif

etc. The nice thing about this is that the header itself is target-independent, being derived from the builtin macros (like __CHAR_BIT__) that get dumped into the preprocessor when the compiler starts up.
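
A small sketch of that idea, for anyone who wants to try it: the `MY_`-prefixed names are made up so they do not clash with the real limits.h, and the block assumes a GCC-compatible compiler that predefines `__CHAR_BIT__`, `__SCHAR_MAX__` and `__INT_MAX__`. It compares the derived values against whatever the system limits.h provides.

```c
#include <assert.h>
#include <limits.h>

/* Derived purely from compiler-predefined macros, as in the excerpt above.
 * The MY_ prefix is only there to avoid colliding with <limits.h>. */
#define MY_CHAR_BIT  __CHAR_BIT__
#define MY_SCHAR_MAX __SCHAR_MAX__
#define MY_SCHAR_MIN (-MY_SCHAR_MAX - 1)

#if __SCHAR_MAX__ == __INT_MAX__
# define MY_UCHAR_MAX (MY_SCHAR_MAX * 2U + 1U)
#else
# define MY_UCHAR_MAX (MY_SCHAR_MAX * 2 + 1)
#endif

/* Compile-time check that the derived values agree with the system header. */
static_assert(MY_CHAR_BIT == CHAR_BIT, "derived CHAR_BIT differs");
static_assert(MY_SCHAR_MAX == SCHAR_MAX, "derived SCHAR_MAX differs");

/* Run-time version of the same comparison, returning 1 on agreement. */
int derived_limits_match_system(void)
{
    return MY_CHAR_BIT == CHAR_BIT
        && MY_SCHAR_MAX == SCHAR_MAX
        && MY_SCHAR_MIN == SCHAR_MIN
        && MY_UCHAR_MAX == UCHAR_MAX;
}
```

The header text is the same on every target; only the values the compiler injects for the predefined macros change.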

The problem with this approach is that it requires dumping a ton of macros into the compiler when it starts up, which is suboptimal. Instead of having the current grab-bag of pre-defined macros, I'd like to move to a more consistent set of extension points. Specifically, I think we should extend the grammar to support a new builtin, and use this to query for properties of the machine. For example, we could use:

#define CHAR_BIT __builtin_config_info("target_char_bit")
#define SCHAR_MAX __builtin_config_info("target_schar_max")
...

This should be parsed as a "builtin" builtin like __builtin_type_compatible_p, which has its own parsing logic and builds its own explicit AST. The nice thing about this is that it preserves some amount of target parameterization in the AST, reduces the amount of stuff we have to slam into the macro table at startup, reduces pressure on the identifier table, and is nicely extensible to other things in the future.

Getting this right will require updating the code to be able to handle __builtin_config_info as an [integer] constant expression, handle its use in the preprocessor conditional, etc.
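
To make the requirement concrete, here is a sketch of the kinds of places where CHAR_BIT is used today and where a builtin-backed definition would therefore have to keep working. It uses the ordinary limits.h definition, since `__builtin_config_info` is only a proposal in this thread. The helper names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Context 1: preprocessor conditional. */
#if CHAR_BIT >= 8
# define BYTES_FOR_BITS(n) (((n) + CHAR_BIT - 1) / CHAR_BIT)
#else
# error "C requires CHAR_BIT to be at least 8"
#endif

/* Context 2: array bound at file scope (must be an integer constant). */
static unsigned char bit_flags[CHAR_BIT];

/* Context 3: static assertion. */
static_assert(CHAR_BIT >= 8, "CHAR_BIT must be at least 8");

/* Context 4: enumerator value. */
enum { BITS_PER_INT = CHAR_BIT * (int)sizeof(int) };

/* Context 5: case labels. Returns how many chars a width spans, or 0
 * if the width is neither one nor two chars. */
int chars_for_width(int bits)
{
    switch (bits) {
    case CHAR_BIT:     return 1;
    case 2 * CHAR_BIT: return 2;
    default:           return 0;
    }
}

size_t flag_count(void)
{
    return sizeof bit_flags / sizeof bit_flags[0];
}
```

Each of these contexts rejects anything that is not an integer constant expression, and the `#if` case is evaluated by the preprocessor before the parser ever sees the code.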

The 'risk' to this is that it will change the preprocessed output of the compiler vs GCC. For example, something silly like this will expand differently.

#define foo(x) # x

foo(CHAR_BIT);

However, anything that relies on that is dangerously non-conformant anyway, so I don't feel too bad about breaking it :slight_smile:

Looking forward, I think we should aim to have a single directory of headers for clang that are not "autoconfed". This means that arch-specific headers like xmmintrin.h need to be included in the directory of headers. We would just add something like '#ifndef __i386__ / #error "This is an i386-specific header" / #endif' to the top of the file. Having a single unified header directory makes it much easier for clang to support an arbitrary "--triple" option to control target selection at runtime, instead of only working for the arch it was configured for.

I shudder to think of having to get the standard C++ headers in place...

Heh, no worries, we'll just use libstdc++ or some other well known STL when the time comes.

-Chris

The problem with this approach is that it requires dumping a ton of
macros into the compiler when it starts up, which is suboptimal.
Instead of having the current grab-bag of pre-defined macros, I'd like
to move to a more consistent set of extension points. Specifically, I
think we should extend the grammar to support a new builtin, and use
this to query for properties of the machine. For example, we could use:

#define CHAR_BIT __builtin_config_info("target_char_bit")
#define SCHAR_MAX __builtin_config_info("target_schar_max")
...

This should be parsed as a "builtin" builtin like
__builtin_type_compatible_p, which has its own parsing logic and
builds its own explicit AST. The nice thing about this is that it
preserves some amount of target parameterization in the AST, reduces
the amount of stuff we have to slam into the macro table at startup,
reduces pressure on the identifier table, and is nicely extensible to
other things in the future.

I like it. Anything that removes the giant nest of #if tests and cutesy
math in headers is a good thing in my book, especially having dug
through those headers. Whether they're defined by a standard or not,
it's still nice to have clean, well-commented, easy-to-read headers
shipped with your development environment.

Getting this right will require updating the code to be able to handle
__builtin_config_info as an [integer] constant expression, handle its
use in the preprocessor conditional, etc.

The 'risk' to this is that it will change the preprocessed output of
the compiler vs GCC. For example, something silly like this will
expand differently.

#define foo(x) # x

foo(CHAR_BIT);

However, anything that relies on that is dangerously non-conformant
anyway, so I don't feel too bad about breaking it :slight_smile:

Hmm. Wouldn't it be better to have those kinds of builtins evaluated at
pre-processing time anyway, both to avoid special-casing support for it
in #if macros as well as making end-user debugging of pre-processed code
a tiny bit easier? Leaking compiler magic into pre-processed code might
not be the coolest idea.

Chris Lattner wrote:-

#define CHAR_BIT __builtin_config_info("target_char_bit")
#define SCHAR_MAX __builtin_config_info("target_schar_max")

Hmm, sounds familiar :slight_smile:

The 'risk' to this is that it will change the preprocessed output of
the compiler vs GCC. For example, something silly like this will
expand differently.

#define foo(x) # x

foo(CHAR_BIT);

However, anything that relies on that is dangerously non-conformant
anyway, so I don't feel too bad about breaking it :slight_smile:

? That always produces "CHAR_BIT". If you add an extra level of
indirection, so that it is actually expanded, presumably you and
GCC would get "8" each too. I don't see a problem here.
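
A quick sketch of the two cases, for reference. `MY_CHAR_BIT`, `STR` and `XSTR` are made-up names, and `__CHAR_BIT__` is assumed to be predefined as on GCC-compatible compilers. The operand of `#` is not macro-expanded, so the direct form yields the macro's name; going through one more macro expands the argument first.

```c
#include <assert.h>
#include <string.h>

#define MY_CHAR_BIT __CHAR_BIT__    /* stand-in for limits.h's CHAR_BIT */

#define STR(x)  # x                 /* stringifies the argument as written */
#define XSTR(x) STR(x)              /* expands the argument, then stringifies */

const char *stringified_name(void)
{
    return STR(MY_CHAR_BIT);        /* the operand of # is not expanded */
}

const char *stringified_value(void)
{
    return XSTR(MY_CHAR_BIT);       /* fully expanded before STR sees it */
}

void stringify_self_check(void)
{
    assert(strcmp(stringified_name(), "MY_CHAR_BIT") == 0);
    /* On targets with 8-bit chars the expanded form is the text "8". */
    assert(strlen(stringified_value()) > 0);
}
```

With a builtin-backed definition, only the second form would change its output, since it would produce the text of the builtin call rather than a number.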

Neil.

Chris Lattner wrote:-

#define CHAR_BIT __builtin_config_info("target_char_bit")
#define SCHAR_MAX __builtin_config_info("target_schar_max")

Hmm, sounds familiar :slight_smile:

Yep, we discussed this awhile back.

The 'risk' to this is that it will change the preprocessed output of
the compiler vs GCC. For example, something silly like this will
expand differently.

#define foo(x) # x

foo(CHAR_BIT);

However, anything that relies on that is dangerously non-conformant
anyway, so I don't feel too bad about breaking it :slight_smile:

? That always produces "CHAR_BIT". If you add an extra level of
indirection, so that it is actually expanded, presumably you and
GCC would get "8" each too. I don't see a problem here.

Oh right, duh. :slight_smile:

-Chris

This should be parsed as a "builtin" builtin like
__builtin_type_compatible_p, which has its own parsing logic and
builds its own explicit AST. The nice thing about this is that it
preserves some amount of target parameterization in the AST, reduces
the amount of stuff we have to slam into the macro table at startup,
reduces pressure on the identifier table, and is nicely extensible to
other things in the future.

I like it. Anything that removes the giant nest of #if tests and cutesy
math in headers is a good thing in my book, especially having dug
through those headers. Whether they're defined by a standard or not,
it's still nice to have clean, well-commented, easy-to-read headers
shipped with your development environment.

Yep, the other big win is that with one binary you'd be able to do:

clang foo.c -triple sparc-sun-solaris8
clang foo.c -triple i386-pc-linux-gnu

or whatever. This requires installing headers for all targets and making them work in place.

Hmm. Wouldn't it be better to have those kinds of builtins evaluated at
pre-processing time anyway, both to avoid special-casing support for it
in #if macros as well as making end-user debugging of pre-processed code
a tiny bit easier? Leaking compiler magic into pre-processed code might
not be the coolest idea.

As Neil mentioned, I don't think this is an issue.

-Chris