Clang build of PostgreSQL

Hi,

I apologise in advance if it is inappropriate to post a query like
this to a dev list, but I found no reasonable alternative.

I work on the PostgreSQL project, and was present at Chris Lattner's
talk on Clang/LLVM at FOSDEM earlier in the year. At that talk,
Postgres was specifically cited as an example of a medium-sized C
program that had seen large improvements in compile times when built
with Clang. Here are his slides:

http://www.scribd.com/doc/48921683/LLVM-Clang-Advancing-Compiler-Technology

I would really like to be able to reproduce some of Chris's work here,
to demonstrate to the PostgreSQL community.

Here is an example of arguments given to Clang (and gcc) by PG's build system:

/home/peter/build/Release/bin/clang -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wformat-security -fno-strict-aliasing -fwrapv -I../../../src/include
-D_GNU_SOURCE -c -o pg_db_role_setting.o pg_db_role_setting.c

At present, when I build Postgres using gcc (specifically, Fedora's
build of gcc 4.6), I see a build time of 2 minutes 1 second. For
Clang, built today from SVN tip, I see a build time of 3m23s. Clang
has been built with --enable-optimized and --disable-assertions in an
attempt to be objective.

A small number of TUs, particularly gram.c, seem to be real
bottlenecks for Clang.

Any advice on improving build times with Clang would be appreciated.

Thanks

It's worth noting that the graph from the presentation you're pointing
to is at -O0.

We generally find that the LLVM optimizers are faster than the gcc
ones... but if there's a specific file that's taking longer than you
expect to compile, please file a bug at http://llvm.org/bugs/ with
preprocessed source.

-Eli
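For anyone following along, a minimal sketch of producing the preprocessed source that a bug report needs. The function name `preprocess` and the output name `gram.i` are made up for illustration; `-E` (stop after preprocessing) and `-o` are standard in both clang and gcc. The include and define flags should match the ones the real build passes.

```shell
# Hypothetical helper: preprocess one translation unit for a bug report.
# Usage: preprocess COMPILER SOURCE OUTPUT [EXTRA_FLAGS...]
preprocess() {
    cc_bin="$1"; src="$2"; out="$3"
    shift 3
    # -E stops after the preprocessor, so OUTPUT holds the fully expanded
    # source with every #include inlined and every macro expanded.
    "$cc_bin" -E "$@" -o "$out" "$src"
}
```

With the flags from this thread, the call would look something like `preprocess clang gram.c gram.i -I. -I /home/peter/postgresql/src/include -D_GNU_SOURCE`.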

Those numbers are for gcc 4.2, and probably Apple's fork thereof. One would assume the gcc folks have made improvements in 4.6, which you used.

I do appreciate that, and that a different optimisation level has been
used, but I found the gap in compile times surprising, to the extent
that I suspect that something is wrong.

Regardless of its performance, I do really like Clang from a usability
perspective, FWIW.

Observe the differences in compile times of the file
/src/backend/parser/gram.c. The warning message that you see is a
known issue with flex-generated scanners (gram.y includes scan.c) that
upstream refuses to fix:

[peter@peter postgresql]$ time /home/peter/build/Release/bin/clang -O2
-Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-error -I. -I. -I
/home/peter/postgresql/src/include -D_GNU_SOURCE -c -o gram.o
/home/peter/postgresql/src/backend/parser/gram.c
In file included from gram.y:12939:
scan.c:16246:23: warning: unused variable 'yyg' [-Wunused-variable]
    struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be ...
                      ^
1 warning generated.

real 1m5.660s
user 1m5.344s
sys 0m0.083s
[peter@peter postgresql]$ time gcc -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wformat-security -fno-strict-aliasing -fwrapv -Wno-error -I. -I. -I
/home/peter/postgresql/src/include -D_GNU_SOURCE -c -o gram.o
/home/peter/postgresql/src/backend/parser/gram.c
In file included from gram.y:12939:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16246:23: warning: unused variable ‘yyg’ [-Wunused-variable]

real 0m2.800s
user 0m2.688s
sys 0m0.094s

That's obviously a difference way out of proportion to the total
difference in compile times.
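To narrow down where the time goes, the same file can be timed at several optimisation levels with both compilers. The following is only a sketch: `time_cmd` is a hypothetical helper, and it uses `date +%s`, so it resolves whole seconds, which is enough for a gap like 65s versus 3s.

```shell
# Hypothetical helper: run a command and report elapsed wall-clock seconds.
# Usage: time_cmd LABEL COMMAND [ARGS...]
time_cmd() {
    label="$1"
    shift
    start="$(date +%s)"
    # Send the command's own output to stderr so that stdout carries
    # only the one-line timing summary.
    "$@" 1>&2 && status=0 || status=$?
    end="$(date +%s)"
    echo "$label: $((end - start))s (exit $status)"
    return "$status"
}
```

For example, `for opt in -O0 -O1 -O2; do time_cmd "clang $opt" clang $opt -c -o gram.o gram.c; done` (plus the usual -I and -D flags) shows at which level the compile time jumps.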

Again, please file a bug at http://llvm.org/bugs/ . There's probably
something very specific going wrong that we can fix.

-Eli

Maybe precompiled headers are not being used when building with Clang? Clang does not automatically pick up the PCH from the first #include; you have to provide it on the command line.

Try reading that again... this is almost certainly not a frontend time issue.

-Eli

I've submitted a bug report, but as the preprocessed source file in
question is a whopping 2.6MB, Bugzilla won't let me add it as an
attachment.

It's actually very easy to re-create yourselves though - just git
clone postgresql and run ./configure CC=clang at tip.

Who should I e-mail the file to? I know the list will balk if I send
the file too...

Thanks

Does gzip or bzip2 help? Preprocessed source usually compresses
very well. Bugzilla's limit is 1MB if I remember correctly.
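A rough sketch of the compression step, assuming the preprocessed file is on disk. The function name `compress_for_upload` is hypothetical; `gzip -9 -c` writes a maximally compressed copy to stdout and leaves the original file in place.

```shell
# Hypothetical helper: gzip a copy of a file and print both sizes.
# Usage: compress_for_upload FILE
compress_for_upload() {
    src="$1"
    # -c writes to stdout, so the original file is left untouched.
    gzip -9 -c "$src" > "$src.gz" || return 1
    # Arithmetic expansion strips the padding some wc implementations add.
    orig_size=$(($(wc -c < "$src")))
    gz_size=$(($(wc -c < "$src.gz")))
    echo "$src: $orig_size bytes -> $src.gz: $gz_size bytes"
}
```

Running it on the preprocessed file (whatever name it was saved under) shows straight away whether the result fits under the attachment limit.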

Alright, compressed file has been attached to bug report.

Can you try building with -g0? On Linux, clang is pretty slow and
produces huge binaries due to http://llvm.org/PR7554 when debug
information is emitted.

Nico

Hi Nico,

I suppose that I could, but that doesn't seem like much of a solution to me...

Besides, even if that is generally the case, I think that the test
case that I've produced demonstrates a fairly massive performance
problem, which has been isolated to a single flag. Surely it isn't too
difficult to fix.

Ah, I hadn't seen the bug (it's
http://llvm.org/bugs/show_bug.cgi?id=10183 if someone else is
curious).

:-)