I'm using Clang 2.1 from Xcode 4.1 on Mac OS X 10.7, porting software
which was previously built with GCC to Clang. I'm using an obsolete
version because the software consists of libraries that my customers
need to code against, and they'll feel more secure if they have access
to a GCC that appears to correspond to the Clang I'm using.
Mostly, this has worked very well, with only a few problems. The
latest is interesting: I'm finding that 32-bit Clang code needs rather
more stack space than 32-bit GCC code did. This makes a difference to
the stack sizes I need to set up for POSIX threads. This isn't
a problem, and allocating 25% more stack space seems to handle it -
I haven't run all the testing yet.
I had been using 1MB stacks for 32-bit and 2MB for 64-bit, before I
found these needed to be increased. The scaling between 32-bit and
64-bit is, of course, variable according to the mixture of types in
local variables and the stack layouts the respective compilers chose.
The threads are definitely not confined to leaf functions, but execute
some pretty complex algorithms, so needing significant stack space is
expected. The default stack sizes on Mac OS X are too small for us,
but that is true on many other platforms.
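For readers unfamiliar with the mechanism, here is a minimal sketch of setting an explicit stack size for a POSIX thread. It assumes the 1MB/2MB baseline plus the 25% increase mentioned above; the names worker_stack_size, create_worker and example_body are hypothetical and not taken from the software being ported.

```c
#include <pthread.h>
#include <stddef.h>

/* Baseline from the message: 1MB for 32-bit, 2MB for 64-bit,
   plus the 25% extra that the Clang build appeared to need.
   Both results are multiples of 4096, which matters on Mac OS X,
   where the stack size must be a multiple of the page size. */
static size_t worker_stack_size(void)
{
    size_t base = (sizeof(void *) == 4) ? ((size_t)1 << 20) : ((size_t)2 << 20);
    return base + base / 4;
}

/* Hypothetical wrapper: create a thread with the stack size above.
   Returns 0 on success or a pthread error number. */
static int create_worker(pthread_t *thread, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, worker_stack_size());
    if (rc == 0)
        rc = pthread_create(thread, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Stand-in thread body used only for illustration. */
static void *example_body(void *arg)
{
    return arg;
}
```

A caller would use create_worker(&t, example_body, &data) in place of a bare pthread_create and join the thread as usual.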
I'm not finding that 64-bit code needs more stack space than it did
with GCC, but since I don't know very much about how much spare space
I had on the thread stacks with GCC, this isn't very surprising.
So this is not a complaint, or a bug report, but I would be interested
to know if Clang is deliberately free with stack space, or is meant not
to be, or anything else that is unexpected or interesting about the
behaviour I'm seeing.
Clang does not intentionally use more stack space than GCC. In fact, recent versions have specific optimizations to reduce stack usage (you get better cache utilization that way, etc). Apple Clang 2.1 is really old; you might just check against a 3.x release to see if it is doing better on your code. If not, please file a bug report.
Currently, the only way I have to check if I have enough stack space is
to run a suite of tests and see if they crash out. The stack sizes we
were using were based on doing that and rounding up. A way of monitoring
actual stack usage would be distinctly helpful, but I haven't been
getting very far with finding one today.
The nearest on Mac OS X that I can see - but haven't yet tried - is to
locate the individual tests (out of hundreds in a suite) that use the
most stack, modify the test harness to pause at them without tidying up,
and use vmmap -verbose to see how much stack is actually used. This is
rather clumsy, and since the code base is changing and new tests are
constantly being added, goes out-of-date quickly.
I can't find a monitoring tool that will tell me what the largest amount
of thread stack used by any thread in a process is; there are interactive
tools like vmmap, but they rely on you running them at just the right
time. I've worked out how to build a monitoring infrastructure to do the
job in my thread creation wrapper function, but it's rather hacky, and will
take me some time to implement.
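One common way to get a high-water mark from a thread creation wrapper is "stack painting": give the thread a stack filled with a known byte, then count how much of the fill survives after the thread exits. The sketch below is only an illustration of that general technique, not necessarily the infrastructure described above. It assumes the stack grows downward (true on x86), and the names painted_thread, painted_create, painted_join and touch_64k are all hypothetical. A caller-supplied stack has no guard page, and the result is an approximation: deep regions that are reserved but never written are not counted.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define STACK_FILL 0xA5   /* paint byte; any unlikely value will do */

typedef struct {
    pthread_t thread;
    unsigned char *stack;   /* lowest address of the painted stack */
    size_t size;            /* size in bytes, rounded up to whole pages */
} painted_thread;

/* Create a thread on a caller-owned, pre-painted stack.
   Returns 0 on success or an error number. */
static int painted_create(painted_thread *pt, size_t size,
                          void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    void *mem = NULL;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc;

    size = (size + page - 1) / page * page;
    rc = posix_memalign(&mem, page, size);
    if (rc != 0)
        return rc;
    memset(mem, STACK_FILL, size);

    rc = pthread_attr_init(&attr);
    if (rc == 0) {
        rc = pthread_attr_setstack(&attr, mem, size);
        if (rc == 0)
            rc = pthread_create(&pt->thread, &attr, fn, arg);
        pthread_attr_destroy(&attr);
    }
    if (rc != 0) {
        free(mem);
        return rc;
    }
    pt->stack = mem;
    pt->size = size;
    return 0;
}

/* Join the thread and return the approximate peak stack use in bytes:
   everything above the lowest byte that no longer holds the paint. */
static size_t painted_join(painted_thread *pt)
{
    size_t untouched = 0;

    pthread_join(pt->thread, NULL);
    while (untouched < pt->size && pt->stack[untouched] == STACK_FILL)
        untouched++;
    free(pt->stack);
    return pt->size - untouched;
}

/* Stand-in thread body that writes to a 64KB local buffer. */
static void *touch_64k(void *arg)
{
    volatile unsigned char buf[64 * 1024];
    size_t i;

    for (i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)i;
    return arg;
}
```

Recording the largest value returned by painted_join across a whole test run would give the per-process maximum without having to catch the process at the right moment with vmmap.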
Any suggestions welcomed - this appears to be a neglected area.
Well, I isolated some of the test cases that have stack overflows, and
found they were all making a significant number (25 or so) of recursive
calls to the same fairly large function.
So measuring the stack space consumed by one call of that function
seemed worth trying; it is easily approximated as the distance between
the addresses of the same local variable in successive recursive calls.
The results were quite interesting:
VS.2008sp1 on Windows: 2560 bytes 32-bit and 3520 bytes 64-bit. This
compiler has always been very frugal with stack space.
GCC 4.2 on Mac OS X 10.6: 5424 bytes 32-bit and 5664 bytes 64-bit.
Clang 2.1 on Mac OS X 10.7: 43,504 bytes 32-bit and 44,224 bytes 64-bit.
Yes, this really is the same code; there's no platform-specific code
in that function. Those measurements were taken with optimisation of
-O2; at -O1, the figures are 12,288 bytes 32-bit and 12,480 bytes
64-bit.
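The address-distance measurement used for these figures can be sketched as follows. This is a minimal illustration, assuming a downward- or upward-growing contiguous stack and a compiler that supports the GCC/Clang noinline attribute; recursive_work is a hypothetical stand-in for the real function, and its 2048-byte buffer stands in for the real locals.

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t previous_marker;   /* address seen in the enclosing call */
static size_t    frame_distance;    /* last measured distance in bytes */

/* Stand-in for the large recursive function. */
__attribute__((noinline))
static int recursive_work(int depth)
{
    volatile char scratch[2048];
    uintptr_t here = (uintptr_t)&scratch[0];
    int result;

    /* Distance between this call's local and the caller's copy of it. */
    if (previous_marker != 0)
        frame_distance = previous_marker > here ? previous_marker - here
                                                : here - previous_marker;
    previous_marker = here;

    scratch[0] = (char)depth;
    if (depth == 0)
        return scratch[0];
    result = recursive_work(depth - 1) + scratch[0];
    scratch[1] = (char)result;      /* work after the call: not a tail call */
    return result;
}

/* Run a short recursion and report the per-call stack cost in bytes. */
static size_t measure_frame_bytes(void)
{
    previous_marker = 0;
    frame_distance = 0;
    recursive_work(3);
    return frame_distance;
}
```

The value includes saved registers, the return address and any padding the compiler adds, so it is the per-call cost that matters for sizing a thread stack rather than the size of the locals alone.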
So yes, Clang 2.1 really does seem to have a problem with stack space
usage. This may force us to move up to a later version.
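As a rough cross-check of these figures against the stack sizes mentioned earlier, the sketch below multiplies a per-call frame size by a recursion depth. It assumes a depth of exactly 25, whereas the observed depth was only "25 or so", and it ignores every other frame on the stack; the function names are hypothetical.

```c
#include <stddef.h>

/* Approximate stack consumed by `depth` nested calls of one function. */
static size_t recursion_bytes(size_t frame_bytes, size_t depth)
{
    return frame_bytes * depth;
}

/* Nonzero if that recursion alone fits in a stack of stack_bytes. */
static int recursion_fits(size_t frame_bytes, size_t depth, size_t stack_bytes)
{
    return recursion_bytes(frame_bytes, depth) <= stack_bytes;
}
```

With the 32-bit Clang figure, 25 x 43,504 = 1,087,600 bytes, which is just over a 1MB stack (1,048,576 bytes) but within 1.25MB (1,310,720 bytes). The 32-bit GCC figure gives 25 x 5424 = 135,600 bytes, and the 64-bit Clang figure gives 25 x 44,224 = 1,105,600 bytes, which is well inside 2MB.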