Heh, not really. We're just now getting the C++ standard fixed:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3664.html
Uh, ok 
See the paper for the high-level theory of what the standard allows
(once it goes in). The current plan in Clang is to follow that
paper's optimization strategy even in older standard modes, unless it
turns out to break a significant amount of existing code.
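To make the strategy concrete, here is a minimal sketch (an
illustration, not an example taken from the paper) of the kind of
elision it permits: the new/delete pair below is unobservable, so the
implementation may remove it entirely.

int seven() {
  int *p = new int(7); // N3664 permits eliding this allocation...
  int v = *p;
  delete p;            // ...along with the matching deallocation,
  return v;            // reducing the function to 'return 7;'.
}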
What could be broken? I'd think such source code would have to be
quite perverse...
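One classic case (a sketch, assuming a program that replaces the
global allocator, which is legal if unusual): such code can observe
whether the call was elided.

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

static unsigned long g_allocs = 0;

// A replaced global operator new; an elided new-expression never calls it.
void *operator new(std::size_t n) {
  ++g_allocs;
  if (void *p = std::malloc(n))
    return p;
  throw std::bad_alloc();
}

void operator delete(void *p) noexcept { std::free(p); }

int main() {
  delete new int(42);
  // May print 0 or 1, depending on whether the pair was elided.
  std::printf("allocations observed: %lu\n", g_allocs);
}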
We can do this, but it may cause an unacceptable growth in stack
usage. Hal Finkel has recently started work on an optimization pass
to do precisely this, and I'm helping to review it and get it into
the tree. I would expect Clang to start doing this soon for small,
constant-size allocations.
This applies to most common use cases, right? Most objects are
small…
Anyway, what will be the criterion for deciding which allocations to
lower to the stack and which not (e.g., what does "small" mean)?
The criteria have not yet been decided, and I think some experimentation will be necessary. In the current implementation, the conversion happens for all allocations provably less than 1024 bytes. However, we may want to cap the total size of converted mallocs in addition to, or instead of, the individual sizes. Maybe this cap should depend on how much stack memory is already being requested by the function. I'm certainly open to suggestion.
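To sketch how that threshold plays out (illustrative only;
presumably the pointer must also not escape the function):

void f() {
  char *buf = new char[256];  // provably under 1024 bytes and freed on
                              // every path: a conversion candidate
  // ... use buf locally ...
  delete[] buf;

  char *big = new char[4096]; // over the threshold: stays on the heap
  // ... use big ...
  delete[] big;
}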
2) Can clang/llvm identify a write to a piece of memory that is about
to be deleted as a dead store?
struct Class { int member; }; // minimal definition, for a self-contained example

void func(Class *obj) {
  obj->member = 42; // Is this store removed?
  delete obj;
}
Compiling this exact code with -O3, I see the store still there.
Huh. No, I bet we miss this. This is almost certainly just a missed
optimization that we should get.
Is there a call to the destructor after the store? If so, then we might not know that the destructor does not access the stored value.
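A sketch of the distinction: if the destructor reads the member, the
store is live; for a trivially destructible type it should be
removable.

#include <cstdio>

struct Noisy {
  int member;
  ~Noisy() { std::printf("%d\n", member); } // destructor reads the member
};

void g(Noisy *obj) {
  obj->member = 42; // not a dead store: the destructor observes it
  delete obj;
}

struct Plain {
  int member; // trivial destructor: nothing observes the store
};

void h(Plain *obj) {
  obj->member = 42; // should be removable as a dead store
  delete obj;
}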
However, note that optimizations of this form are performed almost
entirely by the LLVM optimization libraries that Clang is built on,
so the fix for this missed optimization will most likely be a change
to LLVM.
Yeah, of course. But what I don't know is: how does LLVM know, at the
IR level, that a call to _Znam allocates memory and is paired with a
call to _ZdaPv that frees it? Is that special knowledge about these
symbols, or some attribute attached to the external symbol
declaration?
Yes, LLVM has special knowledge of certain external symbols. In this case, the relevant code is in lib/Analysis/MemoryBuiltins.cpp (in LLVM). Although perhaps not obvious from the code, the symbol recognition depends on both the name and the presence of a special 'builtin' attribute. Clang adds this attribute in cases where the compiler is free to make assumptions about the semantics of the call.
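To make that concrete (the IR below is approximate, sketched from
memory rather than copied from a build):

// Given a plain new-expression:
int *make() { return new int; }
// Clang emits, roughly,
//   %call = call noalias i8* @_Znwm(i64 4) #N
// where attribute group #N contains 'builtin'. MemoryBuiltins.cpp treats
// the call as a known allocator only when the mangled name and that
// attribute are both present.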
For example, for this dead store, can you model in the IR that the
memory will be lost, or do you need a special case in the
dead-store-elimination pass? Also, is this C++-specific, or is there
some generic architecture for modeling dynamic allocation (I believe not)?
It is not C++-specific, although there may be some C++-specific knowledge in some of the function categories.
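For instance, the C allocator functions are covered by the same
machinery, so an unobservable pair like this sketch can be deleted
outright:

#include <cstdlib>

void no_effect() {
  void *p = std::malloc(16); // recognized allocator symbol
  std::free(p);              // the pair has no observable effect
}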
I ask this because in another project I'm writing an LLVM-based
compiler, and I'll want to apply those kinds of optimizations as
well. Maybe it would be worthwhile to use some intrinsics, lowered by
a front-end-specific pass, to model the dynamic allocation
primitives. What do you think?
We may end up with something like that at some point, but for now the builtin symbol recognition has been sufficient. FWIW, we do the same thing with other 'libc' functions; see, for example, lib/Transforms/Utils/SimplifyLibCalls.cpp. Some functions get special CodeGen support as well.
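As a simple illustration of the libc handling, a call like this is
folded to a constant because strlen is recognized by name:

#include <cstddef>
#include <cstring>

// Folds to 'return 5;' at any optimization level that runs the pass.
std::size_t len() { return std::strlen("hello"); }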
-Hal