I am trying to implement the lifetime marker insertion for unnamed temporaries in order to reduce stack consumption.
Inserting the lifetime.start marker is the easy part; inserting the lifetime.end marker, on the other hand, is more complicated, as it involves scoping, exception handling and lifetime extension.
Ideally, the lifetime.end marker should be inserted when the scope is exited, after the cleanups have been run. The problem is that in some cases, for example an object with a trivial destructor, there is … no cleanup. The existence of cleanups appears in the AST as an ExprWithCleanups, which wraps the underlying MaterializeTemporaryExpr.
This is used in CodeGenFunction::EmitLValue to start a new RunCleanupsScope when visiting an ExprWithCleanups. My problem is how to add a RunCleanupsScope for MaterializeTemporaryExpr nodes that are not part of an ExprWithCleanups. I cannot always add a new scope, because it would mess up the cleanup ordering: when there are cleanups, the lifetime.end marker should be emitted after the destructor.
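To make the two cases concrete, here is a minimal sketch. The type and function names (Trivial, NonTrivial, use, trivial_case, nontrivial_case) are hypothetical and chosen only for illustration. In both functions an unnamed temporary is materialized to bind a reference parameter, but only the second one has a destructor to run at the end of the full-expression. Which AST nodes Clang builds for each can be checked with `clang -Xclang -ast-dump -fsyntax-only`; the sketch itself only demonstrates the language-level difference.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical types, for illustration only.
struct Trivial {
    char buf[256];              // large enough to matter for stack usage
};

struct NonTrivial {
    char buf[256];
    static int destroyed;       // counts destructor calls
    ~NonTrivial() { ++destroyed; }
};
int NonTrivial::destroyed = 0;

static_assert(std::is_trivially_destructible<Trivial>::value,
              "Trivial needs no cleanup");
static_assert(!std::is_trivially_destructible<NonTrivial>::value,
              "NonTrivial needs a cleanup");

std::size_t use(const Trivial &t) { return sizeof t.buf; }
std::size_t use(const NonTrivial &n) { return sizeof n.buf; }

std::size_t trivial_case() {
    // Unnamed temporary with a trivial destructor: no destructor call is
    // emitted, so there is no cleanup to which a lifetime.end could be tied.
    return use(Trivial{});
}

std::size_t nontrivial_case() {
    // Unnamed temporary with a destructor: it runs at the end of the
    // full-expression, and the lifetime.end marker belongs after it.
    return use(NonTrivial{});
}
```

Calling nontrivial_case() once increments NonTrivial::destroyed by exactly one, whereas trivial_case() leaves no observable trace of the temporary's end of life; that second case is the one that needs a scope of its own.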
Is there anything obvious I am missing?
Any hints would be welcome.
Can’t this affect debugging as well?
I mean, if you run this at O1 or lower, you’ll have a lot more variables optimised away, especially at the end of the function, and at higher optimisation levels you’ll have far fewer variables to care about.
This would apply to unnamed temporaries, which presumably have no debug
info and can't be examined in a debugger.
As to his actual question, I have no idea. Lifetime of C++ temporaries is
subtle. I always have to defer to Richard and John. The static analysis
guys are having a similar pile of trouble figuring out where to insert
calls to C++ destructors in the CFG.
I am clearly not a specialist in debug info. It will certainly put some pressure on the debug info generator at opt level >= O1, as the stack slots will be reused, so it needs to place the info at the right place and the right time.
You have fewer live variables at any given point in the function, but this assumes you have handled the dead variables correctly…
By the way, my patch did indeed affect some debug info (some breakpoint locations, if I remember correctly). I have temporarily switched to something else, but this patch proved to be extremely complex, a complexity linked to what you can find in the thread about temporary destructors. Something is going wrong in how the scopes are handled, and my patch triggers it. I am probably breaking some undocumented or implicit assumptions. But there is definitely a lot of potential to reduce stack usage, and we saw that on real code.
In theory, you could track liveness of non-temp variables and re-use
their space for temps after they're dead, which would interfere with
debugging. But you're right, for everything else, it should be fine.
This might be a bad idea, but is it possible to start with a C-only
implementation, and move on with C++ later? At least we can get the
general implementation right, and then only fiddle with exception
handling when we need to.
I do not think this can apply to "C-only", as C has no way to express unnamed temporaries (language lawyers may contradict me here).
In C++, on the other hand, you can have lots of those unnamed temporaries.
A possible path, along the lines of what you suggest, would be to activate lifetime markers only in the non-throwing cases (or when compiled with -fno-exceptions). The exceptional part could come later.
We could at least get some of the benefits now.
That's what I was thinking, yes.
If you label the feature as "experimental" and only turn it on with a
specific flag, it could be in an incomplete (though correct) state for
a while before being made official.
I do not think this can apply to “C-only”, as C has no way to express unnamed temporaries (language lawyers may contradict me here).
Write a C function that returns a large struct by value and another with that type as a parameter, and call the latter with the result of the former?
struct big source(void);
void sink(struct big);
That should get you some anonymous stack usage in C whose use you can optimize.
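Spelled out, the case could look like the sketch below. The layout of struct big, the function bodies, the sink_total counter and the caller function are all assumptions added to make the two declarations above compilable; the code stays within the common subset of C and C++. Each call to source() produces an unnamed by-value result that is then passed to sink(), so a caller with several such calls is where slot reuse could pay off.

```cpp
/* Hypothetical layout: anything large enough to matter on the stack. */
struct big {
    char bytes[4096];
};

int sink_total = 0;             /* hypothetical, makes the calls observable */

struct big source(void) {
    struct big b = {0};         /* zero-initialize the whole struct */
    b.bytes[0] = 1;
    return b;                   /* returned by value */
}

void sink(struct big b) {
    sink_total += b.bytes[0];   /* parameter received by value */
}

void caller(void) {
    /* Each source() result is an unnamed temporary that is dead as soon
       as the enclosing sink() call returns, so the two calls could in
       principle share one stack slot. */
    sink(source());
    sink(source());
}
```

Whether the two temporaries actually end up in separate slots depends on the target ABI and the optimisation level, so this is only a starting point for measuring.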
Yes, that’s the only case I could think of. I do not know if this would be used often though.