Object in a try-catch block not being destroyed even after an exception

Dear LLVM Members,

The source code below, taken from the gcc test suite, reproduces the problem I am facing. When it is built with clang, the program calls abort() (highlighted).
My observations are as follows:

  1. The object (tmp) is not destroyed immediately after the exception is thrown. With clang, the object is destroyed at a later point.
  2. With gcc, the scope of the object created inside the try block is limited to that block. Once the exception is thrown, the object (tmp) is destroyed.
  3. I generated and compared the assembly files for both clang and gcc; the sequence of operations looks similar.
  4. I also tried the MS Visual Studio compiler, which gives a result similar to gcc.

As per my understanding of the C++ standard, the object has to be destroyed as soon as the exception is thrown.
Please let me know which behavior is correct and suggest the changes required.

Thanks & Regards,
Shivaprasad
shivahms@gmail.com

/* Source Code Start */
extern "C" void abort ();

int thrown;

int as;
struct a {
  a () { ++as; }
  ~a () { --as; if (thrown++ == 0) throw 42; }
};

int f (a const&) { return 1; }
int f (a const&, a const&) { return 1; }

int bs;
int as_sav;
struct b {
  b (...) { ++bs; }
  ~b () { --bs; as_sav = as; }
};

bool p;
void g()
{
  if (p) throw 42;
}

int main () {
  thrown = 0;
  try {
    b tmp(f (a(), a()));

    g();
  }
  catch (...) {}

  // We throw when the first a is destroyed, which should destroy b before
  // the other a.
  if (as_sav != 1)
    abort ();

  thrown = 0;
  try {
    b tmp(f (a()));

    g();
  }
  catch (...) {}

  if (bs != 0)
    abort ();
}
/* Source Code End */
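
A note for anyone reproducing this with a newer compiler: in C++11 and later, destructors are implicitly noexcept, so the throw in ~a() calls std::terminate unless the destructor is declared noexcept(false). Below is a minimal sketch of the first half of the test adapted under that assumption. It records the order of events instead of calling abort(). The names events and run_first_case are made up for illustration and are not part of the gcc test. The order of the destructor calls after the first ~a() is exactly what is in question here, so no expected output is claimed for them.

```cpp
#include <string>
#include <vector>

// Hypothetical event log; not part of the original gcc test.
std::vector<std::string> events;

int thrown = 0;

struct a {
  a() { events.push_back("a()"); }
  // noexcept(false) is needed from C++11 on; otherwise the throw terminates.
  ~a() noexcept(false) {
    events.push_back("~a()");
    if (thrown++ == 0) throw 42;
  }
};

int f(a const&, a const&) { return 1; }

struct b {
  b(...) { events.push_back("b()"); }
  ~b() { events.push_back("~b()"); }
};

// Runs the first try block of the test and reports whether the
// exception thrown by the first ~a() reached the handler.
bool run_first_case() {
  events.clear();
  thrown = 0;
  bool caught = false;
  try {
    b tmp(f(a(), a()));
  } catch (int) {
    caught = true;
  }
  return caught;
}
```

Calling run_first_case() and then printing events shows whether a given compiler runs ~b() before or after the second ~a().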

This looks specific to clang, not to LLVM, so moving to cfe-dev. [Source code below not elided for anyone on cfe-dev who isn't on llvmdev; I have more comments trailing, so scroll down].

What clang is generating can be approximated by the following pseudocode:

  atmp1, atmp2, btmp = alloca space for temporaries
  call a::a on atmp1
  try {
    call a::a on atmp2
    try {
      call f(atmp1, atmp2)
      call b::b on btmp
    } catch { } finally {
      call a::~a on atmp2
    }
  } catch { } finally {
    call a::~a on atmp1
  }
  // NOW b is "constructed"

Looking at the output IR, the best interpretation I can give is that Clang is cleaning up the temporary arguments before considering the new variable constructed. The question, of course, is whether or not this is legal per the C++ spec.

The relevant passages are C++11 section 12.2, par 3; section 1.9, par 10, which has an example stating that the full-expression associated with the call to the constructor in this case is indeed the actual constructor call; and section 15.2, par 1.

I'm not 100% sure what the correct interpretation is in this case. It comes down to whether b should be considered constructed before or after the full-expression of the declarator is completed.

b should be considered constructed as soon as its constructor completes. This is a bug in clang.

John.

b is constructed, since its constructor has finished.

However, the problem seems to be a bit more subtle. When we get to the
destructors at the end of the full-expression, the two 'a' temporaries
and 'tmp' have been constructed. In normal execution, we would then
destroy the 'a' temporaries in reverse construction order, and then
destroy the 'tmp' object at the end of its scope.

Now, GCC seems to believe that after the first '~a' call throws, we
should destroy 'tmp' and then destroy the 'a' temporary. Perhaps it's
trying to enforce reverse construction order ('tmp' was constructed
after the 'a' temporaries), per 15.2/1. However, that paragraph does
not directly apply, since temporaries don't have automatic storage
duration (see core issue 365). Clang believes that, after the first
'~a' throws, we destroy the second 'a' then destroy 'tmp'.

The relevant paragraph for the destruction of temporaries is 12.2/3:
"Temporary objects are destroyed as the last step in evaluating the
full-expression (1.9) that (lexically) contains the point where they
were created. This is true even if that evaluation ends in throwing an
exception. The value computations and side effects of destroying a
temporary object are associated only with the full-expression, not
with any specific subexpression."
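
To illustrate the sentence "This is true even if that evaluation ends in throwing an exception", here is a small sketch covering only the uncontroversial case where no destructor throws. The names Temp, thrower, trace and run are hypothetical. The temporary bound to the function's parameter is destroyed before the handler runs.

```cpp
#include <string>
#include <vector>

std::vector<std::string> trace;  // hypothetical log, for illustration only

struct Temp {
  Temp() { trace.push_back("Temp()"); }
  ~Temp() { trace.push_back("~Temp()"); }
};

// Throws while the caller's temporary is still alive.
void thrower(const Temp&) { throw 1; }

void run() {
  trace.clear();
  try {
    thrower(Temp());                 // the full-expression ends by throwing
    trace.push_back("not reached");  // skipped because thrower() throws
  } catch (int) {
    trace.push_back("handler");
  }
}
```

After run(), trace holds "Temp()", "~Temp()", "handler", in that order.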

So the question is, how are the "As control passes from a
throw-expression to a handler" from 15.2/1 and the "last step in
evaluating the full-expression" in 12.2/3 ordered? GCC acts as if
'control passes' before 'the last step in evaluating the
full-expression'. Clang acts as if 'the last step in evaluating the
full-expression' happens before 'control passes'. I'm not sure who is
right. Does GCC's revision control history for the test file provide
us with any clues?

I don't find the reasoning in this core issue convincing, and I agree
with Steve Adamczyk's comment that it's really just their object lifetime
that's a bit odd, not their storage duration.

I believe the intent of the standard is quite clearly to have the local
variable destroyed in reverse order of construction w.r.t. the temporaries,
much like a return value must be. Clang doesn't get either of these
right.

John.

> I don't find the reasoning in this core issue convincing, and I agree
> with Steve Adamczyk's comment that it's really just their object lifetime
> that's a bit odd, not their storage duration.

The problem, as I see it, is that the standard currently specifies
that they have static storage duration (because they are objects
"implicitly created by the implementation" (3.7/2), but they're not
"block-scope variables" (3.7.3/1)). Also, even if we were to imagine
they have automatic storage duration, then the GCC optimizations which
reuse their storage after their lifetime ends are non-conforming,
because "the storage for these entities lasts until the block in which
they are created exits" (3.7.3/1). I don't think anyone wants that.

> I believe the intent of the standard is quite clearly to have the local
> variable destroyed in reverse order of construction w.r.t. the temporaries,
> much like a return value must be.

That does not match my expectation, which is that in:

  S s(T(), T());

the T temporaries should *always* be destroyed before the S object is.
Having an inconsistency in destruction order (depending on whether one
of the ~T() calls throws) does not seem useful, and I don't see any
way to read the standard in which it is mandated. Indeed, consider the
following (credit to Chandler for the argument, any flaws in conveying
it are my own):

* The destructors for local variables are only invoked "As control
passes from a throw-expression to a handler".
* The evaluation of the throw-expression's full-expression is
sequenced-before the evaluation of the handler.
* Therefore the throw-expression and its temporaries are destroyed
before control passes.
* Therefore the temporaries are destroyed before the objects of
automatic storage duration are.

I would also like to put forward the hypothesis that the change
suggested would, in practice, only ever break code, because almost all
code is going to be set up to correctly handle the case where the
temporaries are destroyed before the local variable. Hence I'd like to
understand the motivation for GCC's behavior, in case there are cases
where it is important to destroy the T temporary after the S object is
destroyed.

To take this a step further, if the code looked like this:

{
S s(T(), T());
doSomethingElse();
}

…the S object should not be destroyed until doSomethingElse() finishes. The T objects, OTOH, will be destroyed at the end of the S object’s decl statement (or thereabouts).

But if I /remove/ the call to doSomethingElse(), suddenly the S object is destroyed first?

I like John’s interpretation, it’s very cute, but it doesn’t really match up with my actual expectations.
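
For reference, the normal-path order described above can be checked with a sketch like the following. The names order and run_block are hypothetical. Brace initialization is used because `S s(T(), T());` written literally in a block would parse as a function declaration. Nothing throws here, so this says nothing about the disputed exceptional path.

```cpp
#include <string>
#include <vector>

std::vector<std::string> order;  // hypothetical log, for illustration only

struct T {
  ~T() { order.push_back("~T"); }
};

struct S {
  S(const T&, const T&) { order.push_back("S"); }
  ~S() { order.push_back("~S"); }
};

void doSomethingElse() { order.push_back("doSomethingElse"); }

void run_block() {
  order.clear();
  {
    S s{T{}, T{}};      // both T temporaries die at the end of this declaration
    doSomethingElse();  // s is still alive here
  }                     // s is destroyed when the block exits
}
```

After run_block(), order holds "S", "~T", "~T", "doSomethingElse", "~S".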

Jordan

Along normal control flow, there's no debate that we always see this:
temp1.T();
temp2.T();
s(temp1, temp2); // assuming left-to-right evaluation for sake of argument
temp2.~T();
temp1.~T();
doSomethingElse(); // if present
s.~S();

Annotating this with exceptional edges, we would have:
temp1.T(); // resume
temp2.T(); // temp1.~T(); resume
s(temp1, temp2); // temp2.~T(); temp1.~T(); resume
temp2.~T(); // ??
temp1.~T(); // s.~S(); resume
doSomethingElse(); // s.~S(); resume
s.~S(); // resume

The presence of doSomethingElse does not affect the ordering here, and I’m not sure why you’re suggesting that my interpretation would make it so. The only question is what the ordering of destructors is out of temp2.~T().

I’ll respond to Richard in more detail.

John.

>> I don't find the reasoning in this core issue convincing, and I agree
>> with Steve Adamczyk's comment that it's really just their object lifetime
>> that's a bit odd, not their storage duration.
>
> The problem, as I see it, is that the standard currently specifies
> that they have static storage duration (because they are objects
> "implicitly created by the implementation" (3.7/2), but they're not
> "block-scope variables" (3.7.3/1)).

The choice of wording in [basic.stc] is not accidental: a temporary is an
object but not a variable. I do not believe [basic.stc.auto] is intended to be
exhaustive.

[class.temporary]p5 strongly suggests (but no, it does not state directly) that
the storage duration of a temporary varies by how it's used, e.g. being
increased when bound as the initializer of a reference of static duration.
Note the places which discuss objects "with the same storage duration" as
the temporary.

As an aside, the standard does not seem to require us to destroy a variable
of *static* storage duration if the destructor of a temporary required during
initialization throws — and yet I believe it does require us to repeat
initialization in this case. So really there's quite a bit of poor drafting in
this area.

>> I believe the intent of the standard is quite clearly to have the local
>> variable destroyed in reverse order of construction w.r.t. the temporaries,
>> much like a return value must be.
>
> That does not match my expectation, which is that in:
>
>   S s(T(), T());
>
> the T temporaries should *always* be destroyed before the S object is.
> Having an inconsistency in destruction order (depending on whether one
> of the ~T() calls throws) does not seem useful, and I don't see any
> way to read the standard in which it is mandated. Indeed, consider the
> following (credit to Chandler for the argument, any flaws in conveying
> it are my own):
>
> * The destructors for local variables are only invoked "As control
> passes from a throw-expression to a handler".
> * The evaluation of the throw-expression's full-expression is
> sequenced-before the evaluation of the handler.
> * Therefore the throw-expression and its temporaries are destroyed
> before control passes.
> * Therefore the temporaries are destroyed before the objects of
> automatic storage duration are.

Er. I am reading your argument as:
  end-throw-expression < begin-passing-control < destroy-locals < end-passing-control < begin-handler
  end-throw-expression < begin-handler
  (implicitly) destroy-throw-expression-temporaries < end-throw-expression
therefore
  end-throw-expression < begin-passing-control
  destroy-throw-expression-temporaries < begin-passing-control
  destroy-throw-expression-temporaries < destroy-locals

This is all nicely tautological, but it doesn't seem to argue for anything,
and it certainly doesn't contradict the proposition that, during unwind,
local objects are always destroyed in the reverse of the order in which
they were constructed.

It is also somewhat irrelevant because you cannot have a throw lexically
in the full-expression for an initializer and have that throw evaluated
after the completion of the variable's constructor.

Also, your premises are wrong, because the operand of a
throw-expression is not itself a full-expression, so the destruction of
"its temporaries" does not occur until unwinding begins, i.e. during
the passing of control from the throw-expression to the handler.
We are then clearly controlled by [class.temporary]p3, and [except.ctor]p3
(terminate on throw) applies.

> I would also like to put forward the hypothesis that the change
> suggested would, in practice, only ever break code, because almost all
> code is going to be set up to correctly handle the case where the
> temporaries are destroyed before the local variable. Hence I'd like to
> understand the motivation for GCC's behavior, in case there are cases
> where it is important to destroy the T temporary after the S object is
> destroyed.

This is equally true of return values: along the normal path, all the local
variables will be destroyed before the return value is, and yet the unwind
path is necessarily different.
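
The normal-path half of that statement can be sketched as follows. The names Local, Result, make, steps and run_return are hypothetical, and the comments assume the copy of the return value is elided (guaranteed since C++17, commonly done before that). The local is destroyed when make() returns, while the returned object lives on in the caller.

```cpp
#include <string>
#include <vector>

std::vector<std::string> steps;  // hypothetical log, for illustration only

struct Local {
  ~Local() { steps.push_back("~Local"); }
};

struct Result {
  ~Result() { steps.push_back("~Result"); }
};

Result make() {
  Local local;      // constructed before the return value
  return Result();  // the return value is constructed after local
}                   // normal path: local is destroyed here, the result is not

void run_return() {
  steps.clear();
  {
    Result r = make();
    steps.push_back("use result");
  }  // r, the return value, is destroyed last
}
```

On the normal path the local goes away before the return value does, even though it was constructed first, which is the opposite of what reverse construction order gives during unwinding.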

John.