Impending initialization rewrite

Over the next few days, I plan to rewrite much of the semantic analysis for initialization to clean up a variety of problems that have surfaced over the past few months.

Major issues to address:

  1) Redundancy: initialization is currently spread among several files (SemaDecl, SemaDeclCXX, SemaInit, SemaOverload) with a disturbing amount of redundancy. I'd like to pull all of this code together into a single place that more clearly implements the C and C++ initialization rules.
  2) Diagnostics: our diagnostics for initialization failures, overload resolution, etc. are currently rather poor. They use barbaric means to distinguish the different kinds of initialization and assignment (see the "const char *Flavor" argument to, e.g., Sema::PerformCopyInitialization), have wording that does backflips to make up for that fact, and fail to adequately track the entities that are being initialized (variable, field, base class, temporary due to cast, temporary due to implicit conversion, etc.)
  3) Correctness & ASTs: we've been patching up the initialization code and its corresponding ASTs for a while, but since the code is so spread out we end up fixing the same bugs in many places.
  4) Overloading: all of the initialization code has to be prepared to attempt a conversion without producing any diagnostics if it fails, for overloading purposes. To prepare for C++0x, this includes checking of { } initializer lists.
  5) Copy elision: audit all of the places where we can perform copy elision to ensure that the initialization code propagates that information into the ASTs.
  6) Temporaries: Make temporaries explicit, even when they are of POD or scalar type.
  7) Warnings: (possibly) implement new initialization warnings, such as -Wmissing-field-initializers.
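
As a rough illustration of item 2, tracking the initialized entity as structured data rather than a "Flavor" string might look like the sketch below. Every name in it (EntityKind, InitializedEntity, describe) is hypothetical and made up for this example; it is not existing Clang code, only one way the idea could be modeled.

```cpp
#include <string>

// Hypothetical: the entity kinds listed in item 2, as an enum instead of a
// free-form "const char *Flavor" string.
enum class EntityKind {
  Variable,
  Field,
  BaseClass,
  CastTemporary,
  ConversionTemporary
};

// Hypothetical record of what is being initialized.
struct InitializedEntity {
  EntityKind kind;
  std::string name; // left empty for temporaries
};

// Produce the noun phrase a diagnostic could use for this entity.
std::string describe(const InitializedEntity &entity) {
  switch (entity.kind) {
  case EntityKind::Variable:
    return "variable '" + entity.name + "'";
  case EntityKind::Field:
    return "field '" + entity.name + "'";
  case EntityKind::BaseClass:
    return "base class '" + entity.name + "'";
  case EntityKind::CastTemporary:
    return "temporary created by a cast";
  case EntityKind::ConversionTemporary:
    return "temporary created by an implicit conversion";
  }
  return "entity";
}
```

With something along these lines, the diagnostic wording can be chosen per entity kind instead of being assembled around a caller-supplied string.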

Interface:

The main initialization code will have three entry points:

  TryInitialization) determines whether an object or reference can be initialized from a given set of expressions, taking into account the kind of initialization (direct, copy, value, etc.), target type, source expression, etc. The result of this routine will be some indicator of success/failure and an InitializationSequence which, like ImplicitConversionSequence, will describe the steps needed to perform the initialization. TryInitialization will never emit diagnostics nor build any permanent ASTs.

  PerformInitialization) performs the initialization of an object or reference given an InitializationSequence, building appropriate ASTs and performing checking that had to be delayed because it isn't part of overloading (e.g., binding a non-const reference to a bit-field).

  DiagnoseInitialization) emits diagnostics for a failed initialization, where TryInitialization could not produce a valid initialization sequence. The key here is reusability: we want DiagnoseInitialization to be used both by the obvious clients (variable initialization, base/member initialization, casts) and also for overload resolution (e.g., to specify precisely why a given overload candidate failed to match).
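
The split between the three entry points can be sketched with a toy model. This is only an illustration under heavy assumptions: ToyType, InitKind, Step, Failure, and the function signatures are invented for the example, and the real InitializationSequence would carry far more information.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins; none of these are Clang types.
enum class ToyType { Int, Double, IntPtr };
enum class InitKind { Direct, Copy, Value };
enum class Step { Identity, IntToDouble, DoubleToInt, ZeroInitialize };
enum class Failure { None, NoConversion, WrongInitializerCount };

struct InitializationSequence {
  std::vector<Step> steps;
  Failure failure = Failure::None;
  bool failed() const { return failure != Failure::None; }
};

// Pure analysis: records steps or a failure, emits nothing, builds nothing.
InitializationSequence tryInitialization(InitKind kind, ToyType target,
                                         const std::vector<ToyType> &sources) {
  InitializationSequence seq;
  if (kind == InitKind::Value) {
    if (sources.empty())
      seq.steps.push_back(Step::ZeroInitialize);
    else
      seq.failure = Failure::WrongInitializerCount;
    return seq;
  }
  if (sources.size() != 1) {
    seq.failure = Failure::WrongInitializerCount;
    return seq;
  }
  ToyType source = sources.front();
  if (source == target)
    seq.steps.push_back(Step::Identity);
  else if (source == ToyType::Int && target == ToyType::Double)
    seq.steps.push_back(Step::IntToDouble);
  else if (source == ToyType::Double && target == ToyType::Int)
    seq.steps.push_back(Step::DoubleToInt);
  else
    seq.failure = Failure::NoConversion;
  return seq;
}

std::string stepName(Step step) {
  switch (step) {
  case Step::Identity: return "Identity";
  case Step::IntToDouble: return "IntToDouble";
  case Step::DoubleToInt: return "DoubleToInt";
  case Step::ZeroInitialize: return "ZeroInitialize";
  }
  return "Unknown";
}

// Builds a (string) stand-in for the initializer AST from a valid sequence.
std::string performInitialization(const InitializationSequence &seq) {
  assert(!seq.failed() && "only valid sequences can be performed");
  std::string ast = "init";
  for (Step step : seq.steps)
    ast = stepName(step) + "(" + ast + ")";
  return ast;
}

// Turns the recorded failure into a message; usable by any client.
std::string diagnoseInitialization(const InitializationSequence &seq) {
  switch (seq.failure) {
  case Failure::None: return "";
  case Failure::NoConversion:
    return "error: no viable conversion to the target type";
  case Failure::WrongInitializerCount:
    return "error: wrong number of initializers";
  }
  return "";
}
```

The point of the shape is that overload resolution could call only tryInitialization on each candidate, and the winner (or the failure) is handed to performInitialization or diagnoseInitialization afterwards.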

Any questions, comments, or advice would be appreciated!

  - Doug

Douglas Gregor wrote:

Interface:

The main initialization code will have three entry points:

  TryInitialization) determines whether an object or reference can be initialized from a given set of expressions, taking into account the kind of initialization (direct, copy, value, etc.), target type, source expression, etc. The result of this routine will be some indicator of success/failure and an InitializationSequence which, like ImplicitConversionSequence, will describe the steps needed to perform the initialization. TryInitialization will never emit diagnostics nor build any permanent ASTs.

  PerformInitialization) performs the initialization of an object or reference given an InitializationSequence, building appropriate ASTs and performing checking that had to be delayed because it isn't part of overloading (e.g., binding a non-const reference to a bit-field).

  DiagnoseInitialization) emits diagnostics for a failed initialization, where TryInitialization could not produce a valid initialization sequence. The key here is reusability: we want DiagnoseInitialization to be used both by the obvious clients (variable initialization, base/member initialization, casts) and also for overload resolution (e.g., to specify precisely why a given overload candidate failed to match).

Any questions, comments, or advice would be appreciated!

I love it.

I believe there are some comments in the static_cast code that address a
problem with diagnostics for the "explicit implicit conversion" case. We
should make sure that DiagnoseInitialization is suitable for this purpose.

Sebastian

I missed that consumer of initialization... I'll look there, too. Thanks!

  - Doug

Over the next few days, I plan to rewrite much of the semantic analysis for initialization to clean up a variety of problems that have surfaced over the past few months.

Generally sounds good.

  6) Temporaries: Make temporaries explicit, even when they are of POD or scalar type.

I'd suggest keeping track of what happens for PR5524 here.

Any questions, comments, or advice would be appreciated!

The only other relevant issue I can think of off the top of my head is
that a higher-quality replacement for Expr::isConstantInitializer
would be nice; perhaps it could work as a flag for the proposed
PerformInitialization.

-Eli

Douglas Gregor wrote:

I believe there are some comments in the static_cast code that address a
problem with diagnostics for the "explicit implicit conversion" case. We
should make sure that DiagnoseInitialization is suitable for this
purpose.

I missed that consumer of initialization... I'll look there, too. Thanks!

Good. Take particular note of the evil "CStyle" flag. It is relevant to
PerformInitialization, because it suppresses access checks in
derived-to-base conversions.
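
For reference, the language rule behind that flag can be seen in a small standalone example. The type and function names here are made up for illustration; the behavior shown is the standard C++ rule that a C-style cast may convert to an inaccessible base class.

```cpp
struct Base {
  int value = 7;
};

// Base is a private base, so it is inaccessible outside Derived.
struct Derived : private Base {};

int readThroughCStyleCast(Derived &d) {
  // A C-style cast performs the derived-to-base conversion even though
  // Base is inaccessible here; the access check is skipped.
  Base &b = (Base &)d;
  return b.value;
}

// By contrast, static_cast<Base &>(d) or a plain 'Base &b = d;' in the
// function above would be rejected because Base is inaccessible.
```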

Sebastian

This turns out to be a huge task. I'm attaching a completely-untested version of what I'd like to do, in case anyone wants to discuss the approach. Essentially, we're trying to capture everything in an InitializationSequence, whose initialization corresponds to initialization (har har) that can then be diagnosed (to emit any delayed diagnostics) or performed (to produce a complete initializer AST). If you've ever looked at ImplicitConversionSequence, it's like that... but provides much more information when there is a failure, is better encapsulated, and will subsume more of the initialization rules.

My plan is to switch one simple client of reference-initialization (AddInitializerToDecl when the VarDecl is a reference) over to this initialization logic and write tests to exercise all of the new code paths, tweak the diagnostics until I'm happy, etc. Then, I'll move all of the CheckReferenceInit callers over to this scheme, and so on.

  - Doug

initialization-rewrite-checkpoint-2.patch (78.7 KB)

Douglas Gregor wrote:

Essentially, we're trying to capture everything in an InitializationSequence, whose initialization corresponds to initialization (har har) that can then be diagnosed (to emit any delayed diagnostics) or performed (to produce a complete initializer AST). If you've ever looked at ImplicitConversionSequence, it's like that... but provides much more information when there is a failure, is better encapsulated, and will subsume more of the initialization rules.

It looks very nice. I like the approach.
What about zero-initialization? Does it still exist in C++0x?

My plan is to switch one simple client of reference-initialization (AddInitializerToDecl when the VarDecl is a reference) over to this initialization logic and write tests to exercise all of the new code paths, tweak the diagnostics until I'm happy, etc. Then, I'll move all of the CheckReferenceInit callers over to this scheme, and so on.

I suppose that makes sense. Alternatively you could try to move AddInitializerToDecl over completely, step by step, and then follow with the other places that do initialization. But I think your approach is better.

Sebastian

Douglas Gregor wrote:

Essentially, we're trying to capture everything in an InitializationSequence, whose initialization corresponds to initialization (har har) that can then be diagnosed (to emit any delayed diagnostics) or performed (to produce a complete initializer AST). If you've ever looked at ImplicitConversionSequence, it's like that... but provides much more information when there is a failure, is better encapsulated, and will subsume more of the initialization rules.

It looks very nice. I like the approach.
What about zero-initialization? Does it still exist in C++0x?

It happens as part of value initialization for non-class and POD types; I don't think it's ever directly "invoked" anywhere else in the language.
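
A small example of zero-initialization arriving via value initialization, for a scalar and for a POD class. The names Pod, valueInitializedInt, and valueInitializedPod are invented for this illustration.

```cpp
struct Pod {
  int a;
  double b;
};

// 'int()' value-initializes a scalar, which zero-initializes it.
int valueInitializedInt() { return int(); }

// 'Pod()' value-initializes a class with no user-provided constructor,
// which zero-initializes its members.
Pod valueInitializedPod() { return Pod(); }
```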

My plan is to switch one simple client of reference-initialization (AddInitializerToDecl when the VarDecl is a reference) over to this initialization logic and write tests to exercise all of the new code paths, tweak the diagnostics until I'm happy, etc. Then, I'll move all of the CheckReferenceInit callers over to this scheme, and so on.

I suppose that makes sense. Alternatively you could try to move AddInitializerToDecl over completely, step by step, and then follow with the other places that do initialization. But I think your approach is better.

I want to try to get nicer diagnostics than "candidate function here" for overload-resolution failures. I *think* I have the right interface for that, but I want to try it first with a very narrow application area (reference initialization) before I commit to this interface.
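
One way to picture the goal is a note per failed candidate that states the reason, rather than a bare "candidate function here". The sketch below is purely hypothetical: Candidate, failureReason, and candidateNotes are invented names, and the note wording is an assumption, not Clang's actual diagnostic text.

```cpp
#include <string>
#include <vector>

// Hypothetical: an overload candidate plus the reason its initialization
// failed (empty if the candidate was viable).
struct Candidate {
  std::string signature;
  std::string failureReason;
};

// Build one explanatory note for each candidate that failed to match.
std::vector<std::string> candidateNotes(const std::vector<Candidate> &candidates) {
  std::vector<std::string> notes;
  for (const Candidate &candidate : candidates) {
    if (candidate.failureReason.empty())
      continue; // viable candidates need no explanation
    notes.push_back("note: candidate '" + candidate.signature +
                    "' not viable: " + candidate.failureReason);
  }
  return notes;
}
```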

  - Doug