Preliminary reinterpret_cast Sema patch

Hi,

For people who want to write test cases, or otherwise play with this, here's my preliminary patch for semantic analysis of reinterpret_cast. The patch also includes my previous work on const_cast, slightly updated.

The patch doesn't include reinterpret_cast test cases yet, but I've attached another file containing a number of commented (but not -verify-ready) test cases I wrote last night.

Sebastian

reinterpret_cast.patch (20 KB)

cxx-reinterpret-cast.cpp (2.5 KB)

Hi Sebastian,

Some comments about the tests:

  // G++ doesn't like the first two. I disagree with that. /2 allows casting
  // an expression to its own type, subject to all other restrictions. The only
  // restriction mentioned, however, is that casting away constness is invalid.
  // It should therefore be valid to cast a non-pointer type to its own type,
  // no matter how useless that is.
  int i = 0;
  i = reinterpret_cast<int>(i);
  structure s;
  s = reinterpret_cast<structure>(s);

These are errors for MSVC and Comeau too. I think the "Subject to the restrictions in this section" refers to the whole 5.2.10 section, not the paragraph.
See here for another related perspective:
http://www.archivum.info/comp.std.c++/2006-06/msg00080.html

  float *fp = reinterpret_cast<float*>(l);
  // Note: should fail on 64-bit, but LLVM is target-agnostic. Is it possible
  // to annotate the bytecode to make this fail in native codegen?
  int i = reinterpret_cast<int>(fp);

LLVM IR produced by C/C++ cannot be target-agnostic. See: http://llvm.org/docs/tutorial/LangImpl8.html#targetindep
You can use TargetInfo to check whether the above conversion is legal.

  fnptr2 fp2 = reinterpret_cast<fnptr2>(fp);
  // Should fail in C++03 without extensions, succeed with exts or in 0x mode.
  void *vp = reinterpret_cast<void*>(fp2);

All three compilers accept this; I think Clang should accept it too, programmers will be unpleasantly surprised if they find this is not allowed.

-Argiris

Sebastian Redl wrote:

Argiris Kirtzidis wrote:

Hi Sebastian,

Some comments about the tests:

  // G++ doesn't like the first two. I disagree with that. /2 allows casting
  // an expression to its own type, subject to all other restrictions. The only
  // restriction mentioned, however, is that casting away constness is invalid.
  // It should therefore be valid to cast a non-pointer type to its own type,
  // no matter how useless that is.
  int i = 0;
  i = reinterpret_cast<int>(i);
  structure s;
  s = reinterpret_cast<structure>(s);

These are errors for MSVC and Comeau too. I think the "Subject to the restrictions in this section" refers to the whole 5.2.10 section, not the paragraph.
See here for another related perspective:
http://www.archivum.info/comp.std.c++/2006-06/msg00080.html

The thing is, compilers implement a curiously hybrid view of the standard.
5.2.10/1 says that only the explicitly listed conversions are supported.
5.2.10/2 says that conversion to self is allowed, "subject to the restrictions in this section". But 5.2.10 contains only two restrictions: that of not casting away constness (/2) and that of only allowing explicitly listed conversions (/1).
Let's look at the other paragraphs:
/4 allows pointer->integral. Obviously, this can't allow conversion to self.
/5 allows integral->pointer. Same thing.
/6 allows pointer to function to pointer to function "of different type". This explicitly disallows conversion to self.
/7 allows pointer to object to pointer to object "of different type". Again, conversion to self is explicitly disallowed.
/9 allows pointer to member to pointer to member, if both sides are member function or member object types, no mixing. This does not explicitly disallow conversion to self, although it was probably intended, since it would be in keeping with the spirit of the other two paragraphs.

This means that there are, in my opinion, three valid interpretations of the standard:
1) Conversion to self is allowed only for pointers to members. This is the most literal interpretation.
2) Conversion to self is not allowed at all. This assumes that /9 is in the spirit of /6 and /7.
3) Conversion to self is always allowed. This assumes the note in /2 is actually normative and the only restriction it talks about is casting away constness.

But compilers generally allow conversion to self if the involved types are both pointers, or the equivalent reference cast. GCC allows int*->int*, but not int->int.

It's easy to follow general practice, of course, but I still think that the other compilers are wrong.

  float *fp = reinterpret_cast<float*>(l);
  // Note: should fail on 64-bit, but LLVM is target-agnostic. Is it possible
  // to annotate the bytecode to make this fail in native codegen?
  int i = reinterpret_cast<int>(fp);

LLVM IR produced by C/C++ cannot be target-agnostic. See: http://llvm.org/docs/tutorial/LangImpl8.html#targetindep
You can use TargetInfo to check whether the above conversion is legal.

Sad, but makes sense. How do I get a TargetInfo from the Sema?

  fnptr2 fp2 = reinterpret_cast<fnptr2>(fp);
  // Should fail in C++03 without extensions, succeed with exts or in 0x mode.
  void *vp = reinterpret_cast<void*>(fp2);

All three compilers accept this; I think Clang should accept it too, programmers will be unpleasantly surprised if they find this is not allowed.

I accept it unless LangInfo.NoExtensions is set. I don't know any switch that actually enables this bit, though - I think -std=c++98 should, but it doesn't. I can get rid of the NoExtensions check if you think it's better.
In addition, I warn if -pedantic is set. This is consistent with GCC's behavior.

Sebastian

Sebastian Redl wrote:

The thing is, compilers implement a curiously hybrid view of the standard.
5.2.10/1 says that only the explicitly listed conversions are supported.
5.2.10/2 says that conversion to self is allowed, "subject to the restrictions in this section". But 5.2.10 contains only two restrictions: that of not casting away constness (/2) and that of only allowing explicitly listed conversions (/1).
Let's look at the other paragraphs:
/4 allows pointer->integral. Obviously, this can't allow conversion to self.
/5 allows integral->pointer. Same thing.
/6 allows pointer to function to pointer to function "of different type". This explicitly disallows conversion to self.
/7 allows pointer to object to pointer to object "of different type". Again, conversion to self is explicitly disallowed.
/9 allows pointer to member to pointer to member, if both sides are member function or member object types, no mixing. This does not explicitly disallow conversion to self, although it was probably intended, since it would be in keeping with the spirit of the other two paragraphs.

This means that there are, in my opinion, three valid interpretations of the standard:
1) Conversion to self is allowed only for pointers to members. This is the most literal interpretation.
2) Conversion to self is not allowed at all. This assumes that /9 is in the spirit of /6 and /7.
3) Conversion to self is always allowed. This assumes the note in /2 is actually normative and the only restriction it talks about is casting away constness.

But compilers generally allow conversion to self if the involved types are both pointers, or the equivalent reference cast. GCC allows int*->int*, but not int->int.

It's easy to follow general practice, of course, but I still think that the other compilers are wrong.

The justification is that integral->integral is not a "explicitly listed conversion", so int->int should also be illegal.

  float *fp = reinterpret_cast<float*>(l);
  // Note: should fail on 64-bit, but LLVM is target-agnostic. Is it possible
  // to annotate the bytecode to make this fail in native codegen?
  int i = reinterpret_cast<int>(fp);

LLVM IR produced by C/C++ cannot be target-agnostic. See: http://llvm.org/docs/tutorial/LangImpl8.html#targetindep
You can use TargetInfo to check whether the above conversion is legal.

Sad, but makes sense. How do I get a TargetInfo from the Sema?

It's in ASTContext (Context.Target), but a better way is to use ASTContext's getTypeSize(QualType T).

  fnptr2 fp2 = reinterpret_cast<fnptr2>(fp);
  // Should fail in C++03 without extensions, succeed with exts or in 0x mode.
  void *vp = reinterpret_cast<void*>(fp2);

All three compilers accept this; I think Clang should accept it too, programmers will be unpleasantly surprised if they find this is not allowed.

I accept it unless LangInfo.NoExtensions is set. I don't know any switch that actually enables this bit, though - I think -std=c++98 should, but it doesn't. I can get rid of the NoExtensions check if you think it's better.
In addition, I warn if -pedantic is set. This is consistent with GCC's behavior.

Ah never mind, I agree about keeping it consistent with GCC.

If you issue a diagnostic marked with EXTENSION it will have this behavior without having to do any explicit checks yourself:
-will not be emitted by default
-will emit a warning at -pedantic
-will emit an error at -pedantic-errors.

And you are right, for C++0x it should probably be allowed without pedantic warnings.

-Argiris

Updated patch.
Cast to self is brought in line with GCC's behavior, even though I still think that it's wrong. Or maybe it's what was intended, and the standard is defective.
Cast to integral of smaller size now fails.
Cast between object and function no longer depends on the NoExtensions bit. I have no idea when this will actually be set, and I don't want it to unexpectedly break things.

Argiris Kirtzidis wrote:

And you are right, for C++0x it should probably be allowed without pedantic warnings.

GCC is not yet updated to do this, by the way. Yay, we've got better C++0x support (in a very minor subpoint). :slight_smile:

Sebastian

cxx-reinterpret-cast.cpp (2.07 KB)

reinterpret_cast.patch (19.6 KB)

Hi Sebastian,

Thanks for the update; comments follow.

Argiris Kirtzidis wrote:

And you are right, for C++0x it should probably be allowed without
pedantic warnings.

GCC is not yet updated to do this, by the way. Yay, we've got better C++0x
support (in a very minor subpoint). :slight_smile:

Sebastian

Index: test/SemaCXX/const-cast.cpp

--- test/SemaCXX/const-cast.cpp (revision 0)
+++ test/SemaCXX/const-cast.cpp (revision 0)
@@ -0,0 +1,47 @@
+// RUN: clang -fsyntax-only -verify %s
+
+// See if aliasing can confuse this baby.
+typedef char c;
+typedef c *cp;
+typedef cp *cpp;
+typedef cpp *cppp;
+typedef cppp &cpppr;
+typedef const cppp &cpppcr;
+typedef const char cc;
+typedef cc *ccp;
+typedef volatile ccp ccvp;
+typedef ccvp *ccvpp;
+typedef const volatile ccvpp ccvpcvp;
+typedef ccvpcvp *ccvpcvpp;
+typedef int iar[100];
+typedef iar &iarr;
+typedef int (*f)(int);
+
+char ***good_const_cast_test(ccvpcvpp var)
+{
+ char ***var2 = const_cast<cppp>(var);
+ char ***const &var3 = static_cast<cpppcr>(var2); // Different bug.
+ char ***&var4 = const_cast<cpppr>(var3);
+ char *** var5 = const_cast<cppp>(var4);

Did you really mean to use var2, var3, and var4 as the expressions in
these const_casts? The tests aren't actually casting away any
constness here, so I suspect that you want to use "var" as the
argument to the const_casts throughout.

Also, why create variables like var3 and var4? You can typically
express any "this expression has no effect"-type warnings (if that was
your intent) by writing, e.g.,

  (void)const_cast<cpppr>(var3);

+ const int ar[100] = {0};
+ int (&rar)[100] = const_cast<iarr>(ar);
+ int *pi = const_cast<int*>(ar); // Array decay?
+ f fp = 0;
+ f *fpp = const_cast<f*>(&fp);

There doesn't seem to be any "const" involved here. I suggest adding, e.g.,

  f const * fp2 = 0;
  f* fpp2 = const_cast<f*>(fp2);

+short *bad_const_cast_test(char const *volatile *const volatile *var)
+{
+ char **var2 = const_cast<char**>(var); // expected-error {{invalid
const_cast from 'char const *volatile *const volatile *' to incompatible
type 'char **'}}
+ short ***var3 = const_cast<short***>(var); // expected-error {{invalid
const_cast from 'char const *volatile *const volatile *' to incompatible
type 'short ***'}}

Note that these expected-errors don't match the actual error messages
we get, so -verify fails.

+ int *(*rar)[100] = const_cast<int *(*)[100]>(&ar); // expected-error
{{invalid const_cast from 'int const *(*)[100]' to incompatible type 'int
*(*)[100]'}}

This expected-error doesn't match the actual error message.

DIAG(err_value_init_for_array_type, ERROR,
     "array types cannot be value-initialized")
+DIAG(err_bad_cxx_cast_generic, ERROR,
+ "%0 from '%2' to '%1' is not allowed")

[...]

Please put a "// C++ casts" before all of the C++ cast-specific diagnostics.

+/// CheckConstCast - Check that a const_cast&lt;DestType>(SrcExpr) is

The &lt; should be a '<' here.

valid.
+/// 5.2.11/3: Both types must be pointers (to objects, void, or data
members,
+/// see 5.2.11/5) with the same level of indirection. The final pointee
type
+/// must be the same. All cv ([EXT] and r?) types along the way are
mutable
+/// by the cast. The result is an rvalue.
+/// 5.2.11/4: If the incoming expression denotes an lvalue, DestType can be
a
+/// reference to the underlying type. The result is an lvalue.

A couple comments about the comments... please put "C++" prior to the
section/paragraph you are citing and use the "p" instead of the "/",
e.g., "C++ 5.2.11p4", so that the specification reference script can
pick up the references.

Also, I suggest moving the C++ standard quotes down into the body of
the function, where the code actually implements that rule. The
comment for CheckConstCast itself should point to the section in the
standard where const_cast is defined, and give a small example of how
const_cast is used.

+void
+Sema::CheckConstCast(SourceLocation OpLoc, Expr *&SrcExpr, QualType
DestType)
+{
+ QualType OrigDestType = DestType, OrigSrcType = SrcExpr->getType();
+
+ DestType = Context.getCanonicalType(DestType);
+ QualType SrcType = SrcExpr->getType();
+ const PointerLikeType *SrcTypeTmp, *DestTypeTmp;

Okay, I know I'm being pretty picky, but I'd really prefer if
SrcTypeTmp and DestTypeTmp were only declared locally within the
blocks where they are used, e.g., replacing

+ if ((DestTypeTmp = DestType->getAsReferenceType())) {

with

  if (const ReferenceTypeType *DestTypeTmp = DestType->getAsReferenceType()) {

+ if (SrcExpr->isLvalue(Context) != Expr::LV_Valid) {
+ // Cannot cast non-lvalue to reference type.
+ Diag(OpLoc, diag::err_bad_cxx_cast_rvalue,
+ "const_cast", OrigDestType.getAsString());
+ return;
+ }
+
+ // /4: "if a pointer to T can be [cast] to the type pointer to T2"

Here's a case where citing all of p4 (and using the full spec
reference, [C++ 5.2.10p4)

+ DestType = Context.getPointerType(DestTypeTmp->getPointeeType());

+ if ((SrcTypeTmp = SrcType->getAsReferenceType()))
+ SrcType = SrcTypeTmp->getPointeeType();

This shouldn't happen, because an expression never has reference type.
However, Clang is currently does this wrong, so I expect we'll need
this code for now. Please put a FIXME here pointing out that this "if"
should never be true.

+ SrcType = Context.getPointerType(SrcType);
+ } else {
+ // Decay the argument.
+ DefaultFunctionArrayConversion(SrcExpr);
+ SrcType = SrcExpr->getType();

Here's another case where citing the relevant part of p1 would help the reader.

+ }
+ if (!DestType->isPointerType()) {
+ // Cannot cast to non-pointer, non-reference type.
+ Diag(OpLoc, diag::err_bad_const_cast_dest, OrigDestType.getAsString());
+ return;

Please note in the comment here that if DestType was a reference type
originally, it is now a pointer type.

+ }
+ if (DestType->isFunctionPointerType()) {
+ // Cannot cast direct function pointers.
+ Diag(OpLoc, diag::err_bad_const_cast_dest, OrigDestType.getAsString());
+ return;
+ }
+ SrcType = Context.getCanonicalType(SrcType);
+
+ // Unwrap the pointers. Ignore qualifiers. Terminate early if the types
are
+ // completely equal.
+ while (SrcType != DestType &&
+ (SrcTypeTmp = SrcType->getAsPointerType()) &&
+ (DestTypeTmp = DestType->getAsPointerType()))

Please put SrcTypeTmp and DestTypeTmp (and their initializations)
inside the body of the while loop.

+ {
+ SrcType = Context.getCanonicalType(SrcTypeTmp->getPointeeType());
+ DestType = Context.getCanonicalType(DestTypeTmp->getPointeeType());
+ }

I'm very surprised to see that you aren't calling getUnqualifiedType
inside this loop. Calling getUnqualifiedType on each of SrcType and
DestType each time through would allow us to short-circuit the
comparison if we're casting, e.g., T1 * * * const * * * to T1 * * * *
* *; right now, it'll go through the loop all the way down until it
compares the T1's.

Also: is this going to allow us to const_cast between different
address spaces? I'm not sure what the right answer is there.

Finally, once SrcType and DestType are both canonical types (which is
guaranteed before we even enter the loop), all of the types we get via
getPointeeType will already be canonical. So, you don't need to call
getCanonicalType in the loop.

+ // If we end up with constant arrays of equal size, unwrap those too. A
cast
+ // from const int [N] to int (&)[N] is invalid by my reading of the
+ // standard, but g++ accepts it even with -ansi -pedantic.
+ // No more than one level, though, so don't embed this in the unwrap loop
+ // above.

How interesting... EDG also accepts this const_cast, but the standard
says absolutely nothing about being able to const_cast away qualifiers
on the element types of arrays. I'm curious whether MSVC diagnoses
this as an error or not. I'm also inclined to diagnose this as an
error, unless our reading proves faulty.

Thanks again for this patch; it's in very good shape already. I'll
follow up with a review of the reinterpret_cast implementation later.

  - Doug

Sebastian Redl wrote:

Updated patch.
Cast to self is brought in line with GCC's behavior, even though I still think that it's wrong. Or maybe it's what was intended, and the standard is defective.

Another point that may convince you :slight_smile: :

The quote in question, "Subject to the restrictions in this section, an expression may be cast to its own type using a reinterpret_cast operator", is a note. Notes are for clarification, they are not supposed to introduce new semantics. The semantics should be the same with or without the note.
Now consider what would happen if this note didn't exist at all. Wouldn't you be obliged to consider int->int as a integral->integral conversion and reject it as illegal ?
I think this is why "Subject to the restrictions in this section" is added, otherwise it would just say "An expression may be cast to its own type using a reinterpret_cast operator".

-Argiris

Argiris Kirtzidis wrote:

Sebastian Redl wrote:

Updated patch.
Cast to self is brought in line with GCC's behavior, even though I still think that it's wrong. Or maybe it's what was intended, and the standard is defective.

Another point that may convince you :slight_smile: :

You've convinced me that this is what the standard intended - but not that it is what it says. :-p

Sebastian

Doug Gregor wrote:

Hi Sebastian,

Thanks for the update; comments follow.

+char ***good_const_cast_test(ccvpcvpp var)
+{
+ char ***var2 = const_cast<cppp>(var);
+ char ***const &var3 = static_cast<cpppcr>(var2); // Different bug.
+ char ***&var4 = const_cast<cpppr>(var3);
+ char *** var5 = const_cast<cppp>(var4);
    
Did you really mean to use var2, var3, and var4 as the expressions in
these const_casts? The tests aren't actually casting away any
constness here, so I suspect that you want to use "var" as the
argument to the const_casts throughout.
  

No, I mean it exactly like it's there. var3->var4 does cast away constness (const& -> &). var2->var3 should be implicit, but that fails, so I made it a static_cast. var4->var5 tests the dropping of the reference, not casting away constness.

Also, why create variables like var3 and var4? You can typically
express any "this expression has no effect"-type warnings (if that was
your intent) by writing, e.g.,

  (void)const_cast<cpppr>(var3);
  

I suppose, for those cases where I don't use the variable.

+ const int ar[100] = {0};
+ int (&rar)[100] = const_cast<iarr>(ar);
+ int *pi = const_cast<int*>(ar); // Array decay?
+ f fp = 0;
+ f *fpp = const_cast<f*>(&fp);
    
There doesn't seem to be any "const" involved here. I suggest adding, e.g.,

  f const * fp2 = 0;
  f* fpp2 = const_cast<f*>(fp2);
  

These test array decay and that a function pointer pointer is not misidentified as a function pointer, not constness. I like my unit tests to test just one thing.

Note that these expected-errors don't match the actual error messages
we get, so -verify fails.
  

Yeah, I changed the messages in the last reinterpret_cast update and forgot to update the test case.

Please put a "// C++ casts" before all of the C++ cast-specific diagnostics.
  

OK.

+/// CheckConstCast - Check that a const_cast&lt;DestType>(SrcExpr) is
    
The &lt; should be a '<' here.
  

Doxygen doesn't mind? I know that JavaDoc can get confused.

A couple comments about the comments... please put "C++" prior to the
section/paragraph you are citing and use the "p" instead of the "/",
e.g., "C++ 5.2.11p4", so that the specification reference script can
pick up the references.
  

OK.

Also, I suggest moving the C++ standard quotes down into the body of
the function, where the code actually implements that rule. The
comment for CheckConstCast itself should point to the section in the
standard where const_cast is defined, and give a small example of how
const_cast is used.
  

I like having the entire relevant standard parts summarized at the start of the function, but I agree that it results in quite a bit of info duplication, because I repeat some of it at the places I actually implement it.

  

+void
+Sema::CheckConstCast(SourceLocation OpLoc, Expr *&SrcExpr, QualType
DestType)
+{
+ QualType OrigDestType = DestType, OrigSrcType = SrcExpr->getType();
+
+ DestType = Context.getCanonicalType(DestType);
+ QualType SrcType = SrcExpr->getType();
+ const PointerLikeType *SrcTypeTmp, *DestTypeTmp;
    
Okay, I know I'm being pretty picky, but I'd really prefer if
SrcTypeTmp and DestTypeTmp were only declared locally within the
blocks where they are used,

Usually, I do too, but I already had them in this scope because of the loop later, and I thought it ugly to have them again, so I just moved the declarations up here.

+ if (SrcExpr->isLvalue(Context) != Expr::LV_Valid) {
+ // Cannot cast non-lvalue to reference type.
+ Diag(OpLoc, diag::err_bad_cxx_cast_rvalue,
+ "const_cast", OrigDestType.getAsString());
+ return;
+ }
+
+ // /4: "if a pointer to T can be [cast] to the type pointer to T2"
    
Here's a case where citing all of p4 (and using the full spec
reference, [C++ 5.2.10p4)
  

As I said, duplication.

+ if ((SrcTypeTmp = SrcType->getAsReferenceType()))
+ SrcType = SrcTypeTmp->getPointeeType();
    
This shouldn't happen, because an expression never has reference type.
  

Really? What type does "static_cast<int&>(i)" have? What about simply "i", where i is declared as "int &i = ...;"? Do they have type int, with lvalue possibility?

Here's another case where citing the relevant part of p1 would help the reader.
  

OK.

Please note in the comment here that if DestType was a reference type
originally, it is now a pointer type.
  

OK.

+ }
+ if (DestType->isFunctionPointerType()) {
+ // Cannot cast direct function pointers.
+ Diag(OpLoc, diag::err_bad_const_cast_dest, OrigDestType.getAsString());
+ return;
+ }
+ SrcType = Context.getCanonicalType(SrcType);
+
+ // Unwrap the pointers. Ignore qualifiers. Terminate early if the types
are
+ // completely equal.
+ while (SrcType != DestType &&
+ (SrcTypeTmp = SrcType->getAsPointerType()) &&
+ (DestTypeTmp = DestType->getAsPointerType()))
    
Please put SrcTypeTmp and DestTypeTmp (and their initializations)
inside the body of the while loop.
  

Can't. The loop header also checks if the result of the assignment is null. I'd have to write something like:
while(SrcType != DestType) {
  const PointerType *SrcTypeTmp = SrcType->getAsPointerType();
  const PointerType *DestTypeTmp = DestType->getAsPointerType();
  if(!SrcTypeTmp || !DestTypeTmp) {
    break;
  }
  // ...
}
and I actually think my current code is nicer. (Matter of opinion, of course. I'll change it if you insist.)

  

+ {
+ SrcType = Context.getCanonicalType(SrcTypeTmp->getPointeeType());
+ DestType = Context.getCanonicalType(DestTypeTmp->getPointeeType());
+ }
    
I'm very surprised to see that you aren't calling getUnqualifiedType
inside this loop. Calling getUnqualifiedType on each of SrcType and
DestType each time through would allow us to short-circuit the
comparison if we're casting, e.g., T1 * * * const * * * to T1 * * * *
* *; right now, it'll go through the loop all the way down until it
compares the T1's.
  

I don't see how. It costs us at most one loop iteration more. T1***const* is a different type from T1****. T1***const is a different type from T1*** (that's the wasted iteration). T1** is the same type as T1**.
The change makes sense, but I don't think it's as bad as you think.

Also: is this going to allow us to const_cast between different
address spaces? I'm not sure what the right answer is there.
  

I don't even know what you mean by different address spaces, or how they are implemented. Are you talking about x86-16-type segmented memory or something even more extreme?

Finally, once SrcType and DestType are both canonical types (which is
guaranteed before we even enter the loop), all of the types we get via
getPointeeType will already be canonical. So, you don't need to call
getCanonicalType in the loop.
  

I wasn't aware of this guarantee - in fact, I was somehow under the impression that getCanonicalType only canonicalized the first level. Can't remember why I thought that, but I know that I very deliberately added these additional calls. I'll try it without them.

  

+ // If we end up with constant arrays of equal size, unwrap those too. A
cast
+ // from const int [N] to int (&)[N] is invalid by my reading of the
+ // standard, but g++ accepts it even with -ansi -pedantic.
+ // No more than one level, though, so don't embed this in the unwrap loop
+ // above.
    
How interesting... EDG also accepts this const_cast, but the standard
says absolutely nothing about being able to const_cast away qualifiers
on the element types of arrays. I'm curious whether MSVC diagnoses
this as an error or not.

I can try it tomorrow.

I'm also inclined to diagnose this as an
error, unless our reading proves faulty.
  

Or at least emit a warning or something. I wonder if any real-world code actually relies on this quirk of the compilers.

Thanks again for this patch; it's in very good shape already. I'll
follow up with a review of the reinterpret_cast implementation later.
  

Thank you for the review.

Sebastian

Hi Sebastian,

Doug Gregor wrote:

+char ***good_const_cast_test(ccvpcvpp var)
+{
+ char ***var2 = const_cast<cppp>(var);
+ char ***const &var3 = static_cast<cpppcr>(var2); // Different bug.
+ char ***&var4 = const_cast<cpppr>(var3);
+ char *** var5 = const_cast<cppp>(var4);

Did you really mean to use var2, var3, and var4 as the expressions in
these const_casts? The tests aren't actually casting away any
constness here, so I suspect that you want to use "var" as the
argument to the const_casts throughout.

No, I mean it exactly like it's there. var3->var4 does cast away constness
(const& -> &). var2->var3 should be implicit, but that fails, so I made it a
static_cast. var4->var5 tests the dropping of the reference, not casting
away constness.

Ah, okay. The initialization of "var3" probably isn't checking what we
want it to check, because the semantics of reference binding aren't
implemented. Still, we're definitely checking *something*, so let's
leave it as is.

Also, why create variables like var3 and var4? You can typically
express any "this expression has no effect"-type warnings (if that was
your intent) by writing, e.g.,

(void)const_cast<cpppr>(var3);

I suppose, for those cases where I don't use the variable.

Now that I'm unconfused about the uses of var2, var3, etc., this
suggestion of mine doesn't matter so much.

These test array decay and that a function pointer pointer is not
misidentified as a function pointer, not constness. I like my unit tests to
test just one thing.

Okay, good.

+/// CheckConstCast - Check that a const_cast&lt;DestType>(SrcExpr) is

The &lt; should be a '<' here.

Doxygen doesn't mind? I know that JavaDoc can get confused.

Ah, right. The right escaping is '\<', according to
http://www.stack.nl/~dimitri/doxygen/commands.html#cmdat

and you'll also need to escape the > with '\>'.

Also, I suggest moving the C++ standard quotes down into the body of
the function, where the code actually implements that rule. The
comment for CheckConstCast itself should point to the section in the
standard where const_cast is defined, and give a small example of how
const_cast is used.

I like having the entire relevant standard parts summarized at the start of
the function, but I agree that it results in quite a bit of info
duplication, because I repeat some of it at the places I actually implement
it.

I guess it depends on how we read the comments... personally, I want
to get a rough idea of what the function is doing when I read the
function-level comment, then when I'm browsing the code I want to see
the correspondence between the standard and the code.

+ const PointerLikeType *SrcTypeTmp, *DestTypeTmp;

Okay, I know I'm being pretty picky, but I'd really prefer if
SrcTypeTmp and DestTypeTmp were only declared locally within the
blocks where they are used,

Usually, I do too, but I already had them in this scope because of the loop
later, and I thought it ugly to have them again, so I just moved the
declarations up here.

Hrm, yeah. I guess what I don't like here is the use of
PointerLikeType (which, in truth, I don't really like anyway) as a
ReferenceType in some places and as a PointerType in other places.

+ if ((SrcTypeTmp = SrcType->getAsReferenceType()))
+ SrcType = SrcTypeTmp->getPointeeType();

This shouldn't happen, because an expression never has reference type.

Really?

Yeah, check out C++ 5p5, where it talks about the adjustments to the
type of an expression. This is on my short list to fix in Clang.

What type does "static_cast<int&>(i)" have?

The type is "int", and it's an lvalue.

What about simply "i",
where i is declared as "int &i = ...;"? Do they have type int, with lvalue
possibility?

Yes, exactly.

+ while (SrcType != DestType &&
+ (SrcTypeTmp = SrcType->getAsPointerType()) &&
+ (DestTypeTmp = DestType->getAsPointerType()))

Please put SrcTypeTmp and DestTypeTmp (and their initializations)
inside the body of the while loop.

Can't. The loop header also checks if the result of the assignment is null.
I'd have to write something like:
while(SrcType != DestType) {
const PointerType *SrcTypeTmp = SrcType->getAsPointerType();
const PointerType *DestTypeTmp = DestType->getAsPointerType();
if(!SrcTypeTmp || !DestTypeTmp) {
  break;
}
// ...
}
and I actually think my current code is nicer. (Matter of opinion, of
course. I'll change it if you insist.)

I don't insist.

+ {
+ SrcType = Context.getCanonicalType(SrcTypeTmp->getPointeeType());
+ DestType = Context.getCanonicalType(DestTypeTmp->getPointeeType());
+ }

I'm very surprised to see that you aren't calling getUnqualifiedType
inside this loop. Calling getUnqualifiedType on each of SrcType and
DestType each time through would allow us to short-circuit the
comparison if we're casting, e.g., T1 * * * const * * * to T1 * * * *
* *; right now, it'll go through the loop all the way down until it
compares the T1's.

I don't see how. It costs us at most one loop iteration more. T1***const* is
a different type from T1****. T1***const is a different type from T1***
(that's the wasted iteration). T1** is the same type as T1**.
The change makes sense, but I don't think it's as bad as you think.

You're right; it's not as bad as I thought.

Also: is this going to allow us to const_cast between different
address spaces? I'm not sure what the right answer is there.

I don't even know what you mean by different address spaces, or how they are
implemented. Are you talking about x86-16-type segmented memory or something
even more extreme?

We can ignore them for now with a FIXME. Address spaces are a C
extension for embedded platforms, which are represented by ASQualType
in Clang.

I *think* that const_cast should not permit one to change the address
space, but my understanding of address spaces is too fuzzy to be sure.

I'm also inclined to diagnose this as an
error, unless our reading proves faulty.

Or at least emit a warning or something. I wonder if any real-world code
actually relies on this quirk of the compilers.

If one compiler gets it wrong, there is code somewhere that depends on
that behavior. It's a corollary to Murphy's Law for compilers :slight_smile:

Still, if the code is ill-formed we should diagnose it up until the
point where users complain too loudly to ignore.

  - Doug

Doug Gregor wrote:

Hi Sebastian,

No, I mean it exactly like it's there. var3->var4 does cast away constness
(const& -> &). var2->var3 should be implicit, but that fails, so I made it a
static_cast. var4->var5 tests the dropping of the reference, not casting
away constness.
    
Ah, okay. The initialization of "var3" probably isn't checking what we
want it to check, because the semantics of reference binding aren't
implemented. Still, we're definitely checking *something*, so let's
leave it as is.
  

It uses the C rules. They're not set up to handle references. However, they don't complain if you get exactly the same type, so since the expression static_cast<int&> has a reference type, this works. Note that this will break if you fix the bug you mentioned below.

  

Also, why create variables like var3 and var4? You can typically
express any "this expression has no effect"-type warnings (if that was
your intent) by writing, e.g.,

(void)const_cast<cpppr>(var3);

I suppose, for those cases where I don't use the variable.
    
Now that I'm unconfused about the uses of var2, var3, etc., this
suggestion of mine doesn't matter so much.
  

Meh, I did it anyway. It makes the tests a bit more readable. I also added comments to a few.

+/// CheckConstCast - Check that a const_cast&lt;DestType>(SrcExpr) is

The &lt; should be a '<' here.

Doxygen doesn't mind? I know that JavaDoc can get confused.
    
Ah, right. The right escaping is '\<', according to
http://www.stack.nl/~dimitri/doxygen/commands.html#cmdat

and you'll also need to escape the > with '\>'.
  

Done.

  

Also, I suggest moving the C++ standard quotes down into the body of
the function, where the code actually implements that rule. The
comment for CheckConstCast itself should point to the section in the
standard where const_cast is defined, and give a small example of how
const_cast is used.

I like having the entire relevant standard parts summarized at the start of
the function, but I agree that it results in quite a bit of info
duplication, because I repeat some of it at the places I actually implement
it.
    
I guess it depends on how we read the comments... personally, I want
to get a rough idea of what the function is doing when I read the
function-level comment, then when I'm browsing the code I want to see
the correspondence between the standard and the code.
  

Done.

Hrm, yeah. I guess what I don't like here is the use of
PointerLikeType (which, in truth, I don't really like anyway) as a
ReferenceType in some places and as a PointerType in other places.
  

OK, I split that into ReferenceType for the early checks and PointerType for the loop.

Yeah, check out C++ 5p5, where it talks about the adjustments to the
type of an expression. This is on my short list to fix in Clang.
  

You mean p6, but I see it. OK, added FIXME.

Can't. The loop header also checks if the result of the assignment is null.
I'd have to write something like:
while(SrcType != DestType) {
const PointerType *SrcTypeTmp = SrcType->getAsPointerType();
const PointerType *DestTypeTmp = DestType->getAsPointerType();
if(!SrcTypeTmp || !DestTypeTmp) {
  break;
}
// ...
}
and I actually think my current code is nicer. (Matter of opinion, of
course. I'll change it if you insist.)
    
I don't insist.
  

I moved the declarations of the two variables just before the loop, though.

Also: is this going to allow us to const_cast between different
address spaces? I'm not sure what the right answer is there.

I don't even know what you mean by different address spaces, or how they are
implemented. Are you talking about x86-16-type segmented memory or something
even more extreme?
    
We can ignore them for now with a FIXME. Address spaces are a C
extension for embedded platforms, which are represented by ASQualType
in Clang.

I *think* that const_cast should not permit one to change the address
space, but my understanding of address spaces is too fuzzy to be sure.
  

OK, added FIXME.

  

I'm also inclined to diagnose this as an
error, unless our reading proves faulty.

Or at least emit a warning or something. I wonder if any real-world code
actually relies on this quirk of the compilers.
    
If one compiler gets it wrong, there is code somewhere that depends on
that behavior. It's a corollary to Murphy's Law for compilers :slight_smile:

Still, if the code is ill-formed we should diagnose it up until the
point where users complain too loudly to ignore.
  

OK, #ifdefed the code out for now.

Updated patch is attached. The reinterpret_cast tests have been made verifiable but are not yet in the tree.
By the way, I discovered that it emits a disambiguation warning on this line:
int (&rar)[100] = const_cast<iarr>(ar);
This feels a bit wrong - how else would I define a reference to an array to avoid the warning?

Sebastian

cxx_cast.patch (21 KB)

cxx-reinterpret-cast.cpp (2.28 KB)

Well, it *is* ambiguous, at least from the parser's perspective;
consider if the int() were a cast and the & an addressof operator.

-Eli

Sure, it's ambiguous, but we shouldn't warn about it because the code
isn't wrong (and isn't likely to be wrong).

  - Doug

Doug Gregor wrote:

Sure, it's ambiguous, but we shouldn't warn about it because the code
isn't wrong (and isn't likely to be wrong).
  
Ambiguous resolutions confuse people in general which was why I added the warning.
But I don't have a strong opinion on how useful this warning is; e.g. it warns people over declarations that look like constructor calls but it's not likely for people to write stuff like

T1(T2(x));

since creating an object like that isn't very useful.
If the consensus is that the warning doesn't add much value I'll remove it.

On another note, I find the function declarator ambiguous warning useful, since stuff like this:

T1 x(T2(y));

are more likely to occur and confuse people.

-Argiris

It seems to me that this kind of warning might be useful if something
does eventually go wrong. For example, if something that referenced
the name being declared was part of some semantic failure later on, it
might make sense to emit a note that said: "this was ambiguous but C++
says it should be a declaration; put parentheses around it to make it
an expression". However, as it is now I think it produces too many
false positives.

I'd like to hear others' opinions on the matter, though.

  - Doug

Hrm, yeah. I guess what I don't like here is the use of
PointerLikeType (which, in truth, I don't really like anyway) as a
ReferenceType in some places and as a PointerType in other places.

OK, I split that into ReferenceType for the early checks and PointerType for
the loop.

Thank you.

Updated patch is attached. The reinterpret_cast tests have been made
verifiable but are not yet in the tree.

Okay, thanks for the updated patch. Comments inline.

+ case tok::kw_const_cast:
+ Op = CXXCastExpr::ConstCast;
+ CheckConstCast(OpLoc, Ex, DestType);
+ break;
+ case tok::kw_dynamic_cast:
+ Op = CXXCastExpr::DynamicCast;
+ break;
+ case tok::kw_reinterpret_cast:
+ Op = CXXCastExpr::ReinterpretCast;
+ CheckReinterpretCast(OpLoc, Ex, DestType);
+ break;
+ case tok::kw_static_cast:
+ Op = CXXCastExpr::StaticCast;
+ break;

Hrm. It's not your problem, but I think we'll end up refactoring the
cast hierarchy a bit. I really dislike having a CXXCastExpr node with
an enum inside it telling us which case it is, because the casts are
all *totally* different semantically. But, I'll make this change after
your const_cast and reinterpret_cast patches are in the tree.

+/// CheckConstCast - Check that a const_cast\<DestType\>(SrcExpr) is valid.
+/// Refer to C++ 5.2.11 for details. const_cast is typically used in code
+/// like this:
+/// const char *str = "literal";
+/// legacy_function(const_cast\<char*\>(str));
+void
+Sema::CheckConstCast(SourceLocation OpLoc, Expr *&SrcExpr, QualType
DestType)

This routine looks good, thanks.

+ // Doug Gregor said to disallow this until users complain.

I'll gladly take the blame :slight_smile:

+ DestType = Context.getCanonicalType(DestType);
+ QualType SrcType = SrcExpr->getType();
+ if (const ReferenceType *DestTypeTmp = DestType->getAsReferenceType()) {
+ if (SrcExpr->isLvalue(Context) != Expr::LV_Valid) {
+ // Cannot cast non-lvalue to reference type.
+ Diag(OpLoc, diag::err_bad_cxx_cast_rvalue,
+ "reinterpret_cast", OrigDestType.getAsString());
+ return;
+ }

Hmm... in the case of an error when type-checking the casts, we should
not actually build a CXXCastExpr node. Rather, Sema::ActOnCXXCasts
should return "true" so that the parser knows that there was an error.

I suggest that both CheckConstCast and CheckReinterpretCast return a
bool that is true when there is an error and false otherwise (this is
the scheme used elsewhere in Clang). Then, ActOnCXXCasts will never
build an ill-formed node in the AST (which is a good thing).

+ // C++ 5.2.10p10: [...] a reference cast reinterpret_cast<T&>(x) has
the
+ // same effect as the conversion *reinterpret_cast<T*>(&x) with the
+ // built-in & and * operators.
+ // This code does this transformation for the checked types.
+ DestType = Context.getPointerType(DestTypeTmp->getPointeeType());
+ if (const ReferenceType *SrcTypeTmp = SrcType->getAsReferenceType()) {
+ // FIXME: This shouldn't actually be possible, but right now it is.
+ SrcType = SrcTypeTmp->getPointeeType();
+ }
+ SrcType = Context.getPointerType(SrcType);
+ } else {
+ // C++ 5.2.10p1: [...] the lvalue-to-rvalue, array-to-pointer, and
+ // function-to-pointer standard conversions are performed on the
+ // expression v.
+ DefaultFunctionArrayConversion(SrcExpr);
+ SrcType = SrcExpr->getType();
+ }
+
+ // Canonicalize source for comparison.
+ SrcType = Context.getCanonicalType(SrcType);
+
+ bool destIsPtr = DestType->isPointerType();
+ bool srcIsPtr = SrcType->isPointerType();
+ if (!destIsPtr && !srcIsPtr) {
+ // Except for std::nullptr_t->integer, which is not supported yet, and
+ // lvalue->reference, which is handled above, at least one of the two
+ // arguments must be a pointer.
+ Diag(OpLoc, diag::err_bad_cxx_cast_generic, "reinterpret_cast",
+ OrigDestType.getAsString(), OrigSrcType.getAsString());
+ return;
+ }
+
+ if (SrcType == DestType) {
+ // C++ 5.2.10p2 has a note that mentions that, subject to all other
+ // restrictions, a cast to the same type is allowed. The intent is not
+ // entirely clear here, since all other paragraphs explicitly forbid
casts
+ // to the same type. However, the behavior of compilers is pretty
consistent
+ // on this point: allow same-type conversion if the involved are
pointers,
+ // disallow otherwise.
+ return;
+ }
+
+ if (DestType->isIntegralType()) {

At the moment, this needs to be DestType->isIntegralType() &&
!DestType->isEnumeralType(), because Clang currently considers
enumeration types to be integral types. (That's wrong for C++ but
might be right for C; it needs more investigation).

+ assert(srcIsPtr);
+ // C++ 5.2.10p4: A pointer can be explicitly converted to any integral
+ // type large enough to hold it.
+ if (Context.getTypeSize(SrcType) > Context.getTypeSize(DestType)) {
+ Diag(OpLoc, diag::err_bad_reinterpret_cast_small_int,
+ OrigDestType.getAsString());
+ }
+ return;
+ }
+
+ if (SrcType->isIntegralType() || SrcType->isEnumeralType()) {

Despite the fact that SrcType->isIntegralType() also includes
SrcType->isEnumeralType(), please leave the || in here so we don't
forget later on that we also want to check enumeration types :slight_smile:

+ // Unwrap the pointers. Terminate early if the types are completely
equal.
+ while (SrcType != DestType &&
+ (SrcTypeTmp = SrcType->getAsPointerType()) &&
+ (DestTypeTmp = DestType->getAsPointerType()))

When I added the qualification conversions, I put in the routine
Sema::UnwrapSimilarPointerTypes that should help simplify the control
flow of this loop. I suggest that you check it out. (It may also help
CheckConstCast, but perhaps not as much, since that loop is a bit
simpler).

+ {
+ deepPtr = true;
+ SrcType = SrcTypeTmp->getPointeeType();
+ DestType = DestTypeTmp->getPointeeType();
+ // Qualifiers must be identical at later levels.
+ if (SrcType.getCVRQualifiers() != DestType.getCVRQualifiers()) {

Hmmm, I don't think this is right. 4.4p4, which describes the implicit
qualification conversion, doesn't require later levels of have exact
qualifiers. For this implicit qualification conversion, if there was a
"const" at an earlier level, there needs to be a const at later
levels, and you can only add (not remove) qualifiers;
Sema::IsQualificationConversion encodes this implicit conversion.

One easy way to check whether a conversion casts away constness would
be to actually construct the types described in 5.2.11p8 and use
IsQualificationConversion... when that is false, the reinterpret_cast
is casting away constness.

+ Diag(OpLoc, diag::err_bad_cxx_cast_const_away, "reinterpret_cast",
+ OrigDestType.getAsString(), OrigSrcType.getAsString());
+ return;
+ }
+ }
+
+ if (SrcType == DestType) {
+ // See above for the similar check.
+ return;
+ }
+
+ if(deepPtr) {
+ // If we unwrapped more than one level, we're not dealing with function
or
+ // member pointers at the top level. This means we're done.
+ return;
+ }
+
+ // We have stripped away all indirections from at least one of the types.

Ah, this is a bit subtle. The checks that follow are skipped if we
have pointers-to-pointers. Personally, I think it would be cleaner if
the "casting away constness" check didn't modify SrcType and DestType;
then, pointers-to-pointers would just go through the tests below
without complaining, since pointers are object types.

+ // As for members, we don't have them yet.
+
+ if (SrcType->isObjectType() || SrcType->isVoidType()) {
+ if (DestType->isObjectType() || DestType->isVoidType()) {
+ // C++ 5.2.10p7: A pointer to an object can be explicitly converted
to
+ // a pointer to an object of different type.
+ // Void pointers are not specified, but supported by G++.
+ return;
+ }

EDG supports void pointers, too, and it seems obvious that they should
be allowed, but there is nothing in the standard or in the issues list
to imply that conversion to/from void pointers is okay.

+ if (DestType->isFunctionType()) {
+ // C++0x 5.2.10p8: Converting a pointer to a function into a pointer
to
+ // an object type or vice versa is conditionally-supported.
+ // Compilers support it in C++03 too, though, because it's necessary
for
+ // casting the return value of dlsym() and GetProcAddress().
+ if (!getLangOptions().CPlusPlus0x) {
+ Diag(OpLoc, diag::ext_reinterpret_cast_fn_obj);
+ }

This makes me wonder if we should have some kind of option for
conditionally-supported behavior. /me files that in the back of his
mind

It would be nice to have, in addition to the cxx-reinterpret-cast.cpp
test, another "-pedantic -std=c++98" version of the test that tests
the complaint about reinterpret_cast'ing pointer pointers to functions
and pointers to objects.

Thanks again for the patch! We're getting close... actually, the
const_cast bits could probably go in with a tiny amount of tweaking.
If you'd like to separate out those bits and submit/commit, please go
ahead; I'd like to look over a second round of the reinterpret_cast
stuff before it goes in, though.

  - Doug

Doug Gregor wrote:

Hmm... in the case of an error when type-checking the casts, we should
not actually build a CXXCastExpr node. Rather, Sema::ActOnCXXCasts
should return "true" so that the parser knows that there was an error.

I suggest that both CheckConstCast and CheckReinterpretCast return a
bool that is true when there is an error and false otherwise (this is
the scheme used elsewhere in Clang). Then, ActOnCXXCasts will never
build an ill-formed node in the AST (which is a good thing).
  
Is there any harm in building semantically invalid nodes ?
It will not affect clients that care about semantics since they will not analyse the AST if there are errors (there could be invalid decls).
The clients that only care about the textual representation of the program, and not about the semantics, will get seriously impacted by not building the nodes, e.g. a refactoring client will not be able to pickup the use of a variable because the expression that references it is not added to the AST.

To mirror the decls marked as "invalid" we could have exprs marked as "invalid" too.

-Argiris

Doug Gregor wrote:

  

+ case tok::kw_const_cast:
+ Op = CXXCastExpr::ConstCast;
+ CheckConstCast(OpLoc, Ex, DestType);
+ break;
+ case tok::kw_dynamic_cast:
+ Op = CXXCastExpr::DynamicCast;
+ break;
+ case tok::kw_reinterpret_cast:
+ Op = CXXCastExpr::ReinterpretCast;
+ CheckReinterpretCast(OpLoc, Ex, DestType);
+ break;
+ case tok::kw_static_cast:
+ Op = CXXCastExpr::StaticCast;
+ break;
    
Hrm. It's not your problem, but I think we'll end up refactoring the
cast hierarchy a bit. I really dislike having a CXXCastExpr node with
an enum inside it telling us which case it is, because the casts are
all *totally* different semantically. But, I'll make this change after
your const_cast and reinterpret_cast patches are in the tree.
  

Oh, absolutely! I left it the way it is now because it's irrelevant to the semantic checks I'm doing, but it was pretty clear the current scheme is insufficient.

+ DestType = Context.getCanonicalType(DestType);
+ QualType SrcType = SrcExpr->getType();
+ if (const ReferenceType *DestTypeTmp = DestType->getAsReferenceType()) {
+ if (SrcExpr->isLvalue(Context) != Expr::LV_Valid) {
+ // Cannot cast non-lvalue to reference type.
+ Diag(OpLoc, diag::err_bad_cxx_cast_rvalue,
+ "reinterpret_cast", OrigDestType.getAsString());
+ return;
+ }
    
Hmm... in the case of an error when type-checking the casts, we should
not actually build a CXXCastExpr node. Rather, Sema::ActOnCXXCasts
should return "true" so that the parser knows that there was an error.

I suggest that both CheckConstCast and CheckReinterpretCast return a
bool that is true when there is an error and false otherwise (this is
the scheme used elsewhere in Clang). Then, ActOnCXXCasts will never
build an ill-formed node in the AST (which is a good thing).
  

I had it that way originally, but Argiris told me to return the node regardless of the result of the check.

+ if (DestType->isIntegralType()) {
    
At the moment, this needs to be DestType->isIntegralType() &&
!DestType->isEnumeralType(), because Clang currently considers
enumeration types to be integral types. (That's wrong for C++ but
might be right for C; it needs more investigation).
  

OK.

+ // Unwrap the pointers. Terminate early if the types are completely
equal.
+ while (SrcType != DestType &&
+ (SrcTypeTmp = SrcType->getAsPointerType()) &&
+ (DestTypeTmp = DestType->getAsPointerType()))
    
When I added the qualification conversions, I put in the routine
Sema::UnwrapSimilarPointerTypes that should help simplify the control
flow of this loop. I suggest that you check it out. (It may also help
CheckConstCast, but perhaps not as much, since that loop is a bit
simpler).
  

I was planning to. Of course, the patch is older than your commit.

  

+ {
+ deepPtr = true;
+ SrcType = SrcTypeTmp->getPointeeType();
+ DestType = DestTypeTmp->getPointeeType();
+ // Qualifiers must be identical at later levels.
+ if (SrcType.getCVRQualifiers() != DestType.getCVRQualifiers()) {
    
Hmmm, I don't think this is right.

Indeed. I realized that after reading your work on IsQualificationConversion.
I'll just reuse your function here.

+ if(deepPtr) {
+ // If we unwrapped more than one level, we're not dealing with function
or
+ // member pointers at the top level. This means we're done.
+ return;
+ }
+
+ // We have stripped away all indirections from at least one of the types.
    
Ah, this is a bit subtle. The checks that follow are skipped if we
have pointers-to-pointers. Personally, I think it would be cleaner if
the "casting away constness" check didn't modify SrcType and DestType;
then, pointers-to-pointers would just go through the tests below
without complaining, since pointers are object types.
  

OK.

+ // As for members, we don't have them yet.
+
+ if (SrcType->isObjectType() || SrcType->isVoidType()) {
+ if (DestType->isObjectType() || DestType->isVoidType()) {
+ // C++ 5.2.10p7: A pointer to an object can be explicitly converted
to
+ // a pointer to an object of different type.
+ // Void pointers are not specified, but supported by G++.
+ return;
+ }
    
EDG supports void pointers, too, and it seems obvious that they should
be allowed, but there is nothing in the standard or in the issues list
to imply that conversion to/from void pointers is okay.
  

So, do we support this or not?

It would be nice to have, in addition to the cxx-reinterpret-cast.cpp
test, another "-pedantic -std=c++98" version of the test that tests
the complaint about reinterpret_cast'ing pointer pointers to functions
and pointers to objects.
  

OK.

Thanks again for the patch! We're getting close... actually, the
const_cast bits could probably go in with a tiny amount of tweaking.
If you'd like to separate out those bits and submit/commit, please go
ahead; I'd like to look over a second round of the reinterpret_cast
stuff before it goes in, though.
  

I'll just post another full version of the patch. With SVN, separating different parts is always a bit of a hassle.

Sebastian

Doug Gregor wrote:

Hmm... in the case of an error when type-checking the casts, we should
not actually build a CXXCastExpr node. Rather, Sema::ActOnCXXCasts
should return "true" so that the parser knows that there was an error.

I suggest that both CheckConstCast and CheckReinterpretCast return a
bool that is true when there is an error and false otherwise (this is
the scheme used elsewhere in Clang). Then, ActOnCXXCasts will never
build an ill-formed node in the AST (which is a good thing).

I had it that way originally, but Argiris told me to return the node
regardless of the result of the check.

Ah, interesting. Well, in that case, leave it as is and we can have a
separate discussion about this issue.

+ // As for members, we don't have them yet.
+
+ if (SrcType->isObjectType() || SrcType->isVoidType()) {
+ if (DestType->isObjectType() || DestType->isVoidType()) {
+ // C++ 5.2.10p7: A pointer to an object can be explicitly
converted
to
+ // a pointer to an object of different type.
+ // Void pointers are not specified, but supported by G++.
+ return;
+ }

EDG supports void pointers, too, and it seems obvious that they should
be allowed, but there is nothing in the standard or in the issues list
to imply that conversion to/from void pointers is okay.

So, do we support this or not?

Yes, and I'm going to ask for clarification or file an issue.

Thanks again for the patch! We're getting close... actually, the
const_cast bits could probably go in with a tiny amount of tweaking.
If you'd like to separate out those bits and submit/commit, please go
ahead; I'd like to look over a second round of the reinterpret_cast
stuff before it goes in, though.

I'll just post another full version of the patch. With SVN, separating
different parts is always a bit of a hassle.

Yes, it is. Looking forward to your next revision!

  - Doug

Doug Gregor wrote:

  

I'll just post another full version of the patch. With SVN, separating
different parts is always a bit of a hassle.
    
Yes, it is. Looking forward to your next revision!
  

Here you go. I split the check for casting away constness in its own function, since it will be needed for static and dynamic casts as well. Both const cast checker and the constness helper now use UnwrapSimilarPointers for the loop.

I integrated the test cases into the source tree, and added one for the edge case of casting away constness, i.e. that adding const at every level should work. There's probably even more complicated tests involving volatile that should be tested, but my mind refuses to handle more than two levels of pointers with qualifiers properly. :wink:

There's also a test case for the pedantic warning.

Now, with all your nice work on conversions, static_cast checking should be possible without too much effort.

Sebastian

cxx_cast.patch (25.2 KB)