Questions/discussions about cast types in clang

Hi all,

I’m currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I’m almost done, but I’ve ran into a collection of issues/questions that I’d like to raise here to make sure it’s not a misunderstanding on my part. My solutions are included in each bullet point:

  1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template void foo() { int* ptr; (T*) ptr; }

If we call foo, it can be a const_cast, but if we call foo, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won’t only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

  1. (Issue/bug) In OperationKinds.def, there is an enum detailing casting to union types. This doesn’t work in clang(nor GCC). Here’s a compiler explorer MVCE: https://godbolt.org/z/rp8DDt. The first statement is not a cast, but rather a CompoundLiteralExpr(InitListExpr(…)), and should not be treated as a cast. Here is the corresponding GCC document: https://gcc.gnu.org/onlinedocs/gcc/Cast-to-Union.html about union casts. If CastKind::CK_ToUnion is ever encountered in my program, I will leave the C-style cast as is.

  2. (Question) In general, should this tool for C++ also support OpenCL and Objective C++? Technically, const/static/reinterpret_casts exist in those languages (see https://godbolt.org/z/DEz8Rs for example). In my opinion, I think the casting tool should strictly adhere to C++ as the language of support and nothing more for the first iteration because different coding standards may be held for different languages(and I only have ample experience with C++). As a result, I propose that the first iteration of the casting tool be placed outside of clang-tidy as clang-tidy also supports Objective C (but not OpenCL), as seen here: https://clang.llvm.org/extra/clang-tidy/#id2. In the main driver I will explicitly check whether the program is in C++.

I’d appreciate any questions/concerns, or oversight on my part on the 3 issues above. Thanks!

Best,
Ray Zhang

Hi all,

I’m currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I’m almost done, but I’ve ran into a collection of issues/questions that I’d like to raise here to make sure it’s not a misunderstanding on my part. My solutions are included in each bullet point:

  1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template void foo() { int* ptr; (T*) ptr; }

If we call foo, it can be a const_cast, but if we call foo, it has to be a reinterpret_cast.

It could even involve both a const_cast and a static_cast; or it could be unrepresentable as a non-C-style cast (e.g. a cast to a private base).

In a situation where the user is writing library code, it won’t only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

Sounds reasonable. Although, it would be nice to flag such casts for human inspection somehow; see my very informal proposal here for example.

  1. (Issue/bug) In OperationKinds.def, there is an enum detailing casting to union types. This doesn’t work in clang(nor GCC). Here’s a compiler explorer MVCE: https://godbolt.org/z/rp8DDt. The first statement is not a cast, but rather a CompoundLiteralExpr(InitListExpr(…)), and should not be treated as a cast. Here is the corresponding GCC document: https://gcc.gnu.org/onlinedocs/gcc/Cast-to-Union.html about union casts. If CastKind::CK_ToUnion is ever encountered in my program, I will leave the C-style cast as is.

It works in GCC trunk for me… but only in C++20 mode.
https://godbolt.org/z/Qjpxga

I suspect that your analysis is still correct: this feature is documented, but does not work, in GCC pre-C++20, and then in C++20 the changes around parens-initialization of aggregates broke it in such a way that now it appears to work but nobody should trust it to actually work. It might be nice to submit a patch ripping this broken stuff out of Clang and/or out of GCC.

my $.02,
–Arthur

Hi Arthur,

First of all, it’s an honor. I remember watching your “Allocator is a handle to a heap” c++now talk a while ago :slight_smile:

Second, you’re right - I forgot about DerivedToBase or BaseToDerived in the above example. Regarding the statement regarding unpresentable as a non-C-style cast, in the standard it breaks down C-style casts as a cascading attempt to use const/static/reinterpret, or combinations of them. Here’s an example of private base: https://godbolt.org/z/5STynJ.

Third, I didn’t check for C++20 for union casts, but I’m guessing it “working” is a side-effect of the parens-initialization of aggregates allowing narrowing conversions. Here’s a link that shows that the example in the C++20 mode you showed is not equivalent to a C-style cast assignment which is just an alias for (union U) {.x = x}: https://godbolt.org/z/7LJCQy. I will leave the union c-style casts as is for now(as clang trunk can’t reproduce this even), and in the future hope to contribute a patch to remove this. Thank you for your input!

Best,
Ray

Hi Arthur,
First of all, it’s an honor. I remember watching your “Allocator is a handle to a heap” c++now talk a while ago :slight_smile:

Hey, thanks! :slight_smile: Don’t take anything I say as gospel when it comes to cfe-dev, though. :slight_smile:

Second, you’re right - I forgot about DerivedToBase or BaseToDerived in the above example. Regarding the statement regarding unpresentable as a non-C-style cast, in the standard it breaks down C-style casts as a cascading attempt to use const/static/reinterpret, or combinations of them. Here’s an example of private base: https://godbolt.org/z/5STynJ.

Yes, but notice where cppreference says “static_cast with exceptions”; I was talking about the exceptions.
Here’s an example where C-style cast and reinterpret_cast do different things, but any other kind of cast (e.g. static_cast) is ill-formed.
https://godbolt.org/z/if7Pis

Clang gives a warning suggesting that you replace the reinterpret_cast with a static_cast, but of course if you actually did that, the code wouldn’t compile anymore.

Third, I didn’t check for C++20 for union casts, but I’m guessing it “working” is a side-effect of the parens-initialization of aggregates allowing narrowing conversions. Here’s a link that shows that the example in the C++20 mode you showed is not equivalent to a C-style cast assignment which is just an alias for (union U) {.x = x}: https://godbolt.org/z/7LJCQy.

Ah, yes, you’re right. With (U)y, GCC is calling the same pseudo-constructor as U(y), which is supposed to just implicitly convert the float to an int and initialize the first member of the union, completely ignoring the fact that y is a float
This appears to be actually the correct behavior in C++20, even though it seems pretty crazy. (In other words, C++20’s new union syntax doesn’t follow the same overload-resolution-ish rules as C++17’s std::variant.) Textual support:
http://eel.is/c++draft/expr.cast#4 says that (U)y should behave the same as static_cast(y) if possible, and

http://eel.is/c++draft/expr.static.cast#4 says that static_cast(y) should behave the same as U(y), which (new in C++20) should behave the same as U{.x = y}.

Here’s a pared-down example where the two fields of the union have utterly different types, so the silent implicit conversion fails instead of succeeding. https://godbolt.org/z/9wVX6B
Apparently Clang just doesn’t implement this C++20 rule yet; once it does, Clang’s behavior should end up matching GCC’s.

…Oh, and wait a minute! The first sentence of the documentation you linked says, “A cast to a union type is a C extension not available in C++.” So that explains why neither of us could reproduce the documented behavior! :slight_smile: Now here’s a Godbolt showing the same code compiled as C++ (where it initializes the first field of the union) and C-with-extensions (where it initializes the second field because that field is a float).
https://godbolt.org/z/heN2Fb

And here’s Clang, reproducing GCC’s behavior in C-with-extensions but failing to implement C++20 yet:
https://godbolt.org/z/dY6ti8

–Arthur

Hi Arthur,

Regarding the static_cast extension - I see how I can reproduce that now. Thank you! For each DerivedToBase cast type, I’ll check to see whether the base class is inaccessible, and leave the C-style cast as-is.

:facepalm: at not reading the first sentence of that GCC document. I had the assumption that all C extensions were available in C++ for backwards compatibility, but today I learned. Sorry about the wild goose-chase! The enum CastKind::CK_ToUnion is thus referring to a feature from a C extension not yet available(will it ever be available?) in C++, I’ll try to document it in my tool and later in the enum definitions.

Learned a lot from our discussion, thanks so much!

Best,
Ray

Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

Makes sense. You shouldn't be trying to transform instantiations of templates. Perhaps issuing a warning as Arthur wrote makes sense though.

3. (Question) In general, should this tool for C++ also support OpenCL and Objective C++? Technically, const/static/reinterpret_casts exist in those languages (see Compiler Explorer for example). In my opinion, I think the casting tool should strictly adhere to C++ as the language of support and nothing more for the first iteration because different coding standards may be held for different languages(and I only have ample experience with C++). As a result, I propose that the first iteration of the casting tool be placed outside of clang-tidy as clang-tidy also supports Objective C (but not OpenCL), as seen here: Clang-Tidy — Extra Clang Tools 18.0.0git documentation. In the main driver I will explicitly check whether the program is in C++.

Clang-tidy has plenty of checks which are C++-specific.

You can make your check c++-specific too:

bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
return LangOpts.CPlusPlus;
}

clang-tidy is still an appropriate place for your check.

Thanks,

Stephen.

Hi Stephen,

Thank you for the suggestions! And I see the option, I will aim to contribute it inside of clang-tidy :slight_smile:

By the way, I have encountered an issue when trying to address the cast type CastKind::CK_BuiltinFnToFnPtr. According to this demo, it is impossible to pass in a builtin function as a function pointer. In the comments here, it says “only available from the callee of a call expression”. It seems like I am passing in the function pointer as the callee in my demo, no?

Best,
Ray