[RFC] New file extension for compiling C++ for OpenCL sources

Hello,

Up to now, we have used the same file extension for OpenCL C and C++ for OpenCL
(https://clang.llvm.org/docs/UsersManual.html#cxx-for-opencl). But to keep
consistency with the clang driver interface, it makes more sense that C++ for OpenCL
has a different extension from OpenCL C. Mirroring C and C++ it would be logical to
support the following:

.cl - OpenCL C source file
.clcpp - C++ for OpenCL source file

I would like to share the review that adds a new file extension for C++ for OpenCL with
a wider audience: https://reviews.llvm.org/D96771. Feel free to give us any feedback
regarding the direction or implementation details.

Kind regards,
Anastasia

I mean, if .cl is the OpenCL C extension, logically the C++ extension should be .cxxl or .cppl

Thanks! I see. I think .cl was meant to represent Compute Language, as in OpenCL, so I would find it better if we don’t split cl… I think that “cl” indicates more intuitively that it is an OpenCL-specific file. We could, however, use cppcl?

That seems like a good compromise

Is there any prior art? As in, what other compilers do? Or is Clang the precursor here?

-Andrzej

I am not aware of non-clang implementations of OpenCL kernel language
parsing. There are some forks of clang customizing the default behaviour
though. But I am not aware of any customizing that specific functionality.

As an otherwise uninterested party, I’ll say that :

.cxxcl/.cppcl

and

.clcxx/.clcpp

are both about equally acceptable to me, barring prior art here. cxxl or cppl doesn’t seem to make sense to me (as you said below, ‘cl’ stands for ‘compute language’, not ‘C-something-starting-with-L’).

I personally have a slight preference for cxxcl/cppcl for purely aesthetic reasons though.

Interesting! :slight_smile:

IMHO both:
.cxxcl/.cppcl
and
.clcxx/.clcpp
make it clear what the files contain. +1 from me for either of these (or both).

Thank you for working on this Anastasia!

-Andrzej

FWIW, as a complete outsider, as a complete non-user of OpenCL (but a heavy user of C++), I don’t see why a new filename extension is a good thing. In the C++ world we’ve already had to deal with “same extension different language” for many years — you have a file named test.cpp, but you must still tell the compiler what language it’s written in:

  • via the driver name, e.g. clang vs. clang++
  • via the -std= option, e.g. -std=c++03 vs. -std=c++20

The compiler generally can’t guess just from the filename.

Is C++-with-OpenCL really so different from the situation with C++ today?

Is there any appetite to create a driver alias such as “clangcl”, analogous to what we do today with “clang++”?

my very small $.02,
–Arthur

Interesting observation.

Is C++-with-OpenCL really so different from the situation with C++ today?

Right now we only have one version but it is very plausible that more will appear in the future.

Do I understand it correctly that you suggest the filename suffixes are not that useful and we should abandon the idea?

Then should this apply to all or some languages?

Aside from determining the default compiler flags (even if not entirely),
I find extensions quite useful for indicating what language the sources are written in.

Is there any appetite to create a driver alias such as “clangcl”, analogous to what we do today with “clang++”?

We had no plans for this yet.

you have a file named test.cpp, but you must still tell the compiler what language it’s written in:

  • via the driver name, e.g. clang vs. clang++

I don’t have any problem getting ‘clang’ to compile a .cpp file as C++. My experience is that ‘clang++’ adds the c++ libraries to the linker command line, but that’s the only observable difference. (I’m not aware that we did anything special in this regard in our downstream project.)

–paulr

Is C++-with-OpenCL really so different from the situation with C++ today?

[…]
Do I understand it correctly that you suggest the filename suffixes are not that useful and we should abandon the idea?

If I ran the world, yes, definitely that is my opinion.
However, mine is quite likely to be an ill-informed opinion, because I don’t know basically anything about the OpenCL ecosystem or even OpenCL itself.
So you shouldn’t let me run the world. :slight_smile:

Then should this apply to all or some languages?

It depends on how much “C++ + OpenCL” is like a dialect of C++ and how much it’s like a brand-new language.

Objective-C++ is maybe in a vaguely similar situation, and it does get its own extension (.mm). But Objective-C++ is also:

  • explicitly a cross between two “equally first-class” languages Objective-C and C++, each of which already had its own extension (.m and .cpp/.cc)
  • perhaps thankfully moribund and maybe this is our chance to do something different?

Is there any appetite to create a driver alias such as “clangcl”, analogous to what we do today with “clang++”?

We had no plans for this yet.

I guess the typical reason to create a new driver is if you need it to do something special at link-time, when the original filenames won’t be available. E.g.
clang x.o y.o -o program.exe
and
clang++ x.o y.o -o program.exe
pass different flags to the linker. I don’t know if this situation applies to OpenCL.
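
To make the link-time point concrete, here is a minimal sketch in Python. It is a toy model, not clang's actual logic: the `link_command` helper, the `EXTRA_LINK_LIBS` table, the `ld` linker name and the `-lstdc++` library name are all assumptions chosen for illustration.

```python
import os

# Hypothetical table: what each driver name adds at link time.
# Assumption: the C++ driver adds a C++ standard library; the real
# library name depends on the platform and toolchain.
EXTRA_LINK_LIBS = {
    "clang": [],
    "clang++": ["-lstdc++"],
}


def link_command(driver, inputs, output):
    """Build a (toy) linker invocation from object files only."""
    for path in inputs:
        # Object files all share one extension, so the original source
        # language cannot be recovered from the file names here.
        if os.path.splitext(path)[1] != ".o":
            raise ValueError(f"{path}: expected an object file")
    # The driver name is the only remaining hint about the language.
    return ["ld", *inputs, "-o", output, *EXTRA_LINK_LIBS[driver]]
```

Calling `link_command("clang", ["x.o", "y.o"], "program.exe")` and the same with `"clang++"` yields two different command lines for identical inputs, which is the case where a separate driver name earns its keep.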

–Arthur

I'm also an OpenCL outsider.

However, if OpenCL files are to be detected by a build system like (but not limited to) CMake, then a file extension is a good thing. Consider this CMake code:

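# The OBJC and OBJCXX languages need CMake 3.16 or newer.
cmake_minimum_required(VERSION 3.16)

project(testproj C CXX OBJCXX OBJC)

# One library built from sources in four different languages.
add_library(test
  test.cpp
  test.c
  test.m
  test.mm
)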

CMake detects the source-language of each source file and uses the appropriate driver and language compile option for each one.

With a file extension for OpenCL, CMake could use the appropriate driver/option for that language too. Without a distinct file extension for it, the user would have to tell CMake what the language is, which is inconvenient for the user.
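
As a rough illustration of that detection step, here is a small Python sketch. It is not how CMake is implemented. The `LANGUAGE_BY_EXTENSION` table and the `group_by_language` helper are hypothetical, and the two OpenCL rows are an assumption about what a build system could add once the extensions from this thread exist.

```python
from collections import defaultdict
from pathlib import Path

# Extension -> language. The first four rows mirror the CMake example;
# the last two are assumptions based on the extensions proposed here.
LANGUAGE_BY_EXTENSION = {
    ".c": "C",
    ".cpp": "CXX",
    ".m": "OBJC",
    ".mm": "OBJCXX",
    ".cl": "OpenCL C",
    ".clcpp": "C++ for OpenCL",
}


def group_by_language(sources):
    """Group source files by the language implied by their extension."""
    groups = defaultdict(list)
    for src in sources:
        language = LANGUAGE_BY_EXTENSION.get(Path(src).suffix)
        if language is None:
            # Without a known extension the user has to say what it is.
            raise ValueError(f"{src}: unknown extension, set the language explicitly")
        groups[language].append(src)
    return dict(groups)
```

With a shared `.cl` extension for both languages, the table could only ever return one of them, and the other would always need a per-file override.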

Thanks,

Stephen.

It depends on how much “C++ + OpenCL” is like a dialect of C++ and how much it’s like a brand-new language.

Objective-C++ is maybe in a vaguely similar situation, and it does get its own extension (.mm). But Objective-C++ is also:

  • explicitly a cross between two “equally first-class” languages Objective-C and C++, each of which already had its own extension (.m and .cpp/.cc)

  • perhaps thankfully moribund and maybe this is our chance to do something different?

I think the comparison to Objective-C++ is probably the best way to position
C++ for OpenCL. OpenCL C also had a distinct extension from C, so I find it
natural to add a separate one for C++ for OpenCL. I guess the advantage is that
even if the language evolves and more versions are added, we can still choose a
default that works for most cases. For example, we can always reset it to the
latest or the most used one, although I appreciate this doesn’t always happen:
for OpenCL C we still use the first version as the default. However, I do admit
it has most of the common functionality.

There are also other useful aspects that are not compiler related, like build
systems or syntax highlighting, that can use the file extensions. It makes
sense if they work with the same standardized extension rather than using
different ones.

Good point - another similar example would be syntax highlighting
in IDEs or editors, which frequently use the file extensions too.

I think that there are two things. First, the file extensions on their own. I think that it's fine to introduce e.g. .clcxx/.clcpp and make them map to e.g. TY_CL. This won't change much from the perspective of the driver, but end-users could start using and getting used to it. And adoption in IDEs/CMake could start as well.

The other thing is internal representation in clangDriver. Do we need to introduce TY_CLXX on top of TY_CL? I get the impression that TY_CL should be fine for now. There is TY_C and TY_CXX and you will find many places in the driver where the input type (e.g. TY_C vs TY_CXX vs TY_Fortran) affects the logic (e.g. which headers/libs/search-paths to include).

For Fortran, extensions are used to differentiate between:
   * fixed-form, free-form
   * pre-processed, not-pre-processed
   * various language standards
(personally I find this very confusing). So the extension matters a lot and affects various stages of the compilation (e.g. preprocessing, parsing & sema). Some of the resulting logic lives in clangDriver, some of it in the new Flang frontend driver. Do you think that we will need such logic to differentiate between C and C++ OpenCL source files?
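
For readers unfamiliar with the Fortran situation, here is a minimal Python sketch of the first two distinctions. It assumes the common gfortran-style convention (lower-case suffix means no preprocessing, upper-case suffix means preprocess) and only a handful of suffixes. The `classify_fortran` helper is hypothetical and is not taken from clangDriver or Flang, and it ignores the language-standard aspect entirely.

```python
from pathlib import Path

# Assumption: gfortran-style suffix convention, reduced to a few cases.
FIXED_FORM = {"f", "for", "ftn"}
FREE_FORM = {"f90", "f95", "f03", "f08"}


def classify_fortran(filename):
    """Return (source_form, needs_preprocessing) for a Fortran file name."""
    ext = Path(filename).suffix.lstrip(".")
    lowered = ext.lower()
    if lowered in FIXED_FORM:
        form = "fixed"
    elif lowered in FREE_FORM:
        form = "free"
    else:
        raise ValueError(f"{filename}: not a recognised Fortran suffix")
    # An upper-case suffix (e.g. .F90) asks for the preprocessor to run.
    needs_preprocessing = ext.isupper()
    return form, needs_preprocessing
```

Two independent properties are packed into one suffix here, which is the kind of logic the question above asks about for OpenCL.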

I get the impression that in order to support what's currently required, we don't need TY_CLXX just yet. But I'm also an OpenCL outsider and I might be missing something here :slight_smile:

-Andrzej

FYI in my initial patch (https://reviews.llvm.org/D96771), I am adding TY_CLXX,
but it was done mainly for the orthogonality rather than any functionality.

In the long term, I imagine OpenCL C and C++ for OpenCL can start diverging
in their setup, and then we would need more changes in the driver. The only
example I can think of right now is the standard libraries. However, we don’t
add any special new libraries for C++ mode yet. Do you think it is better to leave
this functionality out until we have an actual use case?

As for the other extensions (e.g. header files, pre-processed output, etc.), we haven’t
used any different ones from the standard C/C++ ones up to now. I imagine they are
not extremely popular, and I also find that adding too many of them makes things
complicated and confusing. So I don’t imagine we would want to add them to the
driver.

I think that TY_CLXX could be a bit confusing at this point. It makes no real difference in the compiler driver at this moment (i.e. OpenCL C and OpenCL C++ are treated identically regardless).

Separately, in the frontend driver, you can set the language to `OpenCLCXX` based on the file extension. This way you get rid of the need for `-cl-std=clc++`, as OpenCL C++ files can be identified based on the extension in the frontend. Basically what you do in your patch :slight_smile:

So yes, I would leave TY_CLXX out for now. [*]

As for file extensions - this change makes `-cl-std=clc++` a rather obscure flag (for e.g. compiling *.cl files as OpenCL C++ files). I think that what you are proposing is a great improvement, thank you!

FWIW, there are way too many flags in Options.td. Less is more :slight_smile:

-Andrzej

[*] Unless I've missed something and it's actually needed.

Just to summarize - it seems that both ‘.cppcl’ and ‘.clcpp’ would be viable
options.

While this survey shows that the preference is slightly towards ‘.cppcl’, we
have discussed this further on the review (https://reviews.llvm.org/D96771) and
it seems ‘.clcpp’ would be more in alignment with ‘-cl-std’ option that accepts
“cl” for OpenCL C and “clc++” for C++ for OpenCL. Therefore the proposed
interface would be as follows:

  • OpenCL C has default ‘-cl-std=cl’ and file extension ‘.cl’.
  • C++ for OpenCL has default ‘-cl-std=clc++’ and file extension ‘.clcpp’.
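
The two defaults above can be sketched as a small lookup in Python. This only models the proposed interface and is not the driver code from the review. The `DEFAULT_CL_STD` table and the `extra_flags` helper are hypothetical names, and only the flag values mentioned in this thread are used.

```python
from pathlib import Path

# Default -cl-std value per extension, as proposed in the two bullets.
DEFAULT_CL_STD = {
    ".cl": "cl",
    ".clcpp": "clc++",
}


def extra_flags(filename, wanted_std):
    """Return the flags needed to compile `filename` as `wanted_std`."""
    default = DEFAULT_CL_STD[Path(filename).suffix]
    # If the extension already implies the wanted standard, no flag is needed.
    if default == wanted_std:
        return []
    return [f"-cl-std={wanted_std}"]
```

Under this model a `.clcpp` file compiled as C++ for OpenCL needs no extra flag, while a `.cl` file compiled the same way still needs `-cl-std=clc++`.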

Note that the extension ‘.clc++’ would be even more aligned with ‘-cl-std’;
however, clang uses ‘.cpp’ as the main extension for C++.

Let me know if you have any more input.

Thanks,
Anastasia