[RFC] Solution for preserving the address space of ‘this’ in C++ methods

Hello,

I am trying to sort out the use of address spaces in C++ classes in Clang, and I would like to get your feedback about the following problem.

Every non-static C++ method implicitly receives a pointer to the object on which it is invoked (aka ‘this’). Clang adds it implicitly while parsing a class. On architectures that use address spaces, objects might be instantiated in different address spaces. For example, in OpenCL we may wish to construct objects in a specific address space, e.g. __local, that is shared between a group of threads. There is no syntax in core C++ for specifying the address space of ‘this’ and, therefore, the compiler can’t provide an implementation that preserves the address space of an object.

I would like to implement an extension that allows specifying an address space qualifier on a method, to signal that the method is to be used with objects constructed in that address space.

Here is an example:

1 struct C {
2   void foo() __attribute__((address_space(1)));
3   void foo();
4 };

If an instance ‘obj’ of struct ‘C’ is allocated in address space ‘1’, a call to ‘obj.foo()’ will resolve to the method on line 2; if ‘obj’ is not allocated in a specific address space, the method on line 3 will be used. In essence, this is overloading based on the address space qualifiers of a method.

For languages with explicit address space semantics, like OpenCL, it would look like:

struct C {
  void foo() __local;
  void foo();
};
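For comparison, standard C++ already selects between member functions based on the cv-qualification of the object, and the proposed address space overloads would be resolved analogously. The following is only a portable sketch of that existing mechanism: ‘const’ stands in for an address space qualifier, and the helper names call_on_plain and call_on_const are made up for illustration.

```cpp
#include <cassert>
#include <string>

struct C {
    // Chosen for const objects; stands in for an address-space-qualified overload.
    std::string foo() const { return "const"; }
    // Chosen for objects without the qualifier.
    std::string foo() { return "plain"; }
};

// Hypothetical helpers: each calls foo() on an object with a different qualification.
inline std::string call_on_plain() {
    C obj;
    return obj.foo();
}

inline std::string call_on_const() {
    const C obj{};
    return obj.foo();
}
```

Here call_on_plain() picks the unqualified overload and call_on_const() picks the const one, purely from the type of the object expression. No address space is involved in this sketch.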

This approach has one problem, however: if the implementation of a method does not differ between address spaces, i.e. the qualification is only there to preserve the address space of ‘this’, multiple copies of the same method have to be defined. Example:

struct C {
  int a;
  int geta() __attribute__((address_space(1))) {return a;}
  int geta() {return a;}
};

A better solution, suggested by John McCall [2], would extend the template syntax so that methods can be parameterized on the address space of ‘this’. It would require a larger language change, however, and would take longer to become available to users of Clang.

The current plan is as follows:

  1. Implement the aforementioned language extension to allow the overloading of methods qualified with an address space. A prototype implementation of this is available in [1]. If there are no objections here I am happy to work on the patch for this. I am quite confident it can be progressed quickly considering that the general overloading mechanism with method qualifiers is already in place. Let me know if I might have missed something though.

  2. Work towards a longer term solution, potentially utilizing templates to reduce code repetition and make address spaces in C++ a more usable feature. I would appreciate some assistance either with ideas on language design or with prototyping/drafting the language spec proposal.

Feedback request:

  • Please let me know if there are any objections to implementing option 1 and if you have any details to be considered.

  • Please let me know if you are interested in being involved with the language design for option 2. I would particularly like to hear from people in the general C++ community, whether or not they use address spaces, as my background is mainly in the GPU/OpenCL area.

[1] https://github.com/KhronosGroup/SPIR/commit/0cfb6f7533d42ae3398a8574efd8abbdac88ac44

[2] https://reviews.llvm.org/D54862#1308125

Thanks in advance,

Anastasia

Hi Anastasia,

I find this terrifying. Using const method qualifiers correctly already leads to silly code repetition. I’d like to entirely remove volatile qualified methods from C++ (see wg21.link/p1152). Your proposal adds an unbounded number of qualifiers, and the code duplication this entails seems nonsensical.

Do you foresee user code ever changing meaningfully based on address space?
Or is this just a trick to get different codegen around this?

I suspect you’re trying to achieve the latter. If so, qualifiers on methods just don’t seem like the right language-level solution.

Address spaces on variables seem totally sane, and I agree that you want encapsulation, and somehow tracking address space on this makes sense. So I’m not saying I disagree with your goal. I’m saying method qualifiers seem like a terrible fit.

I think we need a completely different language solution which allows the same C++ code to codegen differently based on variable address space.

JF

I think the code duplication at the object code level is unavoidable: we need to generate different code for the different overloads here. To solve the problem of duplication at the source level, I think we should wait for http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0847r1.html rather than inventing our own thing.
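To make the source-level idea concrete, here is a rough sketch of how a single definition can serve several object qualifications when the object type is deduced. It emulates the P0847 approach with a static member template so that it compiles without explicit object parameter support; the static geta below is an illustrative stand-in, ‘const’ again substitutes for an address space qualifier, and nothing here is Clang's address space implementation.

```cpp
#include <cassert>
#include <type_traits>

struct C {
    int a = 0;

    // One definition instead of one copy per qualifier.
    // With P0847 the parameter would be declared as `this Self&& self`
    // and the function would be called as obj.geta().
    template <class Self>
    static auto geta(Self& self) -> decltype((self.a)) {
        return self.a;  // int& for a plain object, const int& for a const one
    }
};
```

With this sketch, C::geta(obj) yields an int& when obj is a plain C and a const int& when obj is a const C, so the qualification of the object is carried through to the result without a second definition.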

Hi Richard,

Thanks for sending this interesting reference. I will look at it in more detail and try to see if it covers our requirements for address spaces. At the same time, if there are other ideas that make sense and can work better, I guess it’s worth exploring the options.

Cheers,

Anastasia

Hi JF,

> Do you foresee user code ever changing meaningfully based on address space?
> Or is this just a trick to get different codegen around this?

I can imagine that code can be written differently for different memory segments. For example, if caches are used you might want to exploit data locality, or if memory banks are used you might want to exploit memory coalescing.

But my biggest problem at the moment is that Clang can’t even preserve the address spaces of ‘this’ to make C++ for architectures with memory segments usable. We can’t always rely on the optimizers to put objects in the right memory or perhaps we don’t even always have such strong optimizer support.

> Address spaces on variables seem totally sane, and I agree that you want encapsulation, and somehow tracking address space on this makes sense. So I’m not saying I disagree with your goal. I’m saying method qualifiers seem like a terrible fit.

I have been looking at several options and we had some discussions with John too. Therefore I am suggesting to work on a second option as well that will target the situation where no manual tuning of the method definition for a specific memory segment is required. But I imagine this will take quite a while and I would like to offer some workable solution to the Clang users in the meantime (hopefully soon!).

> I think we need a completely different language solution which allows the same C++ code to codegen differently based on variable address space.

Do you have any specific idea in mind? I am happy to explore the options further.

Thanks,

Anastasia

Hi Richard,

Thanks again for sharing this. It seems like this technique should generally work for address spaces, although I am not very convinced about the superfluous template instantiation problem described at the end of the report. As address spaces are used widely on architectures with memory segments (either explicitly or implicitly), I would quite like to offer a solution that avoids unnecessary code duplication at both the source level and the object code level (at least by avoiding extra template instantiations caused by differences in a type that are unrelated to address space qualifiers).
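The instantiation concern can be illustrated with a small experiment. This is a hedged sketch only: it uses an ordinary static member template as a stand-in for a deduced object parameter, and the names D, instantiations and count_instantiations are invented for the illustration. Each instantiation registers the address of its own static marker, so the size of the set is the number of distinct instantiations that were used.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

struct C {
    int a = 0;

    // Registry of markers, one per instantiation of geta that has run.
    static std::set<const void*>& instantiations() {
        static std::set<const void*> seen;
        return seen;
    }

    // Stand-in for a member whose object type is deduced.
    template <class Self>
    static int geta(Self& self) {
        static char marker;               // distinct object in every instantiation
        instantiations().insert(&marker);
        return self.a;
    }
};

struct D : C {};  // differs from C in a way unrelated to any qualifier

inline std::size_t count_instantiations() {
    C c;
    const C cc{};
    D d;
    C::geta(c);   // Self = C
    C::geta(cc);  // Self = const C
    C::geta(d);   // Self = D
    C::geta(c);   // reuses the Self = C instantiation
    return C::instantiations().size();
}
```

In this sketch the derived type D triggers its own instantiation even though the body is identical to the one for C, which is the kind of duplication unrelated to qualifiers that the paragraph above is worried about.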

The method qualifiers approach is quite appealing because it's cheap enough to implement and it's completely free from any unnecessary code duplication during compilation. So it still seems to me like the best available option at this very moment. In the long term I would, however, quite like to explore John's idea of templating on address space qualifiers, which should eliminate both problems with the replication.

Btw, what is our general strategy for prototyping/implementing new proposals to the spec? At what stage would it be acceptable?

Cheers,
Anastasia