Allow implicit copy constructor between address spaces in C++

Hello,

It is completely legit to initialize (copy) between address spaces in C while it fails in C++. This makes address spaces completely unusable in C++, at least I don't see any way to copy initialize in C++.

The reason being is I am trying to use OpenCL++ (OpenCL in C++11 mode), which works pretty fine [minor changes to Clang], except being unable to initialize from __global to __private.

Please consider following code:

// RUN: %clang_cc1 -fsyntax-only -verify %s
struct Point {
  float x, y;
};

typedef struct Point Point;
typedef struct Point Point_1 __attribute__((address_space(1)));

float test(int i, const Point_1 *a) {
  Point r = a[i];
  return r.x * r.y;
}

Running it in `-x c` works fine, running with `-x c++ -std=c++11` generates:

  File test.cc Line 3: candidate constructor (the implicit copy constructor) not viable: 1st argument ('const Point_1' (aka 'const __attribute__((address_space(1))) Point')) is in address space 1, but parameter must be in address space 0
  File test.cc Line 3: candidate constructor (the implicit move constructor) not viable: 1st argument ('const Point_1' (aka 'const __attribute__((address_space(1))) Point')) is in address space 1, but parameter must be in address space 0
  File test.cc Line 3: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 1 was provided

Can anyone point me any workaround, or how to fix it in Clang code? I believe this has to be relaxed in SemaOverload.cpp TryCopyInitialization, but I wish to not try solution that wouldn't be accepted into upstream.

FYI I have tried reinterpret_cast hinted in tests/SemaCXX/address-space-conversion.cpp:

  Point r = reinterpret_cast<const Point *>(a)[i]

But I believe emitted ll is INVALID, since @llvm.memcpy looses address space qualifier for source pointer:

  %7 = getelementptr inbounds %struct.Point* %6, i64 %4
  %8 = bitcast %struct.Point* %r to i8*
  %9 = bitcast %struct.Point* %7 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %8, i8* %9, i64 8, i32 4, i1 false)

Comparing to ll emitted in plain C mode:

  %6 = getelementptr inbounds %struct.Point addrspace(1)* %5, i64 %4
  %7 = bitcast %struct.Point* %r to i8*
  %8 = bitcast %struct.Point addrspace(1)* %6 to i8 addrspace(1)*
  call void @llvm.memcpy.p0i8.p1i8.i64(i8* %7, i8 addrspace(1)* %8, i64 8, i32 4, i1 false)

Regards,

It looks correct to reject this. Pointers to different address spaces can't be used interchangeably - they can require different instructions to load and store, depending on the architecture, and for disjoint address space there may be no mechanism for translating an address from one to the other.

For C, this is fine because we just emit an llvm.memcpy from one address space to the other[1], which can be turned into a sequence of loads from one address space and stores to the other.

For C++, with a non-trivial copy constructor, you'd have to provide a copy constructor for each possible source address space. With a the default copy constructor, you'd have to teach clang that it should be providing a default copy constructor for each possible address space. I'm not sure this would even be the correct thing to do, because some classes may not support copying between address spaces and so there's a lot of complexity that you'd have to think about before proposing a correct model for the language to support.

David

[1] SelectionDAG then fairly consistently does the wrong thing for this in certain cases, and I really should get around to cleaning up and upstreaming the patches to fix it.

David wrote:

It looks correct to reject this. Pointers to different address spaces can't be used interchangeably - they can require different instructions to load and store, depending on the architecture, and for disjoint address space there may be no mechanism for translating an address from one to the other.

Agreed. I understand that:
__attribute__((address_space(1)) int *p_1 = ...;
__attribute__((address_space(2)) int *p_2 = p_1;
should be treated as invalid whatever you do.

For C, this is fine because we just emit an llvm.memcpy from one address space to the other[1], which can be turned into a sequence of loads from one address space and stores to the other.

Again. This is clear too.

For C++, with a non-trivial copy constructor, you'd have to provide a copy constructor for each possible source address space.

Yes, however since there is absolutely NO way to force plain C behavior from C++, it is now impossible to define such constructor.
[1] At least I don't know any way to trigger non-constructor plain C copy on C++ struct/class that will emit proper @llvm.memcpy.

With a the default copy constructor, you'd have to teach clang that it should be providing a default copy constructor for each possible address space.

Yes this is kind of solution, but it would work ONLY for classes having no own copy constructor, which is pretty limiting.

I'm not sure this would even be the correct thing to do, because some classes may not support copying between address spaces and so there's a lot of complexity that you'd have to think about before proposing a correct model for the language to support.

Agreed. I think some explicit way of copying[1] (omitting C++ constructors) would be sufficient for a start.

Regards,

FYI. Continuing my effort I have added "pod_assign" variable attribute, that triggers POD assign rather than constructor on object (behavior described in previous post). There was nothing better that came to my mind to circumvent problems described previously.

  https://github.com/ujhpc/clang/commits/patch/pod-assign-attr
  https://github.com/ujhpc/clang/commits/patch/openclxx

Here is working code:

struct Point {
  float x, y;
};
typedef struct Point Point;

float test(int i, // index
           __attribute__((address_space(1))) const Point *in,
           __attribute__((address_space(1))) Point *out) {
  __attribute__((pod_assign)) Point r = in[i];
  out[i] = r;
  return r.x * r.y;
}

Now both C++ and C modes generate identical ll assembly. Now it is time to test whether using C++ goodness will produce runnable SPIR kernels.

Regards,

I'm not sure an attribute makes sense if you're looking for C-style assignment.

Try adding a builtin instead:

Builtins.def:

BUILTIN(__builtin_assign, "v.", "t")

SemaChecking.cpp:

     case Builtin::BI__builtin_assign:
       if (checkArgCount(*this, TheCall, 2)) return true;
       return CreateBuiltinBinOp(TheCall->getLocStart(), BO_Assign, TheCall->getArg(0), TheCall->getArg(1));

Usage, ignoring harmless asserts:

#ifdef __cplusplus
#define assign(x,y) __builtin_assign(x,y)
#else
#define assign(x,y) (x = y)
#endif

assign(r, a[0]);

On the other hand there's an expectation that valid C often builds in C++ mode so it might be worth using CXXRecordDecl::isCLike() to support the C syntax directly in C++ as an extension.

Alp.

Try adding a builtin instead: (…)

Thanks a lot, that looks like definitively better idea.

Try adding a builtin instead: (…)

Thanks a lot, that looks like definitively better idea.

If that interface is ok, it seems like you should be able to implement it as a normal function template, with no compiler extensions.

Unfortunately it doesn’t work for me. Clang throws an assertion about what I should do, while I am trying do exactly opposite :>

clang-3.5: /home/ono/Projects/llvm/tools/clang/lib/AST/ExprClassification.cpp:619: clang::Expr::Classification::ModifiableType IsModifiable(clang::ASTContext&, const clang::Expr*, clang::Expr::Classification::Kinds, clang::SourceLocation&): Assertion `(E->getObjectKind() == OK_ObjCProperty || !Ctx.getLangOpts().CPlusPlus) && "C++ struct assignment should be resolved by the " "copy assignment operator."' failed.

Code I am testing on:

typedef struct Point { float x, y; } Point;
float test(int i, //
           __attribute__((address_space(1))) const Point *in,
           __attribute__((address_space(1))) Point *out) {
  Point r;
  __builtin_assign(r, in[i]);
  out[i] = r;
  return r.x * r.y;
}

Change proposed by Alp:

--- a/include/clang/Basic/Builtins.def
+++ b/include/clang/Basic/Builtins.def
@@ -1204,6 +1204,7 @@ BUILTIN(__builtin_smulll_overflow, "bSLLiCSLLiCSLLi*", "n")

// Clang builtins (not available in GCC).
BUILTIN(__builtin_addressof, "v*v&", "nct")
+BUILTIN(__builtin_assign, "v.", "t")

#undef BUILTIN
#undef LIBBUILTIN
--- a/lib/Sema/SemaChecking.cpp
+++ b/lib/Sema/SemaChecking.cpp
@@ -296,6 +296,12 @@ Sema::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
     if (SemaBuiltinAddressof(*this, TheCall))
       return ExprError();
     break;
+ case Builtin::BI__builtin_assign:
+ if (checkArgCount(*this, TheCall, 2))
+ return true;
+ return CreateBuiltinBinOp(TheCall->getLocStart(), BO_Assign,
+ TheCall->getArg(0), TheCall->getArg(1));
+ break;
   }

Unfortunately it doesn’t work for me. Clang throws an assertion about what I should do, while I am trying do exactly opposite :>

Like I mentioned you can remove any assertions arising to see it in action because we're intentionally using the C code path in this quick patch.

There's a case for providing compatibility with the C assignment syntax for C-like POD structs in C++ given the neat upgrade path.

I'm intrigued by the suggestion of doing this with a template -- Richard, could you expound on that?

Alp.

It’s worse than that. The address space propagates through member accesses in the type system, so correct type-checking requires you to be able to overload member functions by the address space of “this” — and not just for regular member functions, but for constructors and destructors, too.

(I think this isn’t actually a problem for OpenCL++ in practice because you typically don’t create or even modify objects in non-local address spaces.)

The simplest solution here is probably to add a special case to the type-checker which permits an address-space mismatch when directly calling a trivial copy constructor or copy-assignment operator.

John.

The simplest solution here is probably to add a special case to the type-checker which permits an address-space mismatch when directly calling a trivial copy constructor or copy-assignment operator.

I have submitted patch that works reasonably well without need to extra attributes or builtins.

http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140512/105706.html

The idea behind was to enforce implicit conversion which drops address space qualifier if copy constructor fails and destination is POD without user defined copy constructor.

WDYT?

Regards,

Hmm. I’d like Richard’s opinion about where to do this. There are a lot of compile-time benefits to only changing the failure path during initialization, but I’m a little worried about having these exact conditions. Maybe the best thing to do is to just try again with an OpaqueValueExpr in the default address space and, if that succeeds and picks a trivial constructor, then you can go ahead and generate the C-style AST.

+        const RecordType *DestRecordType = DestType->getAs<RecordType>();
+        CXXRecordDecl *DestRecordDecl
+          = cast<CXXRecordDecl>(DestRecordType->getDecl());

This is just DestType->getAsCXXRecordDecl().

John.

The simplest solution here is probably to add a special case to the
type-checker which permits an address-space mismatch when directly calling
a trivial copy constructor or copy-assignment operator.

I have submitted patch that works reasonably well without need to extra
attributes or builtins.

http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140512/105706.html

The idea behind was to enforce implicit conversion which drops address
space qualifier if copy constructor fails and destination is POD without
user defined copy constructor.

Hmm. I’d like Richard’s opinion about where to do this. There are a lot
of compile-time benefits to only changing the failure path during
initialization, but I’m a little worried about having these exact
conditions. Maybe the best thing to do is to just try again with an
OpaqueValueExpr in the default address space and, if that succeeds and
picks a trivial constructor, then you can go ahead and generate the C-style
AST.

I think the most general thing to do would be to teach SemaInit that a
reference can bind to an object with a different address space if the
initialized entity is the first parameter of a trivial copy or move
constructor or assignment, and to teach it to add a new
cross-address-space-copy step when that happens.

Separately, overload resolution should be taught that this is possible; the
first standard conversion sequence in such a case should probably have
conversion rank.

+ const RecordType *DestRecordType = DestType->getAs<RecordType>();

I think the most general thing to do would be to teach SemaInit that a reference can bind to an object with a different address space if the initialized entity is the first parameter of a trivial copy or move constructor or assignment, and to teach it to add a new cross-address-space-copy step when that happens.

One thing that bothers me is that reference is kind of constant pointer, so dropping address space qualifier sounds kind of weird.

Separately, overload resolution should be taught that this is possible; the first standard conversion sequence in such a case should probably have conversion rank.

I am afraid this isn't trivial change anymore. Do you think we could resolve these address-space issues before final 3.5?

+ const RecordType *DestRecordType = DestType->getAs<RecordType>();
+ CXXRecordDecl *DestRecordDecl
+ = cast<CXXRecordDecl>(DestRecordType->getDecl());

This is just DestType->getAsCXXRecordDecl().

Thanks for a hint.

> I think the most general thing to do would be to teach SemaInit that a
reference can bind to an object with a different address space if the
initialized entity is the first parameter of a trivial copy or move
constructor or assignment, and to teach it to add a new
cross-address-space-copy step when that happens.

One thing that bothers me is that reference is kind of constant pointer,
so dropping address space qualifier sounds kind of weird.

We wouldn't be dropping them; we'd insert an AST node representing "copy
from this address space to that address space" as a special conversion
before we bind the reference. But that's all just implementation details.
The user model is approximately that, for each trivial copy or move
operation, there's something like an overload set copying from each address
space.

Though that would only solve the problem of copying *from* some custom
address space to the default address space. For the other direction, you'd
presumably need address-space-qualified member functions (in particular,
assignment operators).

> Separately, overload resolution should be taught that this is possible;
the first standard conversion sequence in such a case should probably have
conversion rank.

I am afraid this isn't trivial change anymore. Do you think we could
resolve these address-space issues before final 3.5?

Magic 8-ball says "ask again later" =)