[RFC] Re-use OpenCL address space attributes for SYCL

Hi,

We would like to re-use OpenCL address space attributes for SYCL to target

SPIR-V format and enable efficient memory access on GPUs.


__attribute__((opencl_global))

__attribute__((opencl_local))

__attribute__((opencl_private))

The first patch enabling conversion between pointers annotated with OpenCL

address space attribute and “default” pointers is being reviewed here

https://reviews.llvm.org/D80932.

Before moving further with the implementation we would like to discuss two

questions raised in review comments (https://reviews.llvm.org/D80932#2085848).

## Using attributes to annotate memory allocations

Introduction section of SYCL-1.2.1 specification describes multiple compilation

flows intended by the design:

SYCL is designed to allow a compilation flow where the source file is passed

through multiple different compilers, including a standard C++ host compiler

of the developer’s choice, and where the resulting application combines the

results of these compilation passes. This is distinct from a single-source

flow that might use language extensions that preclude the use of a standard

host compiler. The SYCL standard does not preclude the use of a single

compiler flow, but is designed to not require it.

The advantages of this design are two-fold. First, it offers better

integration with existing tool chains. An application that already builds

using a chosen compiler can continue to do so when SYCL code is added. Using

the SYCL tools on a source file within a project will both compile for an

OpenCL device and let the same source file be compiled using the same host

compiler that the rest of the project is compiled with. Linking and library

relationships are unaffected. This design simplifies porting of pre-existing

applications to SYCL. Second, the design allows the optimal compiler to be

chosen for each device where different vendors may provide optimized

tool-chains.

SYCL is designed to be as close to standard C++ as possible. In practice,

this means that as long as no dependence is created on SYCL’s integration

with OpenCL, a standard C++ compiler can compile the SYCL programs and they

will run correctly on host CPU. Any use of specialized low-level features

can be masked using the C preprocessor in the same way that

compiler-specific intrinsics may be hidden to ensure portability between

different host compilers.

Following this approach, SYCL uses C++ templates to represent pointers to

disjoint memory regions on an accelerator to enable compilation with standard

C++ toolchain and SYCL compiler toolchain.

For instance:


// CPU/host implementation

template <typename T, address_space AS> class multi_ptr {

T *data; // ignore address space parameter on CPU

public:

T *get_pointer() { return data; }

}

// check that SYCL mode is ON and we can use non-standard annotations

#if defined(__SYCL_DEVICE_ONLY__)

// GPU/accelerator implementation

template <typename T, address_space AS> class multi_ptr {

// GetAnnotatedPointer<T, global>::type == "__attribute__((opencl_global)) T"

using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;

pointer_t data;

public:

pointer_t get_pointer() { return data; }

}

#endif

User can use multi_ptr class as regular user-defined type in regular C++ code:


int *UserFunc(multi_ptr<int, global> ptr) {

/// ...

return ptr.get_pointer();

}

Depending on the compiler mode multi_ptr will either annotate internal data

with address space attribute or not.

## Implementation details

OpenCL attributes are handled by Parser in all modes. OpenCL mode has specific

logic in Sema and CodeGen components for these attributes.

SYCL compiler re-use generic support for these attributes as is and modifies

Sema and CodeGen libraries. The main difference with OpenCL mode is that SYCL

mode (similar to other single-source GPU programming modes like OpenMP/CUDA/HIP)

keeps “default” address space for the declaration without address space

attribute annotations. This keeps the code shared between the host and device

semantically-correct for both compilers: regular C++ host compiler and SYCL

compiler.

To make all pointers without an explicit address space qualifier to be pointers

in generic address space, we updated SPIR target address space map, which

currently maps default pointers to “private” address space. We made this change

specific to SYCL by adding SYCL environment component to the Triple to avoid

impact on other modes targeting SPIR target (e.g. OpenCL). We would be glad to

see get a feedback from the community if changing this mapping is applicable for

all the modes and additional specialization can be avoided (e.g.

AMDGPU

maps default to “generic” address space with a couple of exceptions).

There are a few cases when CodeGen assigns non-default address space:

  1. For declaration explicitly annotated with address space attribute

  2. Variables with static storage duration and string literals are allocated in

global address space unless specific address space it specified.

  1. Variables with automatic storage durations are allocated in private address

space. It’s current compiler behavior and it doesn’t require additional

changes.

For (2) and (3) cases, once “default” pointer to such variable is obtained, it

is immediately addrspacecast’ed to generic, because a user does not (and should

not) specify address space for pointers in source code.

A draft patch containing complete change-set is available

here.

Does this approach seem reasonable?

Thanks,

Alexey

Hi Alexey,

Thanks for the clarification.

SYCL compiler re-use generic support for these attributes as is and modifies

Sema and CodeGen libraries.

Can you elaborate on your modifications in Sema and CodeGen, please?

The main difference with OpenCL mode is that SYCL

mode (similar to other single-source GPU programming modes like

OpenMP/CUDA/HIP)

keeps “default” address space for the declaration without address space

attribute annotations.

Just FYI in C++ mode, Clang implements default/generic address space as

specified in embedded C (ISO/IEC TR 18037) s5.1 - 5.3.

"When not specified otherwise, objects are allocated by default in a generic

address space, which corresponds to the single address space of ISO/IEC

9899:1999."

"Objects are allocated in one or more address spaces. A unique generic address

space always exists. Every address space other than the generic one has a unique

name in the form of an identifier. Address spaces other than the generic one are

called named address spaces. An object is always completely allocated into at

least one address space. Unless otherwise specified, objects are allocated in

the generic address space."

It feels to me this is the model you intend to follow? If you use OpenCL address

space attributes outside of OpenCL mode there is limited logic that you will

inherit. For example deduction of address spaces wouldn’t work but conversions

or generation to IR should work fine. It generally sounds like a viable approach

but OpenCL however used Default (no address space) as private AS for a very long

time and there are still a number of places where this assumption is inherent in

the implementation. This is not entirely strange as Default is use by many
languages for automatic storage anyway. My worry is there could be difficulties
in reusing the OpenCL address space model due to this.

Btw can you elaborate on your implementation of constant addr space?

This keeps the code shared between the host and device

semantically-correct for both compilers: regular C++ host compiler and SYCL

compiler.

Sorry perhaps I am not following this thought but can you explain how

address spaces make code semantically incorrect?

To make all pointers without an explicit address space qualifier to be

pointers

in generic address space, we updated SPIR target address space map, which

currently maps default pointers to “private” address space.

The address space map in Clang is not specific to pointer types. How do you

make it work for pointers only?

We made this change

specific to SYCL by adding SYCL environment component to the Triple to avoid

impact on other modes targeting SPIR target (e.g. OpenCL). We would be glad to

see get a feedback from the community if changing this mapping is applicable

for all the modes and additional specialization can be avoided (e.g.

AMDGPU

maps default to “generic” address space with a couple of exceptions).

Ok, does it mean that you map Default address space to OpenCL generic?

Please note that Default address space is used outside of OpenCL for all

other languages so remapping this unconditionally will have a wider impact.

There are a few cases when CodeGen assigns non-default address space:

  1. For declaration explicitly annotated with address space attribute

This is generally how CodeGen works mapping language address spaces to target

address spaces. Is there something different you do here for SYCL?

  1. Variables with static storage duration and string literals are allocated in

global address space unless specific address space it specified.

  1. Variables with automatic storage durations are allocated in private address

space. It’s current compiler behavior and it doesn’t require additional

changes.

We already have this logic for OpenCL in Sema. I am not an expert in CodeGen but

I believe its primary task is to map language constructs onto the target specific IR
i.e. map from AST into IR. However, you are making it dial with language semantic
instead i.e. add missing AST logic such as address space attribute. I believe there
are good reasons to have layering architecture that separates various concerns.
What drives your decision for moving this logic into CodeGen?

For (2) and (3) cases, once “default” pointer to such variable is obtained, it

is immediately addrspacecast’ed to generic, because a user does not (and

should not) specify address space for pointers in source code.

Can you explain why you need this cast? Can user not specify address spaces using

pointer classes that map into address space attributed types i.e. ending up with
pointer with address spaces originating from the user code?

Cheers,
Anastasia

Hi Anastasia,

Sorry for the delay.

The main difference with OpenCL mode is that SYCL

mode (similar to other single-source GPU programming modes like

OpenMP/CUDA/HIP)

keeps “default” address space for the declaration without address space

attribute annotations.

Just FYI in C++ mode, Clang implements default/generic address space as

specified in embedded C (ISO/IEC TR 18037) s5.1 - 5.3.

"When not specified otherwise, objects are allocated by default in a generic

address space, which corresponds to the single address space of ISO/IEC

9899:1999."

"Objects are allocated in one or more address spaces. A unique generic address

space always exists. Every address space other than the generic one has a unique

name in the form of an identifier. Address spaces other than the generic one are

called named address spaces. An object is always completely allocated into at

least one address space. Unless otherwise specified, objects are allocated in

the generic address space."

It feels to me this is the model you intend to follow?

After reading the document I don’t see major conflicts with our SYCL

implementation.

If you use OpenCL address

space attributes outside of OpenCL mode there is limited logic that you will

inherit. For example deduction of address spaces wouldn’t work but conversions

or generation to IR should work fine. It generally sounds like a viable approach

but OpenCL however used Default (no address space) as private AS for a very long

time and there are still a number of places where this assumption is inherent in

the implementation. This is not entirely strange as Default is use by many

languages for automatic storage anyway. My worry is there could be difficulties

in reusing the OpenCL address space model due to this.

Btw can you elaborate on your implementation of constant addr space?

As SPIR-V doesn’t allow casts between constant and generic pointers, SYCL

implementation doesn’t use OpenCL constant address space attribute. “const”

qualified “global” address space attribute is used instead.

This keeps the code shared between the host and device

semantically-correct for both compilers: regular C++ host compiler and SYCL

compiler.

Sorry perhaps I am not following this thought but can you explain how

address spaces make code semantically incorrect?

It’s not “address spaces” per se, but how OpenCL mode implements them.

Victor did a good job covering this question in this comment:

https://reviews.llvm.org/D80932#2073542

Example form this comment of valid C++ function, which is not valid in OpenCL

mode:


template<typename T1, typename T2>

struct is_same {

static constexpr int value = 0;

};

template<typename T>

struct is_same<T, T> {

static constexpr int value = 1;

};

void foo(int p) {

static_assert(is_same<decltype(p), int>::value, "int is not an int?"); // Fails: p is '__private int' != 'int'

static_assert(is_same<decltype(&p), int*>::value, "int* is not an int*?"); // Fails: p is '__private int*' != '__generic int*'

}

To make all pointers without an explicit address space qualifier to be

pointers

in generic address space, we updated SPIR target address space map, which

currently maps default pointers to “private” address space.

The address space map in Clang is not specific to pointer types. How do you

make it work for pointers only?

I don’t think we did anything specific to apply this change to pointers only.

Pointers provided here as an example to demonstrate the impact of the change in

LLVM IR representation for SPIR target.

We made this change

specific to SYCL by adding SYCL environment component to the Triple to avoid

impact on other modes targeting SPIR target (e.g. OpenCL). We would be glad to

see get a feedback from the community if changing this mapping is applicable

for all the modes and additional specialization can be avoided (e.g.

AMDGPU

maps default to “generic” address space with a couple of exceptions).

Ok, does it mean that you map Default address space to OpenCL generic?

Please note that Default address space is used outside of OpenCL for all

other languages so remapping this unconditionally will have a wider impact.

Current implementation applies different mapping only when “sycldevice”

environment is set in target triple.

https://github.com/bader/llvm/pull/18/files#diff-d62fb2e1d8c597ce59fd10e018f6fb77R61

What languages do you think might be impacted if we enable this change

unconditionally? Are there modes other than OpenCL and SYCL targeting SPIR?

There are a few cases when CodeGen assigns non-default address space:

  1. For declaration explicitly annotated with address space attribute

This is generally how CodeGen works mapping language address spaces to target

address spaces. Is there something different you do here for SYCL?

No.

  1. Variables with static storage duration and string literals are allocated in

global address space unless specific address space it specified.

  1. Variables with automatic storage durations are allocated in private address

space. It’s current compiler behavior and it doesn’t require additional

changes.

We already have this logic for OpenCL in Sema. I am not an expert in CodeGen but

I believe its primary task is to map language constructs onto the target specific IR

i.e. map from AST into IR. However, you are making it dial with language semantic

instead i.e. add missing AST logic such as address space attribute. I believe there

are good reasons to have layering architecture that separates various concerns.

What drives your decision for moving this logic into CodeGen?

I don’t think (2) deal with language semantics. I assume we both talking about

the same case when variable declaration is not explicitly annotated with address

space attribute. According to language semantics such objects are allocated in

generic address space, but the problem is that most OpenCL implementations have

problems with consuming SPIR-V files with global variables in generic address

space. As an alternative to CodeGen changes we can consider handling this issue

in SPIR-V translator tool.

For (2) and (3) cases, once “default” pointer to such variable is obtained, it

is immediately addrspacecast’ed to generic, because a user does not (and

should not) specify address space for pointers in source code.

Can you explain why you need this cast?

This change is need to keep the types in LLVM IR consistent. Regular C++ user

code usually doesn’t have address space annotations, so if memory references and

pointers are “generic”. When allocation is done in named address space, we must

add address space cast to keep LLVM pointer types aligned.

Can user not specify address spaces using

pointer classes that map into address space attributed types i.e. ending up with

pointer with address spaces originating from the user code?

Yes.

Feel free to join today’s sync meeting at 9AM PT to have an online discussion.

Thanks,

Alexey

As SPIR-V doesn’t allow casts between constant and generic pointers, SYCL

implementation doesn’t use OpenCL constant address space attribute. “const”

qualified “global” address space attribute is used instead.

FYI in OpenCL C such conversions are disallowed and since address spaces are

preserved in AST, any such conversion in the source will be diagnosed and
rejected. Constant address space indicates a memory region where read-only
data are to be placed for efficient accesses, so it is not quite the same as global
memory.

It’s not “address spaces” per se, but how OpenCL mode implements them.

Victor did a good job covering this question in this comment:

https://reviews.llvm.org/D80932#2073542

I have replied to Victor’s comment: https://reviews.llvm.org/D80932#2074792

What languages do you think might be impacted if we enable this change

unconditionally? Are there modes other than OpenCL and SYCL targeting SPIR?

Yes, some toolchains use it for standard C and C++ compilations to create

libraries to run on GPU-like targets. Generally, we should not limit SPIR to

OpenCL or SYCL. Clang design is made flexible to allow multiple targets

to support multiple languages. We shouldn’t come up with an implementation

that will limit choices for future development.

I don’t think (2) deal with language semantics. I assume we both talking about

the same case when variable declaration is not explicitly annotated with address

space attribute. According to language semantics such objects are allocated in

generic address space, but the problem is that most OpenCL implementations have

problems with consuming SPIR-V files with global variables in generic address

space. As an alternative to CodeGen changes we can consider handling this issue

in SPIR-V translator tool.

I am not really a CodeGen expert, maybe it will be ok. I think it’s better if you discuss

it with John McCall or someone who is more experienced with CodeGen architecture.

Why don’t you just do regular address space deduction in Sema and then cast the
deduced address space to generic straight after? You already add similar casts for
pointers that are annotated with address spaces through the user code, right?
This approach will probably allow to reuse the logic from OpenCL and simplify CodeGen.

Alternatively, I imagine you could add a simple transformation pass to remap address
spaces If you move this logic into the translator then I guess your compilation
will only work for SPIR-V?

This change is need to keep the types in LLVM IR consistent. Regular C++ user

code usually doesn’t have address space annotations, so if memory references and

pointers are “generic”. When allocation is done in named address space, we must

add address space cast to keep LLVM pointer types aligned.

I feel that your design is slightly different to what address space attributes were intended

for. The address spaces were introduced for embedded C and other dialects where the

same logic applies. The address space is added into a type qualifer. This binds an object to

certain memory segments and therefore the only valid conversions for different addresses

are either using explicit casts or implicit conversions in operands of operations, similar to
regular qualifiers or arithmetic conversions. There are no unexpected address space
conversions from explicit address spaces to generic/default otherwise.

It feels to me that what you actually need semantically is a flat memory. Then the
embedded C model is just overkill. I feel the address space attribute might just not be
a good conceptual fit for your design. Have you considered adding a new custom
attribute to annotate pointer variable classes or variables with memory segments
without propagating this into a type qualifier?

I imagine it would be pretty easy to implement in the frontend as you just need to propagate

this to IR. Then your middle-end passes can use this annotation to remap from
default/generic address space into any exact one. I think you can even achieve higher flexibility
by using such annotation only as some sort of a hint and allow an optimizer to choose alternative
memory regions if it can result in higher performance.

Feel free to join today’s sync meeting at 9AM PT to have an online discussion.

Thanks, but sorry it was short notice. Also I think it’s good to use LLVM channels so we can

keep everyone else in the loop and also record information for future reference during the

code review or other documentation purposes.

I don’t think (2) deal with language semantics. I assume we both talking about

the same case when variable declaration is not explicitly annotated with address

space attribute. According to language semantics such objects are allocated in

generic address space, but the problem is that most OpenCL implementations have

problems with consuming SPIR-V files with global variables in generic address

space. As an alternative to CodeGen changes we can consider handling this issue

in SPIR-V translator tool.

I am not really a CodeGen expert, maybe it will be ok. I think it’s better if you discuss

it with John McCall or someone who is more experienced with CodeGen architecture.

Why don’t you just do regular address space deduction in Sema and then cast the

deduced address space to generic straight after? You already add similar casts for

pointers that are annotated with address spaces through the user code, right?

This approach will probably allow to reuse the logic from OpenCL and simplify CodeGen.

I don’t see how it can be done without breaking C++ semantics demonstrated in

https://reviews.llvm.org/D80932#2073542.

This change is need to keep the types in LLVM IR consistent. Regular C++ user

code usually doesn’t have address space annotations, so if memory references and

pointers are “generic”. When allocation is done in named address space, we must

add address space cast to keep LLVM pointer types aligned.

I feel that your design is slightly different to what address space attributes were intended

for. The address spaces were introduced for embedded C and other dialects where the

same logic applies. The address space is added into a type qualifer. This binds an object to

certain memory segments and therefore the only valid conversions for different addresses

are either using explicit casts or implicit conversions in operands of operations, similar to

regular qualifiers or arithmetic conversions. There are no unexpected address space

conversions from explicit address spaces to generic/default otherwise.

It feels to me that what you actually need semantically is a flat memory. Then the

embedded C model is just overkill. I feel the address space attribute might just not be

a good conceptual fit for your design. Have you considered adding a new custom

attribute to annotate pointer variable classes or variables with memory segments

without propagating this into a type qualifier?

Yes. Originally we used separate attributes, but we replaced them with OpenCL

attributes. At some point we had OpenCL “parsed attribute representation” with

new semantic attributes - https://github.com/intel/llvm/pull/968/.

I can rebase the patch and upload it to Phabricator if it okay (and add new

parsed representation if needed).

I imagine it would be pretty easy to implement in the frontend as you just need to propagate

this to IR. Then your middle-end passes can use this annotation to remap from

default/generic address space into any exact one. I think you can even achieve higher flexibility

by using such annotation only as some sort of a hint and allow an optimizer to choose alternative

memory regions if it can result in higher performance.

Feel free to join today’s sync meeting at 9AM PT to have an online discussion.

Thanks, but sorry it was short notice. Also I think it’s good to use LLVM channels so we can

keep everyone else in the loop and also record information for future reference during the

code review or other documentation purposes.

Agree. Our sync meetings are not intended to replace LLVM channels, but rather

supplement them. In some cases phone calls are more efficient than email chains.

We keep notes from these syncs here:

https://github.com/intel/llvm/wiki/SYCL-upstreaming-working-group-meeting-notes,

so that information is also available for reference.

Thanks,

Alexey

I don’t see how it can be done without breaking C++ semantics demonstrated in
https://reviews.llvm.org/D80932#2073542.

I am not well-versed in this, but just thinking about these as arbitrary type qualifiers: could the issue be simply that the implicitly-generated address space qualifiers are only being added to the types of VarDecls, rather than to every type, including pointee types, template argument types, etc.?

I.e., referring to the examples linked to above: perhaps the problem is not that that OpenCL changes int var to __private int var, but rather that it does not also change int* ptr1 = &var to __private int* __private ptr1 = &var (or whatever the proper default qualifiers are) and std::is_same<T, int> to std::is_same<T, __private int> when in OpenCL (or SYCL) mode.

If it did, I believe those examples would all compile, and code would only break when the user specified began specifying non-default address spaces, i.e. when they actually used the feature in their code. In this way, the non-standard semantics could be represented in the AST without affecting the standard semantics.

In any case that is the form of the ideal solution: sure, don’t break the standard C++ semantics, but also, try to keep a clear representation of any supported-but-non-standard semantics in the AST, I think.

I haven’t been paying close attention to this thread, so I’m not
sure exactly what’s being asked here, and I couldn’t figure it
out from the original post. If you have specific questiions,
I’d be happy to help.

It does sound like it would be useful for me to point out that
IRGen already has the ability to emit local and global variables
in an address space that doesn’t match the formal address space
of a declaration’s type. This was added for the AMD GPU target
so that it could support arbitrary C++ code. IRGen just has some
target hooks that you can override to set the default address spaces
for these kinds of variables.

To give you an idea of how this is used, for AMD GPU, IRGen emits
local/global declarations using the target’s private and global
address spaces, but it translates pointers and references to use
the generic address space. When IRGen emits a reference to one
of these variables, therefore, it has to immediately promote the
pointer to the variable into the generic address space, and all
the subsequent operations work on that generic pointer. Like
most GPU targets, AMD GPU then has extensive optimizations
to try to strength-reduce operations on generic pointers down to
operations on specific address spaces.

That’s probably the cleanest alternative if you don’t want to
take the OpenCL route of inferring specific address spaces based
on other characteristics of the declaration. Any approach that
exposes address spaces in the source language can certainly lead
to rejecting code which would be valid in standard C++.

John.

Why don’t you just do regular address space deduction in Sema and then cast the

deduced address space to generic straight after? You already add similar casts for

pointers that are annotated with address spaces through the user code, right?

This approach will probably allow to reuse the logic from OpenCL and simplify CodeGen.

I don’t see how it can be done without breaking C++ semantics demonstrated in

https://reviews.llvm.org/D80932#2073542.

Can you be more specific, please? If this strategy works for types attributed by address

space from the user annotations I don’t see why it wouldn’t work for the types with the

address spaces deduced. There is no difference in AST between types with address

spaces deduced and specified explicitly in the source.

I am not well-versed in this, but just thinking about these as arbitrary type qualifiers: could the issue be simply that the implicitly-generated address space qualifiers are only being added to the types of VarDecls, rather than to every type, including pointee types, template argument types, etc.?

It is a little bit more complex than that. Most of the types used with objects in OpenCL will get an address space deduced including pointer types. This is because OpenCL is a language dialect for memory segmented architectures and the memory segments pose some limitations resulting in extra language rules. Clang strictly follows OpenCL language spec and I don’t see any issue in the examples Alexey has referred to. If the types differ by address space is_same is expected to return false.

What I struggle to understand how does this affects SYCL at all? The deduction of address spaces is guarded by OpenCL language mode and it is not set for SYCL as far as I am aware.

If it did, I believe those examples would all compile, and code would only break when the user specified began specifying non-default address spaces

If I understand the design Alexey is proposing correctly the user-specified address spaces are cast to the default address spaces “hiddenly” and the AST always ends up to be in flat address space. This is why I don’t see the address space as a good fit. However, I am not sure this is explained explicitly in the RFC, I might have remembered this from some other discussions.

If I understand their proposal correctly, from the discussion Alexey linked to in https://reviews.llvm.org/D80932#2073542, the primary motivation for their implementation — its main distinction from OpenCL’s approach – is that they want to support address spaces in such a way that existing C++ files without any address space markings can still compile as is.

That definitely seems like a worthy goal if it could be properly accomplished.

Take, for instance, the “const” qualifier. Code which never uses it whatsoever still works by default; only when we start adding “const” into our code could we possibly start breaking other code. That is the ideal way to introduce a language feature: old code still works, but now people can opt in to the new stuff.

If instead const-ness had been implemented by allowing each type to be either “const” or (let’s say) “mutable” or neither, and what is more we implicitly added “mutable” when no such marking was provided to some but not all types, then the users would not have the option of ignoring it altogether, it would be a real headache.

This seems to be how OpenCL is implemented, as Alexey et al identify in the discussion linked above: certain VarDecl types get an implicit __private qualifier, but e.g. template argument types, and certain other types (they give another example beyond std::is_same which presents problems) get no such implicit qualifier, resulting in possible breakage in any code whose address spaces have not been closely considered.

Alexey et al’s alternative to prevent this breakage, if I understand correctly, is to remove the type qualifier, and instead handle all address space semantics in CodeGen (I believe this is what you refer to as keeping the address space “flat”).

It seems to me that approach is not ideal, though, because
a) it seems they would lose the type-checking benefits of implementing as a type qualifier (e.g. imagine if “const” qualifiers were removed and handled instead in CodeGen), and
b) I think it really is important for the AST to maintain a clear representation of all target-independent semantics, including address spaces, so that users can easily reason about their code in AST matchers etc.

So the ideal, it seems to me, for everyone’s sake, would be for OpenCL qualifiers to behave more like “const” — there would be a default, a la “non-const”, that is applied to all types not explicitly qualified, so that one could enable OpenCL mode and regular code would still work by default.

In reality though, I imagine this has all already been thought over thoroughly, and it has been determined it really is unavoidable to break standard C++ semantics in order to support address spaces; that there really is no default that could be inferred for arbitrary types like those used in template arguments.

But perhaps it is worthwhile to think it through one more time, to question whether there may be some way to deduce type qualifiers properly in every case, because the issue that Alexey et al raise is a good one I think.

David’s understanding is right. We would like to be able to call existing C++

functions from SYCL applications as much as possible.

I don’t know if analogy with the “const” qualifier is a perfect match here, but

casting a names address space pointer to “generic” pointer is valid operation in

all GPGPU programming models I’m aware of (including OpenCL). “generic” means

that memory can be allocation in any address space, so compiler can’t assume any

specific address space and must generate code which is valid for any address

space. Usually it implies runtime overhead on checking the value of “generic”

pointer to handle it properly.

Alexey et al’s alternative to prevent this breakage, if I understand

correctly, is to remove the type qualifier, and instead handle all address

space semantics in CodeGen (I believe this is what you refer to as keeping the

address space “flat”).

Not exactly. It works as what you described below - “the ideal”. We keep address

space qualifier if user explicitly applied it, but the difference with OpenCL

mode is that “implicit” address space is the same as in regular C++ (i.e.

default) and we allow conversion to “default” C++ address spaces from explicitly

set named address spaces. C++ “default” is treated as “generic” w/o changing C++

default address space to OpenCL “generic” address space in Sema type system.

When SYCL users access memory though SYCL library API memory pointers will be

annotated with address space attributes to ensure good performance. Users can

obtain a “raw” pointer from SYCL objects to pass it to some generic C++ function

and this use case is enabled by the patch https://reviews.llvm.org/D80932.

It seems to me that approach is not ideal, though, because

a) it seems they would lose the type-checking benefits of implementing as a

type qualifier (e.g. imagine if “const” qualifiers were removed and handled

instead in CodeGen), and

b) I think it really is important for the AST to maintain a clear

representation of all target-independent semantics, including address

spaces, so that users can easily reason about their code in AST matchers

etc.

Address space attributes are preserved in AST if they are applied explicitly on

source code, but they are not implicitly applied to all types.

Type-checking is performed for OpenCL address space attributes (e.g. casts

between “named” address spaces are not allowed) and C++ type qualifiers like

“const” are respected.

So the ideal, it seems to me, for everyone’s sake, would be for OpenCL

qualifiers to behave more like “const” — there would be a default, a la

“non-const”, that is applied to all types not explicitly qualified, so that

one could enable OpenCL mode and regular code would still work by default.

I believe it’s what we have in our implementation.

In reality though, I imagine this has all already been thought over

thoroughly, and it has been determined it really is unavoidable to break

standard C++ semantics in order to support address spaces; that there really

is no default that could be inferred for arbitrary types like those used in

template arguments.

But perhaps it is worthwhile to think it through one more time, to question

whether there may be some way to deduce type qualifiers properly in every

case, because the issue that Alexey et al raise is a good one I think.

There is LLVM transformation pass which infers address space information at LLVM

IR level: https://llvm.org/doxygen/InferAddressSpaces_8cpp_source.html

It helps to avoid performance overhead for regular C++ code called from SYCL

annotated parts.

>>> I don't think (2) deal with language semantics. I assume we both
>>> talking about the same case when variable declaration is not
>>> explicitly annotated with address space attribute. According to
>>> language semantics such objects are allocated in generic address
>>> space, but the problem is that most OpenCL implementations have
>>> problems with consuming SPIR-V files with global variables in
>>> generic address space. As an alternative to CodeGen changes we can
>>> consider handling this issue in SPIR-V translator tool.
>>
>>
>> I am not really a CodeGen expert, maybe it will be ok. I think it's
>> better if you discuss it with John McCall or someone who is more
>> experienced with CodeGen architecture.

I haven’t been paying close attention to this thread, so I’m not sure exactly
what’s being asked here, and I couldn’t figure it out from the original post. If
you have specific questiions, I’d be happy to help.

It does sound like it would be useful for me to point out that IRGen already
has the ability to emit local and global variables in an address space that
doesn’t match the formal address space of a declaration’s type. This was
added for the AMD GPU target so that it could support arbitrary C++ code.
IRGen just has some target hooks that you can override to set the default
address spaces for these kinds of variables.

To give you an idea of how this is used, for AMD GPU, IRGen emits
local/global declarations using the target’s private and global address spaces,
but it translates pointers and references to use the generic address space.
When IRGen emits a reference to one of these variables, therefore, it has to
immediately promote the pointer to the variable into the generic address
space, and all the subsequent operations work on that generic pointer. Like
most GPU targets, AMD GPU then has extensive optimizations to try to
strength-reduce operations on generic pointers down to operations on
specific address spaces.

We applied similar approach for SPIR target. In addition to arbitrary C++ code,
we need to allow users explicitly set address space. We are trying to achieve
this by supporting address space attributes in SYCL mode, which can be casted to
"default" address space (including implicit casts). We have a few IRGen changes
to emit "addrspacecast" instead of "bitcast" LLVM instruction to keep LLVM IR
module consistent.

That’s probably the cleanest alternative if you don’t want to take the OpenCL
route of inferring specific address spaces based on other characteristics of
the declaration. Any approach that exposes address spaces in the source
language can certainly lead to rejecting code which would be valid in standard
C++.

We arrived at the same conclusion as well while we were trying to re-use OpenCL
mode for SYCL.

I don't think (2) deal with language semantics. I assume we both
talking about the same case when variable declaration is not
explicitly annotated with address space attribute. According to
language semantics such objects are allocated in generic address
space, but the problem is that most OpenCL implementations have
problems with consuming SPIR-V files with global variables in
generic address space. As an alternative to CodeGen changes we can
consider handling this issue in SPIR-V translator tool.

I am not really a CodeGen expert, maybe it will be ok. I think it's
better if you discuss it with John McCall or someone who is more
experienced with CodeGen architecture.

I haven’t been paying close attention to this thread, so I’m not sure exactly
what’s being asked here, and I couldn’t figure it out from the original post. If
you have specific questiions, I’d be happy to help.

It does sound like it would be useful for me to point out that IRGen already
has the ability to emit local and global variables in an address space that
doesn’t match the formal address space of a declaration’s type. This was
added for the AMD GPU target so that it could support arbitrary C++ code.
IRGen just has some target hooks that you can override to set the default
address spaces for these kinds of variables.

To give you an idea of how this is used, for AMD GPU, IRGen emits
local/global declarations using the target’s private and global address spaces,
but it translates pointers and references to use the generic address space.
When IRGen emits a reference to one of these variables, therefore, it has to
immediately promote the pointer to the variable into the generic address
space, and all the subsequent operations work on that generic pointer. Like
most GPU targets, AMD GPU then has extensive optimizations to try to
strength-reduce operations on generic pointers down to operations on
specific address spaces.

We applied similar approach for SPIR target. In addition to arbitrary C++ code,
we need to allow users explicitly set address space. We are trying to achieve
this by supporting address space attributes in SYCL mode, which can be casted to
"default" address space (including implicit casts).

I believe AMD GPU does also allow the use of explicit address
spaces in their C++ frontend.

Treating the default address space as a superspace of some (all?)
of the other address spaces is sensible and should definitely work.
There are already predicates in the AST for deciding whether two
address spaces are related this way; they probably need to be
target-customizable, but that should be straightforward.

We have a few IRGen changes to emit "addrspacecast" instead of
"bitcast" LLVM instruction to keep LLVM IR module consistent.

It’s not unlikely that there are bugs in IRGen, both with the
implicit address space promotion for locals/globals and just
generally with address spaces in C++. Fixes for those would
be welcome.

John.

Hi,

Treating the default address space as a superspace of some (all?) of the other address spaces is sensible and should definitely work.
There are already predicates in the AST for deciding whether two address spaces are related this way; they probably need to be target-customizable, but that should be straightforward.

Just to chime in here; I have a patch for making address space relations target-configurable here: https://reviews.llvm.org/D62574

/ Bevin

Just to make it clear - C++ libraries or any existing C++ code doesn’t
necessarily stop to compile if address space attributes are being used. It only
fails on illegal use of address spaces defined by Embedded C spec from where
the implementation originates.

For example, we have made an evaluation by running type trait tests from libcxx
with a standard header in OpenCL mode and only a few tests failed due to
the address spaces. None of those required changes in the libcxx code. Instead,
they have revealed bugs in clang (that were fixed in release 11 btw) and issues in
tests due to illegal behavior. These issues are expected for the address space
attributes as there are extra restriction and rules on types attributed by the
address spaces that come from Embedded C. If these restrictions are not desirable
I feel the desirable behavior might be significantly different from the intended use
of address space attributes and therefore it might be better to look at alternative
approaches including adding a dedicated attribute that doesn’t propagate into a type
qualifier at all.

casting a names address space pointer to “generic” pointer is valid operation
in all GPGPU programming models I’m aware of (including OpenCL)

I would like to highlight that generic address space and address space
conversions have only been added in OpenCL 2.0. Furthermore, generic address
space is becoming optional functionality in OpenCL 3.0.

Cheers,
Anastasia

Just to make it clear - C++ libraries or any existing C++ code doesn’t
necessarily stop to compile if address space attributes are being used. It only
fails on illegal use of address spaces defined by Embedded C spec from where
the implementation originates.

Alexey is correct that C and C++ require that if a variable x has type
T, the expression &x must have type T *. Anything else is not
strictly conformant, whether it compiles or not. OpenCL is therefore
not strictly conformant to the C/C++ specification, and that’s fine
because OpenCL is an independent language, not just a language extension;
it happens to be defined using C/C++ as a base specification, but its
modifications are always authoritative. But as I understand it, Alexey
just wants a conformant C++ implementation on a different target, and
so needs to follow the rules.

It’s good that libc++ is broadly permissive about types qualified
with address spaces, but that doesn’t really change anything.

John.

Alexey is correct that C and C++ require that if a variable x has type
T, the expression &x must have type T *. Anything else is not
strictly conformant, whether it compiles or not.

Btw OpenCL doesn’t violate these rules. Moreover, it uses conventional
embedded C address space qualifier logic. It only defines a concrete
memory hierarchy and adds an inference logic for address spaces that is
not present in an embedded C extension. For example:

void foo() {
int i; // type of i is __private int
}

The type of “i” is not “int” it is “__private int”. So it technically has a “shortcut
syntax” to specify types with an address space attribute.

Alexey is correct that C and C++ require that if a variable x has type

T, the expression &x must have type T *. Anything else is not
strictly conformant, whether it compiles or not.

Btw OpenCL doesn’t violate these rules.

It does. C requires the type of a variable to be exactly the type
that it was written with, notwithstanding a handful of exceptions
around e.g. parameters of array type or int x[] = {1,2,3};).
In OpenCL, a local variable int x; has type __private int,
and that is not conformant. And there’s nothing wrong with that,
because OpenCL is its own language and its programs do not need
to conform to the C standard.

Moreover, it uses conventional
embedded C address space qualifier logic. It only defines a concrete
memory hierarchy and adds an inference logic for address spaces that is
not present in an embedded C extension.

Inferring non-generic address spaces for variables is not conformant
to the Embedded C specification. The mechanics of address spaces in
OpenCL otherwise conform to the Embedded C specification, but
Embedded C does not change the rules for how unqualified types work.
Unlike OpenCL, Embedded C is a pure language extension: programs
that do not use the extension’s features behave exactly as specified
by the C standard. Only when an address space is actually uttered
do those rules have any effect.

John.

Hi Bevin,

Thank you for the head-up.
If we can commit this patch soon, we can easily limit conversion between default and OpenCL address spaces to SYCL mode to avoid any interference with OpenCL mode.
I think it's more preferable to do for the language mode to avoid overloading TargetInfo methods for all supported targets (currently SPIR and NVTPX).

John, Anastasia, does https://reviews.llvm.org/D62574 seem reasonable to you? If so, I can rebase my patch of top of this one.

Particularly this change makes it easier to define different behavior for different language modes:
"Moves QualType/Qualifiers accessors which deal with qualifier relations (such as compatiblyIncludes, etc.) to ASTContext, as Qualifiers cannot be made aware of the relations between address spaces on their own."

Thanks,
Alexey

Hi Alexey,

John, Anastasia, does https://reviews.llvm.org/D62574 seem reasonable to you?
If so, I can rebase my patch of top of this one.

I have just reviewed the patch again and it generally looks fine.

However, I guess it will still not solve the address space deduction that you
discussed earlier? It also doesn’t help to eliminate address spaces from AST by
converting them to generic that is what you need too? I still think that the main
concept of the address space attribute in Clang is the ability to propagate and
preserve different address spaces in types of AST, should they occur explicitly in
the source, and provide checks/diagnostics for mismatching address spaces.
But if this logic is undesirable for you I would encourage you to look at different
attributes, perhaps even at adding a new one. After all, you don’t need to have
address spaces in AST types in order to propagate them to IR.

Cheers,
Anastasia