memory annotation treatment in the compilation of OpenCL code

Hi,

I’d like to see how clang handles local memory in OpenCL. So I compiled the following code with “clang -S -emit-llvm -o test.ll -x cl test.cl”.

int get_global_id(int index);

/* Test kernel */
__kernel void test(__global float *in, __global float *out) {
    int index = get_global_id(0);
    out[index] = 3.14159f * in[index] + in[index];
}

Then I changed “__global” to “__local” and generated the IR again, but the generated IR is the same as the first one.

I am confused because if the “__local” information is lost when generating LLVM IR, how can local memory be used anyway?

Thanks,
Bo

Hello,

> Then I changed "__global" to "__local" and generated the IR again, but
> the generated IR is the same as the first one.

When compiling for x86(_64), the global and local address spaces
probably map to the same target address space (0), so there's no
difference in the final IR.
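
To make this concrete, here is a minimal C++ sketch (not Clang's actual code) of lowering a language-level address space through a per-target map. The names LangAS, AddrSpaceMap and pointerType, and the numbers in kDistinct, are made up for illustration. The all-zero map for x86 is the assumption stated above.

```cpp
#include <array>
#include <cassert>
#include <string>

// Language-level OpenCL address spaces (illustrative names, not Clang's).
enum LangAS { Global = 0, Local = 1, Constant = 2 };

// Per-target table: language address space -> target address space number.
using AddrSpaceMap = std::array<unsigned, 3>;

// Assumption: an x86-like target maps every language address space to 0.
const AddrSpaceMap kAllZeros = {0, 0, 0};
// Assumption: illustrative distinct numbers, not any real target's values.
const AddrSpaceMap kDistinct = {1, 2, 3};

// Render an LLVM-IR-style pointer type. Address space 0 carries no
// addrspace qualifier in textual IR.
std::string pointerType(const std::string &elem, LangAS as,
                        const AddrSpaceMap &map) {
    unsigned target = map[as];
    if (target == 0)
        return elem + "*";
    return elem + " addrspace(" + std::to_string(target) + ")*";
}

// With the all-zero map, __global and __local parameters print identically;
// with distinct entries, they differ.
void checkDistinctionIsLost() {
    assert(pointerType("float", Global, kAllZeros) ==
           pointerType("float", Local, kAllZeros));
    assert(pointerType("float", Global, kDistinct) !=
           pointerType("float", Local, kDistinct));
}
```

Under the all-zero map both kernel signatures come out as plain "float*" parameters, which would explain why the two .ll files are identical.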

> I am confused because if the "__local" information is lost when
> generating LLVM IR, how can local memory be used anyway?

Please check this thread from last March, where the local
memory implementation is discussed:

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-March/013737.html

Best regards,

Hi Bo,

try the following:

clang -ccc-host-triple ptx32 \
      -Xclang -target-feature -Xclang +ptx23 \
      -Xclang -target-feature -Xclang +sm20 \
      -I$LIBCLC/generic/include -I$LIBCLC/ptx-nvidiacl/include \
      -include $LIBCLC/generic/include/clc/clc.h \
      -Dcl_clang_storage_class_specifiers \
      -O0 kernels/matrixMul.cl -S -emit-llvm -o matrixMul.ll

where $LIBCLC points to the libclc [1] installation directory.
(This command is a modification of the one by Justin Holewinski you
can find here [2])

I am wondering why the frontend does not lower this piece of
information to the IR level when compiling for x86. I think it
should be the backend's responsibility to discard the address
space if the target supports only the default one.

Am I missing something?

[1] http://www.pcc.me.uk/~peter/libclc/
[2] https://jholewinski.org/blog/llvm-3-0-ptx-backend

Hi all,

The easiest way to make Clang add the address space information is to add the command line flag "-ffake-address-space-map", but it should be used for OpenCL testing purposes only.

The true intention was to let the TargetInfo specify a map for the address spaces to be used. The default of this map is all zeros, so all the variables are defined without an address space.

Currently only the PTX and the TCE TargetInfo classes specify address space maps different from the default.
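
As a rough illustration of the default-versus-override arrangement described above, here is a small C++ sketch. It is not the real Clang code: TargetInfoSketch, X86Sketch, GpuSketch and effectiveMap are hypothetical names, and all address space numbers are made up rather than taken from the PTX or TCE targets.

```cpp
#include <array>

// Language-level OpenCL address spaces (illustrative, not Clang's enum).
enum LangAS { Global = 0, Local = 1, Constant = 2, NumLangAS = 3 };
using AddrSpaceMap = std::array<unsigned, NumLangAS>;

// Hypothetical stand-in for a target description class.
class TargetInfoSketch {
public:
    virtual ~TargetInfoSketch() = default;

    // Default map: every language address space becomes address space 0.
    virtual const AddrSpaceMap &addressSpaceMap() const {
        static const AddrSpaceMap zeros = {0, 0, 0};
        return zeros;
    }

    unsigned targetAddressSpace(LangAS as) const {
        return addressSpaceMap()[as];
    }
};

// An x86-like target that simply inherits the all-zero default.
class X86Sketch : public TargetInfoSketch {};

// A GPU-like target that overrides the map (numbers are illustrative).
class GpuSketch : public TargetInfoSketch {
public:
    const AddrSpaceMap &addressSpaceMap() const override {
        static const AddrSpaceMap map = {1, 3, 4};
        return map;
    }
};

// Assumption: models the idea behind -ffake-address-space-map by swapping
// in made-up distinct numbers regardless of what the target specifies.
const AddrSpaceMap &effectiveMap(const TargetInfoSketch &target,
                                 bool useFakeMap) {
    static const AddrSpaceMap fake = {1, 2, 3};
    return useFakeMap ? fake : target.addressSpaceMap();
}
```

In this sketch, X86Sketch reports 0 for both Global and Local unless the fake map is requested, while GpuSketch keeps them apart on its own.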

When instantiating TargetInfo, Clang doesn't pass the language options, so the X86_32/X86_64 target info classes don't know that an address space map should be defined, or whether "long" variables should be 64 bits wide. One possible solution would be to pass the information that OpenCL is the current environment as part of the triple, so the target info can change its behavior, or even to use a different target. Another solution would be to pass the language options, but that seems a bit messy to me.

Thanks
     Guy