I misunderstood the use-case I was designing for, and this is not what I currently want. What I actually need is a memory scope matching the level of synchronization performed by a gpu.barrier. I've created a proof of concept here: Gpu barrier memfence by FMarno · Pull Request #3 · FMarno/llvm-project · GitHub
Right now there are a couple of issues:
- GPU_StorageClass seems to overlap significantly with address space
- "local memory" is overloaded as a term and means different things in CUDA and OpenCL