[RFC] Add memory scope to GPU barrier

I misunderstood the use case I was designing for, so this is not what I want at the moment. What I actually need is a memory scope that describes the level of synchronization performed by a gpu.barrier. I’ve created a proof of concept here: Gpu barrier memfence by FMarno · Pull Request #3 · FMarno/llvm-project · GitHub

Right now there are a couple of issues:

  1. GPU_StorageClass seems to overlap a lot with the existing notion of an address space
  2. "Local memory" is an overloaded term: in CUDA it means per-thread memory, while in OpenCL it means memory shared within a work-group (what CUDA calls shared memory)
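
To make the second point concrete, here is a rough C++ sketch of the naming clash. It is only an illustration and not part of the proof of concept: MemLevel, Terms, kTerms, levelForCuda and levelForOpenCL are hypothetical names I made up, and the table only records the usual CUDA and OpenCL vocabulary for three levels of memory.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical, neutral names for three memory levels. They deliberately
// avoid the word "local", which CUDA and OpenCL use differently.
enum class MemLevel { PerThread, PerWorkgroup, DeviceWide };

struct Terms {
    MemLevel level;
    std::string_view cuda;    // CUDA name for this level
    std::string_view opencl;  // OpenCL name for this level
};

// Usual vocabulary in each programming model.
constexpr std::array<Terms, 3> kTerms{{
    {MemLevel::PerThread,    "local",  "private"},
    {MemLevel::PerWorkgroup, "shared", "local"},
    {MemLevel::DeviceWide,   "global", "global"},
}};

// Resolve a CUDA term to a neutral level, or nullopt if unknown.
constexpr std::optional<MemLevel> levelForCuda(std::string_view name) {
    for (const Terms& t : kTerms)
        if (t.cuda == name) return t.level;
    return std::nullopt;
}

// Resolve an OpenCL term to a neutral level, or nullopt if unknown.
constexpr std::optional<MemLevel> levelForOpenCL(std::string_view name) {
    for (const Terms& t : kTerms)
        if (t.opencl == name) return t.level;
    return std::nullopt;
}
```

With this table, levelForCuda("local") resolves to PerThread while levelForOpenCL("local") resolves to PerWorkgroup, which is why an attribute value simply called "local" would be ambiguous without saying which model it follows.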