I'm currently struggling a bit with some problems regarding address spaces and
(implicit) casts. I'll explain some context first and then proceed to the
actual question I'd like to have answered.
On our target platform, we have a number of distinctly different memory banks.
To access these from our C code, we declare a global array for each memory,
with the address space attribute to mark in which memory the array should be
allocated. This allows the backend to map loads from and stores to this array
onto the right target instructions, depending on the memory used. For
example:
__attribute__((address_space(1))) char mem1[100];
__attribute__((address_space(2))) char mem2[100];
Now, we are using a function which reads a value from one of these memories
and does some processing. Since we want to execute this function for multiple
memories, we make it accept a pointer in the generic address space (i.e., no
address space attribute):
void do_stuff(char* mem);
Somewhere later on, we call the function as follows:
do_stuff(mem1);
do_stuff(mem2);
As expected, the LLVM IR resulting from this contains a bitcast to remove the
address space from mem1 and mem2 (and also a cast from [100 x i8] to i8*, but
that is less interesting). Now, this bitcast really isn't supported by
our backend (more specifically, the do_stuff function can't be codegen'd, since
we need to know at compile time which address space we are reading from).
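To make the setup concrete, here is a minimal sketch of the situation described so far. It rests on a few assumptions that are not from our real code: the AS(n) macro and the USE_ADDRESS_SPACES switch are hypothetical (the attribute is compiled out by default so the sketch builds with any C compiler), do_stuff returns a value only so there is something to observe, and sum_both is an invented caller. The casts are written explicitly because current Clang requires that when the address space changes.

```c
#include <assert.h>

/* Assumption: AS(n) applies Clang's address_space attribute only when the
   hypothetical USE_ADDRESS_SPACES macro is defined; otherwise it expands to
   nothing, so this is plain portable C. */
#if defined(__clang__) && defined(USE_ADDRESS_SPACES)
#define AS(n) __attribute__((address_space(n)))
#else
#define AS(n)
#endif

/* One global array per memory bank. */
AS(1) char mem1[100];
AS(2) char mem2[100];

/* Accepts a pointer in the generic address space, so inside this function
   the bank being read is not known. (Returning int is an addition for the
   sake of the example.) */
int do_stuff(char *mem)
{
    return mem[0] + mem[1];
}

/* Hypothetical caller: each cast below drops the address space, which is
   what shows up as the problematic cast in the IR. */
int sum_both(void)
{
    mem1[0] = 1;  mem1[1] = 2;
    mem2[0] = 10; mem2[1] = 20;
    return do_stuff((char *)mem1) + do_stuff((char *)mem2);
}
```

The stores in sum_both use the arrays directly and so keep their address space; only the loads inside do_stuff have lost it.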
To solve this, we make sure the do_stuff function gets inlined. When this
happens, our problems should go away, since we now know at compile time which
address space is used for every load and store instruction. However, the
bitcast that removes the address space won't go away.
What I would like to see is that the bitcast is removed and the
addrspace-annotated type is propagated to the gep/load/store instructions that
use it.
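At the source level, the rewrite I have in mind corresponds roughly to the difference between the two functions below. This is only a sketch: AS(n), USE_ADDRESS_SPACES, read_via_generic and read_direct are hypothetical names, and the attribute is compiled out by default so the code builds anywhere. Whether the two functions may be treated as equivalent depends on how address spaces are defined, which is exactly the question that follows.

```c
#include <assert.h>

/* Assumption: same hypothetical AS(n) switch, a no-op unless
   USE_ADDRESS_SPACES is defined when building with Clang. */
#if defined(__clang__) && defined(USE_ADDRESS_SPACES)
#define AS(n) __attribute__((address_space(n)))
#else
#define AS(n)
#endif

AS(1) char mem1[100];

/* What we get after inlining today: the load goes through a generic
   pointer produced by a cast, so the bank is hidden from the load. */
int read_via_generic(void)
{
    char *p = (char *)mem1;
    return p[0];
}

/* What we would like to end up with: the pointer keeps its address space,
   so the load itself carries the bank information. */
int read_direct(void)
{
    AS(1) char *p = mem1;
    return p[0];
}
```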
However, this brings me to my actual question: How are address spaces
semantically defined? I see two options here:
a) Every address space has the full range of addresses and they completely
live side by side. This means that, for example, i32 addrspace(1) * 100 points
to a different piece of memory than i32 addrspace(2) * 100. Also, this means
that a bitcast from one address space to another (possibly 0) makes the
pointer refer to a different location when it is loaded from.
b) Every address space is really a subspace of the full range of addresses,
but always disjoint. This means that, for example, i32 addrspace(1) * 100
points to the same memory as i32 addrspace(2) * 100, though one or possibly
both of them can be invalid (since the pointer lies outside of that address
space). This also means that bitcasting a pointer from one address space to
another doesn't change its meaning, though it can potentially become invalid.
This approach also allows addrspace(0) to become a bit more special: Any
pointer can be valid in that addrspace. This means that casting to
addrspace(0) and then loading is semantically the same as loading directly
(though the hardware may be unable to map such an access directly). This
approach would also allow for removing the bitcast in my above problem, since
the loads from the generic address space can be replaced by loads from a
specific address space, without changing semantics. This does not currently
happen, but I would be willing to implement this. Finally, this approach is
consistent with the address spaces defined by DSP-C.
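The difference between a) and b) can be modelled with a small C sketch. Everything in it is an assumption made for illustration: the model_ptr type, the bank sizes, the particular ranges given to address spaces 1 and 2, and the function names. It models the semantics only and says nothing about how LLVM or any backend actually implements address spaces.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_SPACES = 3, BANK_SIZE = 256, FLAT_SIZE = 512 };

/* Hypothetical pointer model: an address space number plus an address. */
typedef struct { int space; unsigned addr; } model_ptr;

/* Option a): every address space has its own full range of addresses,
   so address 100 exists once per space. */
static unsigned char banks[NUM_SPACES][BANK_SIZE];

unsigned char *resolve_a(model_ptr p)
{
    if (p.space < 0 || p.space >= NUM_SPACES || p.addr >= BANK_SIZE)
        return NULL;
    return &banks[p.space][p.addr];
}

/* Option b): one flat memory. Space 1 owns [0, 256), space 2 owns
   [256, 512), and space 0 covers the whole range (assumed layout). */
static unsigned char flat[FLAT_SIZE];

unsigned char *resolve_b(model_ptr p)
{
    unsigned lo, hi;
    switch (p.space) {
    case 0: lo = 0;         hi = FLAT_SIZE; break;
    case 1: lo = 0;         hi = BANK_SIZE; break;
    case 2: lo = BANK_SIZE; hi = FLAT_SIZE; break;
    default: return NULL;
    }
    if (p.addr < lo || p.addr >= hi)
        return NULL;            /* address is not valid in this space */
    return &flat[p.addr];
}

/* A cast between address spaces changes only the space, never the address. */
model_ptr cast_space(model_ptr p, int space)
{
    p.space = space;
    return p;
}
```

Under a), casting address 100 from space 1 to space 0 resolves to a different byte. Under b), the same cast resolves to the same byte, and address 100 is simply invalid in space 2. That is why only b) would allow the cast to be folded away.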
There are probably a few more approaches that are similar to the ones above.
I would suggest that whatever approach we pick, we should document it
somewhere (LangRef?), since leaving this backend-specific makes it hard for
target-independent transformations to do anything useful with address spaces.
Being consistent with DSP-C is probably the best course here?