RFC: Supporting different sized address space arithmetic

Currently LLVM only supports address calculations for a single address space (the default). This is problematic when the device pointer type is 32 or 64 bits, but there are small distinct memories that only have 16 bits of addressing (think GPUs with small software-controlled memory segments).

I am proposing a modification to a few APIs in SelectionDAG that I believe will fix this problem.

In TargetLowering

· Add a getPointerTy variant that takes an argument, the address space; this defaults to 0.

· Add a virtual API call that returns the default address space for the target; this defaults to returning 0.

· Have the original getPointerTy implementation with no arguments query for the default address space.
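
The three TargetLowering changes above can be sketched roughly as follows. This is a minimal stand-alone model, not LLVM code: `TargetLoweringSketch`, `SimpleVT`, and `checkUnmodifiedTarget` are hypothetical names, the 32-bit base pointer width is an assumption, and two overloads are used instead of a default argument so that the no-argument form can forward to the new one.

```cpp
#include <cassert>

// Stand-in for LLVM's value types; only the bit width matters in this sketch.
enum class SimpleVT : unsigned { i16 = 16, i32 = 32, i64 = 64 };

// Hypothetical, simplified model of the proposed TargetLowering interface.
class TargetLoweringSketch {
public:
  virtual ~TargetLoweringSketch() = default;

  // Proposed hook: the address space that no-argument queries refer to.
  virtual unsigned getDefaultAddressSpace() const { return 0; }

  // Proposed variant: pointer type for a specific address space.
  // The base class ignores the address space (assumed 32-bit everywhere).
  virtual SimpleVT getPointerTy(unsigned AddrSpace) const {
    (void)AddrSpace;
    return SimpleVT::i32;
  }

  // Original entry point, now forwarding to the default address space.
  SimpleVT getPointerTy() const {
    return getPointerTy(getDefaultAddressSpace());
  }
};

// A target that overrides nothing sees the same answer from both forms.
inline void checkUnmodifiedTarget() {
  TargetLoweringSketch TLI;
  assert(TLI.getPointerTy() == TLI.getPointerTy(0));
}
```

Under these assumptions, existing callers of the no-argument form keep their behavior, and only targets that override the two virtual functions observe a difference.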

In SelectionDAG

· Add a new variant of getIntPtrConstant that takes an address space as the second argument.

· Modify the implementation of the original getIntPtrConstant function to call the new function with getDefaultAddressSpace().

Modify SelectionDAGBuilder::visitGetElementPtr to get the address space of the pointer argument and pass it into the getIntPtrConstant and getPointerTy calls.
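
As a rough illustration of the SelectionDAG side, the sketch below models a pointer-sized constant whose width follows the address space of the pointer being indexed. Everything here is a hypothetical stand-in (`TargetSketch`, `ConstantSketch`, `SelectionDAGSketch`, `gepOffset`), and the choice of address space 3 as a 16-bit memory is an assumption made only for the example; the real getIntPtrConstant signature is not reproduced.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical target description: pointer width in bits per address space.
struct TargetSketch {
  unsigned getDefaultAddressSpace() const { return 0; }
  unsigned getPointerSizeInBits(unsigned AddrSpace) const {
    return AddrSpace == 3 ? 16u : 32u; // assumption: AS 3 is a small 16-bit memory
  }
};

// Stand-in for an SDValue holding an integer constant of a given width.
struct ConstantSketch {
  uint64_t Value;
  unsigned Bits;
};

struct SelectionDAGSketch {
  TargetSketch TLI;

  // Proposed variant: a constant sized for pointers in AddrSpace.
  ConstantSketch getIntPtrConstant(uint64_t Val, unsigned AddrSpace) const {
    unsigned Bits = TLI.getPointerSizeInBits(AddrSpace);
    uint64_t Mask = Bits >= 64 ? ~0ULL : ((1ULL << Bits) - 1);
    return {Val & Mask, Bits};
  }

  // Original form, forwarding to the new one with the default address space.
  ConstantSketch getIntPtrConstant(uint64_t Val) const {
    return getIntPtrConstant(Val, TLI.getDefaultAddressSpace());
  }
};

// Roughly what a GEP visitor does for a constant index: scale the index by the
// element size, using a constant as wide as pointers in that address space.
inline ConstantSketch gepOffset(const SelectionDAGSketch &DAG,
                                unsigned PtrAddrSpace, uint64_t Index,
                                uint64_t ElemSize) {
  return DAG.getIntPtrConstant(Index * ElemSize, PtrAddrSpace);
}

// Offsets into the small memory are 16 bits wide; default ones stay 32 bits.
inline void checkGepOffsets() {
  SelectionDAGSketch DAG;
  assert(gepOffset(DAG, 3, 4, 4).Bits == 16);
  assert(DAG.getIntPtrConstant(8).Bits == 32);
}
```

The point of the sketch is only that the address space travels from the pointer operand into the constant's width; how the real visitor obtains it is left to the patch.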

As far as I can tell, this should not affect any backend's behavior, but will allow targets with disjoint address spaces to directly address them in the most efficient manner.

So, what do you think? Is this the correct approach? Or does it actually require more fundamental changes?

Thanks,

Micah

> Currently LLVM only supports address calculations for a single address
> space(the default). This is problematic when the device pointer type is
> 32/64bits, but there are small distinct memories that only have 16 bits of
> addressing(think GPU's with small software controlled memory segments).
>
> I am proposing a modification to a few API's in SelectionDAG that I believe
> will fix this problem.
>
> In TargetLowering
>
> · Add getPointerTy to take an argument, the address space, this
> defaults to 0.
>
> · Add a virtual API call that returns the default address space for
> the target, this defaults to returning 0.

Under what circumstances would the default address space for a target not be 0?

> · Have the original getPointerTy implementation with no arguments
> query for the default address space.
>
> In SelectionDAG
>
> · Add a new API to getIntPtrConstant that takes an address space as
> the second argument

Do we actually need this API? Normally, if you need a pointer-sized
constant, it means you have a value of pointer type somewhere nearby;
it's probably easier to just use the existing API which takes a
constant and a type than to try to dig the address space out of a
memory operation.

> · Modify the implementation of the original getIntPtrConstant
> function to call the new function with getDefaultAddressSpace().
>
> As far as I can tell, this should not affect any backends behavior, but will
> allow the targets with disjoint address spaces to directly address them in
> the most efficient manner.
>
> So, what do you think? Is this the correct approach? Or does it actually
> require more fundamental changes.

This is the right direction; I don't think you need anything else, at
least not in the SelectionDAG.

Are you planning on introducing an in-tree target that actually uses
this functionality? I'm afraid it'll bitrot without good testing.

-Eli

From: Eli Friedman [mailto:eli.friedman@gmail.com]
Sent: Friday, August 17, 2012 3:16 PM
To: Villmow, Micah
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Supporting different sized address space
arithmetic

> Currently LLVM only supports address calculations for a single address
> space(the default). This is problematic when the device pointer type
> is 32/64bits, but there are small distinct memories that only have 16
> bits of addressing(think GPU's with small software controlled memory
> segments).
>
>
>
> I am proposing a modification to a few API's in SelectionDAG that I
> believe will fix this problem.
>
>
>
> In TargetLowering
>
> * Add getPointerTy to take an argument, the address space, this
> defaults to 0.
>
> * Add a virtual API call that returns the default address space for
> the target, this defaults to returning 0.

Under what circumstances would the default address space for a target
not be 0?

[Villmow, Micah] In OpenCL, the default address space is the private address space. This is closer to TLS in the X86 world, whereas the
LLVM default address space of 0 is closer to the global address space in OpenCL. For our GPUs, when supporting 64-bit pointers, there
is a huge drop (think ~60%) in LDS bandwidth performance because of the extra computation that 64-bit pointers require. What would be ideal is
that our default would be treated as global (AS 1) with 64-bit pointers, private (AS 0) would use 32-bit pointers, and the constant (AS 2) and local (AS 3)
address spaces could be treated as having 16-bit pointers. This would allow us to highly optimize our memory accesses and only pay the 64-bit
pointer addressing cost where it is required by hardware. So we want the default address space to be the global address space for all
computations that use the pointer type, but on non-OpenCL systems, this behavior might not be warranted.
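
The layout described above could be expressed by a target roughly like this. It is a minimal sketch under stated assumptions: `TargetLoweringSketch` and `GpuTargetSketch` are hypothetical classes, widths are modeled as plain bit counts rather than LLVM value types, and the fallback for unknown address spaces is an arbitrary choice made for the example.

```cpp
#include <cassert>

// Address space numbering taken from the message above.
enum AddressSpace : unsigned { Private = 0, Global = 1, Constant = 2, Local = 3 };

// Hypothetical base interface mirroring the proposed TargetLowering hooks.
class TargetLoweringSketch {
public:
  virtual ~TargetLoweringSketch() = default;
  virtual unsigned getDefaultAddressSpace() const { return 0; }
  virtual unsigned getPointerSizeInBits(unsigned AddrSpace) const {
    (void)AddrSpace;
    return 32; // assumption: a generic target with one 32-bit pointer width
  }
  // No-argument form answers for the target's default address space.
  unsigned getPointerSizeInBits() const {
    return getPointerSizeInBits(getDefaultAddressSpace());
  }
};

// Hypothetical OpenCL-style GPU target with the layout described above.
class GpuTargetSketch : public TargetLoweringSketch {
public:
  using TargetLoweringSketch::getPointerSizeInBits; // keep no-argument form visible
  unsigned getDefaultAddressSpace() const override { return Global; }
  unsigned getPointerSizeInBits(unsigned AddrSpace) const override {
    switch (AddrSpace) {
    case Global:   return 64;
    case Private:  return 32;
    case Constant:
    case Local:    return 16;
    default:       return 64; // assumption: unknown spaces use the widest pointer
    }
  }
};

// Generic pointer arithmetic is 64-bit; local memory arithmetic is 16-bit.
inline void checkGpuLayout() {
  GpuTargetSketch GPU;
  assert(GPU.getPointerSizeInBits() == 64);
  assert(GPU.getPointerSizeInBits(Local) == 16);
}
```

With such an override, only code that asks for the default (global) pointer width pays for 64-bit arithmetic.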

> * Have the original getPointerTy implementation with no arguments
> query for the default address space.
>
> In SelectionDAG
>
> * Add a new API to getIntPtrConstant that takes an address space as
> the second argument

Do we actually need this API? Normally, if you need a pointer-sized
constant, it means you have a value of pointer type somewhere nearby;
it's probably easier to just use the existing API which takes a constant
and a type than to try to dig the address space out of a memory
operation.

[Villmow, Micah] This is only there so that the current implementation does not
change and break anything. Instead of changing all of the code to the new API,
I leave the code alone and only change the locations that need the new functionality.

> * Modify the implementation of the original getIntPtrConstant
> function to call the new function with getDefaultAddressSpace().
>
> As far as I can tell, this should not affect any backends behavior,
> but will allow the targets with disjoint address spaces to directly
> address them in the most efficient manner.
>
>
>
> So, what do you think? Is this the correct approach? Or does it
> actually require more fundamental changes.

This is the right direction; I don't think you need anything else, at
least not in the SelectionDAG.

Are you planning on introducing an in-tree target that actually uses
this functionality? I'm afraid it'll bitrot without good testing.

[Villmow, Micah] We have the AMDIL branch that this will be added to.

From: Eli Friedman [mailto:eli.friedman@gmail.com]
Sent: Friday, August 17, 2012 3:16 PM
To: Villmow, Micah
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Supporting different sized address space
arithmetic

> Currently LLVM only supports address calculations for a single address
> space(the default). This is problematic when the device pointer type
> is 32/64bits, but there are small distinct memories that only have 16
> bits of addressing(think GPU's with small software controlled memory
> segments).
>
>
>
> I am proposing a modification to a few API's in SelectionDAG that I
> believe will fix this problem.
>
>
>
> In TargetLowering
>
> * Add getPointerTy to take an argument, the address space, this
> defaults to 0.
>
> * Add a virtual API call that returns the default address space for
> the target, this defaults to returning 0.

Under what circumstances would the default address space for a target
not be 0?

[Villmow, Micah] In OpenCL, the default address space is the private address space. This is closer to TLS in the X86 world, whereas the
LLVM default address space of 0 is closer to the global address space in OpenCL. For our GPUs, when supporting 64-bit pointers, there
is a huge drop (think ~60%) in LDS bandwidth performance because of the extra computation that 64-bit pointers require. What would be ideal is
that our default would be treated as global (AS 1) with 64-bit pointers, private (AS 0) would use 32-bit pointers, and the constant (AS 2) and local (AS 3)
address spaces could be treated as having 16-bit pointers. This would allow us to highly optimize our memory accesses and only pay the 64-bit
pointer addressing cost where it is required by hardware. So we want the default address space to be the global address space for all
computations that use the pointer type, but on non-OpenCL systems, this behavior might not be warranted.

Okay... that mostly makes sense. (Although it sounds like long-term,
you'd want to eliminate all uses of getDefaultAddressSpace from
target-independent code.)

> * Have the original getPointerTy implementation with no arguments
> query for the default address space.
>
> In SelectionDAG
>
> * Add a new API to getIntPtrConstant that takes an address space as
> the second argument

Do we actually need this API? Normally, if you need a pointer-sized
constant, it means you have a value of pointer type somewhere nearby;
it's probably easier to just use the existing API which takes a constant
and a type than to try to dig the address space out of a memory
operation.

[Villmow, Micah] This is only there so that the current implementation does not
change and break anything. Instead of changing all of the code to the new API,
I leave the code alone and only change the locations that need the new functionality.

I understand that part; it just seems like nobody will actually end up
using the two-argument form of getIntPtrConstant(), in favor of just
using getConstant().
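
The alternative suggested here, sizing the constant from a nearby value of pointer type instead of from an address space number, might look like the following. This is a self-contained sketch with hypothetical stand-ins (`ValueSketch`, `getConstantSketch`, `addOffset`); it imitates the constant-plus-type style of getConstant() but does not reproduce LLVM's actual signature.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for an SDValue: every value carries the width of its own type.
struct ValueSketch {
  uint64_t Payload; // the integer the value holds in this toy model
  unsigned Width;   // width of the value's type in bits
  unsigned getValueTypeBits() const { return Width; }
};

// Stand-in for the existing constant-plus-type form.
inline ValueSketch getConstantSketch(uint64_t Val, unsigned WidthInBits) {
  uint64_t Mask = WidthInBits >= 64 ? ~0ULL : ((1ULL << WidthInBits) - 1);
  return {Val & Mask, WidthInBits};
}

// Add a constant offset to a pointer, taking the constant's type from the
// pointer itself rather than from an explicit address space argument.
inline ValueSketch addOffset(const ValueSketch &Ptr, uint64_t Offset) {
  ValueSketch Off = getConstantSketch(Offset, Ptr.getValueTypeBits());
  return getConstantSketch(Ptr.Payload + Off.Payload, Ptr.Width);
}

// A 16-bit pointer yields a 16-bit result, wrapping at the 16-bit boundary.
inline void checkAddOffset() {
  ValueSketch LocalPtr{0xFFF0, 16};
  ValueSketch R = addOffset(LocalPtr, 0x20);
  assert(R.Width == 16);
  assert(R.Payload == 0x10);
}
```

In this style the address space never has to be recovered from the memory operation, because the pointer's own type already encodes the width.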

> * Modify the implementation of the original getIntPtrConstant
> function to call the new function with getDefaultAddressSpace().
>
> As far as I can tell, this should not affect any backends behavior,
> but will allow the targets with disjoint address spaces to directly
> address them in the most efficient manner.
>
>
>
> So, what do you think? Is this the correct approach? Or does it
> actually require more fundamental changes.

This is the right direction; I don't think you need anything else, at
least not in the SelectionDAG.

Are you planning on introducing an in-tree target that actually uses
this functionality? I'm afraid it'll bitrot without good testing.

[Villmow, Micah] We have the AMDIL branch that this will be added to.

That's good.

-Eli

> Under what circumstances would the default address space for a target
> not be 0?

[Villmow, Micah] In OpenCL, the default address space is the
private address space. This is closer to TLS in the X86 world,
whereas the LLVM default address space of 0 is closer to the
global address space in OpenCL. For our GPUs, when supporting
64-bit pointers, there is a huge drop (think ~60%) in LDS bandwidth
performance because of the extra computation that 64-bit pointers
require. What would be ideal is that our default would be
treated as global (AS 1) with 64-bit pointers, private (AS 0)
would use 32-bit pointers, and the constant (AS 2) and local (AS 3)
address spaces could be treated as having 16-bit pointers. This would
allow us to highly optimize our memory accesses and only pay
the 64-bit pointer addressing cost where it is required by
hardware. So we want the default address space to be the global
address space for all computations that use the pointer type, but
on non-OpenCL systems, this behavior might not be warranted.

Sorry, I don't understand. Are you saying that you want to switch
the default address space to different numbers in different
contexts? (Otherwise I don't understand why you can't use zero.)

FYI: We also need pointers of different sizes for our target, and
we plan to look into this before the end of this year.

/Patrik Hägglund