FW: Proposal: New IR instruction for casting between address spaces

Problem:
Bit casting between pointers of different address spaces only works if all address space pointers are the same size. With the changes from email chains [1][2], support for different pointer sizes breaks the bitcast instruction, since there is no guarantee that the pointers in the source and destination address spaces are the same size.

Solution:
Remove the ability of bitcast to cast between pointers of different address spaces and replace it with an instruction that handles this case explicitly.

Proposed changes:

* Add restriction to the verifier on the bitcast instruction making bitcasting between address spaces illegal.

* Change documentation[3] to state that bitcasting between pointers of different address spaces is illegal.

* Add a new IR node, addrspacecast, that allows conversions between address spaces.

* Update the reader/writer to handle these cases.

* Update the documentation to insert the new IR node.

* Add the following documentation:

'addrspacecast .. to' Instruction
Syntax:

  <result> = addrspacecast <ty> <value> to <ty2> ; yields ty2

Overview:

The ' addrspacecast ' instruction converts value to type ty2 without changing any bits.

Arguments:

The ' addrspacecast ' instruction takes a value to cast, which must be a non-aggregate first class value with a pointer type, and a type to cast it to, which must also be a pointer type. The pointer types of value and the destination type, ty2, must be identical, except for the address space.

Semantics:

The ' addrspacecast ' instruction converts value to type ty2. It converts the type to the type that is implied by the address space of the destination pointer. If the destination pointer is smaller than the source pointer, the upper bits are truncated. If the inverse is true, the upper bits are sign extended, otherwise the operation is a no-op.

Example:

  %X = addrspacecast i8 addrspace(1) 255 to i8* addrspace(2) ; yields i8* addrspace(2) :-1

  %Y = addrspacecast i32* %x to sint* addrspace(31) ; yields sint* addrspace(31):%x
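
A minimal sketch of the semantics proposed above, written in Python purely for illustration; it is not LLVM code. The helper name addrspacecast_proposed and the idea of passing pointer widths as plain integers are assumptions made for this example.

```python
def addrspacecast_proposed(value: int, src_bits: int, dst_bits: int) -> int:
    """Model the proposed cast on raw pointer bits (hypothetical helper)."""
    value &= (1 << src_bits) - 1                # keep only the source pointer's bits
    dst_mask = (1 << dst_bits) - 1
    if dst_bits < src_bits:
        return value & dst_mask                 # smaller destination: truncate upper bits
    if dst_bits > src_bits:
        sign_bit = 1 << (src_bits - 1)
        signed = (value ^ sign_bit) - sign_bit  # reinterpret the source bits as signed
        return signed & dst_mask                # larger destination: sign-extend
    return value                                # same size: no-op


# An 8-bit pointer value of 255 widened to 16 bits becomes all ones (-1 as signed).
widened = addrspacecast_proposed(255, 8, 16)
# A 16-bit pointer value narrowed to 8 bits keeps only its low byte.
narrowed = addrspacecast_proposed(0x1234, 16, 8)
```

Under these assumptions, widening 255 from 8 bits matches the first example above, where the result is described as -1.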

Thanks,

Micah

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053166.html
[2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052639.html
[3] http://llvm.org/docs/LangRef.html#i_bitcast

From: Villmow, Micah
Sent: Tuesday, September 11, 2012 12:51 PM
To: llvm-commits@cs.uiuc.edu
Subject: Proposal: New IR instruction for casting between address spaces

> Problem:
> Bit casting between pointers of different address spaces only works if all address space pointers are the same size. With the changes from email chains [1][2], support for different pointer sizes breaks the bitcast instruction, since there is no guarantee that the pointers in the source and destination address spaces are the same size.

Can you comment on whether the need for this seems like a fundamental
need, in your field, or more of a limitation of the current generation of
architectures?

> Overview:
>
> The 'addrspacecast' instruction converts value to type ty2 without changing any bits.

This is mildly imprecise, because the whole point of this instruction is that
it can change the bit width.

> Arguments:
>
> The 'addrspacecast' instruction takes a value to cast, which must be a non-aggregate first class value with a pointer type, and a type to cast it to, which must also be a pointer type. The pointer types of value and the destination type, ty2, must be identical, except for the address space.

Having a "pointer type" is sufficient to imply that it is a
"non-aggregate first class value".

> Semantics:
>
> The 'addrspacecast' instruction converts value to type ty2. It converts the type to the type that is implied by the address space of the destination pointer. If the destination pointer is smaller than the source pointer, the upper bits are truncated. If the inverse is true, the upper bits are sign extended, otherwise the operation is a no-op.

Why sign-extended? Ptrtoint/inttoptr are zero-extended, and it's surprising
that addrspacecast would be different from them.

Dan

From: Dan Gohman [mailto:gohman@apple.com]
Sent: Tuesday, September 11, 2012 1:28 PM
To: Villmow, Micah
Cc: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

>
> From: Villmow, Micah
> Sent: Tuesday, September 11, 2012 12:51 PM
> To: llvm-commits@cs.uiuc.edu
> Subject: Proposal: New IR instruction for casting between address
> spaces
>
> Problem:
> Bit casting between pointers of different address spaces only works if
> all address space pointers are the same size. With the changes from email
> chains [1][2], support for different pointer sizes breaks the bitcast
> instruction, since there is no guarantee that the pointers in the source
> and destination address spaces are the same size.

Can you comment on whether the need for this seems like a fundamental
need, in your field, or more of a limitation of the current generation
of architectures?

[Villmow, Micah] I think this is a little of both. While the current and previous generations of GPU architectures are limited in what they can do by hardware restrictions, I also think there is a fundamental need. Not all devices will run 64-bit operations at full speed (or even have native instructions for them), but memory sizes will soon eclipse what is addressable with 32-bit pointers on non-PC systems. As a result, 32-bit systems need to address into 64-bit memory, and switching over to 64-bit for all address calculations destroys the performance advantage that segmented memory provides.
In the CPU world this isn't much of a problem, as far as I can tell, because it has already been solved (32-bit vs. 64-bit math is 1-1 in most cases), but in non-CPU architectures it is a huge performance penalty (a 64-bit mul runs 6x slower than a 32-bit mul). So being able to switch to 32-bit where it is required and to 64-bit where it is required is a fundamental need that I don't think will go away even if the architectures improve their memory infrastructure.

> Overview:
>
> The 'addrspacecast' instruction converts value to type ty2 without
> changing any bits.

This is mildly imprecise, because the whole point of this instruction is
that it can change the bit width.

[Villmow, Micah] Doh, cut and paste error, will fix it.

>
> Arguments:
>
> The 'addrspacecast' instruction takes a value to cast, which must be
> a non-aggregate first class value with a pointer type, and a type to
> cast it to, which must also be a pointer type. The pointer types of
> value and the destination type, ty2, must be identical, except for the
> address space.

Having a "pointer type" is sufficient to imply that it is a
"non-aggregate first class value".

>
> Semantics:
>
> The 'addrspacecast' instruction converts value to type ty2. It
> converts the type to the type that is implied by the address space of
> the destination pointer. If the destination pointer is smaller than the
> source pointer, the upper bits are truncated. If the inverse is true,
> the upper bits are sign extended, otherwise the operation is a no-op.

Why sign-extended? Ptrtoint/inttoptr are zero-extended, and it's
surprising that addrspacecast would be different from them.

[Villmow, Micah] Take, for example, a pointer representing a negative pointer offset into a 16-bit address space. If this is converted to a 64-bit address space with zero extension, the upper 48 bits would be zero and your negative offset just became positive. The difference between addrspacecast and ptrtoint/inttoptr is that addrspacecast does not explicitly convert to any size, only implicitly, so the bits would need to be filled correctly.
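
The concern above can be illustrated with a small Python sketch. It assumes two's-complement encoding of the offset and models pointer values as plain integers; the helper names zext, sext and as_signed, and the sample offset -4, are hypothetical and chosen only for this example.

```python
def zext(value: int, src_bits: int) -> int:
    """Zero-extend: the bits above src_bits become zero in any wider type."""
    return value & ((1 << src_bits) - 1)


def sext(value: int, src_bits: int, dst_bits: int) -> int:
    """Sign-extend: the bits above src_bits copy the source sign bit."""
    value &= (1 << src_bits) - 1
    sign_bit = 1 << (src_bits - 1)
    return ((value ^ sign_bit) - sign_bit) & ((1 << dst_bits) - 1)


def as_signed(value: int, bits: int) -> int:
    """Read an unsigned bit pattern of the given width as a signed number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


offset16 = -4 & 0xFFFF                            # -4 encoded in 16 bits: 0xFFFC
zero_extended = as_signed(zext(offset16, 16), 64)  # upper 48 bits zero
sign_extended = as_signed(sext(offset16, 16, 64), 64)
```

With zero extension the 64-bit value reads as a positive number, while sign extension preserves the negative offset. Whether a pointer should be treated as a signed offset at all is exactly what the rest of the thread debates.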

Hi,

I view a pointer as pointing to a location in memory and not as an offset relative to some base register. I think the proper semantic here is the same as inttoptr where it does a zero-extension.

  -- Mon Ping

From: Mon P Wang [mailto:monping@apple.com]
Sent: Wednesday, September 12, 2012 1:12 PM
To: Villmow, Micah
Cc: Dan Gohman; llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

[Villmow, Micah] Yeah, but the pointer won't point to the same location if the conversion from a smaller pointer to a larger pointer is zero-extended.
Take two address spaces (1 and 2) that are 16 and 64 bits in size.
int(1) *a = 0xFFFFFFF9;
int(2) *b = *a;
Is b -10(SExt), or is it 4294967289(ZExt)?
This works for inttoptr and ptrtoint because there is an assumption that the pointer is always the same size. Maybe we even need to extend ptrtoint/inttoptr to handle this case by adding unsigned versions?

I think you mean: is it -10 (Sext) or 65529 (Zext from 16b to 64b)?

I would expect the same result if I wrote
  int(1) *a = 0x0FFF9;
  int(2) *b = *a;

In C, integer-to-pointer conversions are implementation-defined and depend on the addressing structure of the execution environment. Given the current definitions of ptrtoint and inttoptr, the addressing structure looks to me like a flat memory model starting from 0, and the value "b" should be 65529. In your example, where we know the largest pointer is 64b, I would expect the final result to be the same as doing a ptrtoint from int(1)* to i64 and an inttoptr to int(2)*.

-- Mon Ping
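
The ptrtoint/inttoptr round trip described above can be sketched in Python. This is an illustrative model only: the functions ptrtoint and inttoptr below are hypothetical stand-ins that follow the documented rule for those instructions (zero-extend when the destination is wider, truncate when it is narrower), with the 16-bit and 64-bit widths taken from the example in the thread.

```python
def ptrtoint(ptr_value: int, ptr_bits: int, int_bits: int) -> int:
    """Model ptrtoint: zero-extend to a wider integer, truncate to a narrower one."""
    ptr_value &= (1 << ptr_bits) - 1
    return ptr_value & ((1 << int_bits) - 1)


def inttoptr(int_value: int, int_bits: int, ptr_bits: int) -> int:
    """Model inttoptr: zero-extend to a wider pointer, truncate to a narrower one."""
    int_value &= (1 << int_bits) - 1
    return int_value & ((1 << ptr_bits) - 1)


a = 0xFFF9                        # 16-bit pointer in address space 1
as_i64 = ptrtoint(a, 16, 64)      # widen through i64
b = inttoptr(as_i64, 64, 64)      # 64-bit pointer in address space 2

# For comparison: the same 16 bits read as a signed two's-complement value.
signed_reading = a - (1 << 16)
```

In this flat-memory model b comes out as 65529. Note that the signed two's-complement reading of 0xFFF9 is -7, so that is the value a sign-extending cast would carry into the wider pointer.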

From: Mon Ping Wang [mailto:monping@apple.com]
Sent: Thursday, September 13, 2012 1:55 AM
To: Villmow, Micah
Cc: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

[Villmow, Micah] So if there is already a way to do this, what really is the benefit of adding a new instruction?
Also, there is a typo in my example: the second assignment should not have the '*'. I can add a new instruction if that is the recommended behavior, but I think it would also be fine to require ptrtoint and inttoptr, although that takes one more instruction.

From: Mon Ping Wang [mailto:monping@apple.com]
Sent: Thursday, September 13, 2012 1:55 AM
To: Villmow, Micah
Cc: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

From: Mon P Wang [mailto:monping@apple.com]
Sent: Wednesday, September 12, 2012 1:12 PM
To: Villmow, Micah
Cc: Dan Gohman; llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting
between address spaces

Hi,

From: Dan Gohman [mailto:gohman@apple.com]
Sent: Tuesday, September 11, 2012 1:28 PM
To: Villmow, Micah
Cc: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

From: Villmow, Micah
Sent: Tuesday, September 11, 2012 12:51 PM
To: llvm-commits@cs.uiuc.edu
Subject: Proposal: New IR instruction for casting between address
spaces

Problem:
Bit casting between pointers of different address spaces only works if
all address space pointers are the same size. With changes from email
chain [1][2], support for different pointer sizes breaks the bitcast
instruction since there is no guarantee that the pointer size for the
address space is on the source and destination arguments are of the
same size.

Can you comment on whether the need for this seems like a fundamental
need, in your field, or more of a limitation of the current generation
of architectures?

[Villmow, Micah] I think this is a little of both. While the current
and previous generation of GPU architectures are limited in what they
are capable of doing based on hardware restrictions, I also think
there is a fundamental need. Not all devices will run with 64bit
operations at full speed (or even have native instructions in any
case), but memory sizes will soon eclipse what is addressable with
32bit pointers on non-PC systems. What this is causing is 32bit
systems requiring addressing into 64bit memory, and switching over to
64bit for address calculations destroys the performance advantage
that the segmented memory provides.

In the CPU world, this isn't that much of a problem that I can tell,
as it has already been solved (32bit vs 64bit math is 1-1 in most
cases), but in non-CPU architectures, this is a huge performance
penalty (64bit mul runs 6x slower than 32bit mul). So being able to
switch to 32bit in the cases where it is required and switch to 64bit
where it is required is a fundamental need that I don't think will go
away even if the architectures improve their memory infrastructure.

Solution:
Remove the ability of bitcast to cast between pointers of different
address spaces and replace with an instruction that handles this case
explicitly.

Proposed changes:
* Add restriction to the verifier on the bitcast instruction
  making bitcasting between address spaces illegal.
* Change documentation[3] to state the bitcast to pointers of
  different address spaces is illegal.
* Add in a new IR node, addrspacecast, that allows conversions
  between address spaces
* Updated the reader/writer to handle these cases
* Update the documentation to insert the new IR node.
* Add the following documentation:
'addrspacecast .. to' Instruction

Syntax:

  <result> = addrspacecast <ty> <value> to <ty2> ; yields ty2

Overview:

The ' addrspacecast ' instruction converts value to type ty2 without
changing any bits.

This is mildly imprecise, because the whole point of this instruction is
that it can change the bit width.

[Villmow, Micah] Doh, cut and paste error, will fix it.

Arguments:

The ' addrspacecast ' instruction takes a value to cast, which must be
a non-aggregate first class value with a pointer type, and a type
to cast it to, which must also be a pointer type. The pointer types
of value and the destination type, ty2, must be identical, except for
the address space.

Having a "pointer type" is sufficient to imply that it is a
"non-aggregate first class value".

Semantics:

The ' addrspacecast ' instruction converts value to type ty2. It
converts the type to the type that is implied by the address space of
the destination pointer. If the destination pointer is smaller than the
source pointer, the upper bits are truncated. If the inverse is true,
the upper bits are sign extended, otherwise the operation is a no-op.

Why sign-extended? Ptrtoint/inttoptr are zero-extended, and it's
surprising that addrspacecast would be different from them.
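As a reading aid, the Semantics paragraph quoted above can be modeled in a few lines of Python. This is only a sketch of the rule as proposed in this thread (truncate, sign-extend, or no-op depending on the two pointer widths), with pointer values modeled as integers; the function name is hypothetical and nothing here is an LLVM API.

```python
def addrspacecast_proposed(value, src_bits, dst_bits):
    """Model the proposed rule: truncate, sign-extend, or leave unchanged."""
    value &= (1 << src_bits) - 1
    if dst_bits < src_bits:
        # Destination pointer is smaller: the upper bits are dropped.
        return value & ((1 << dst_bits) - 1)
    if dst_bits > src_bits:
        # Destination pointer is larger: replicate the source's top bit.
        sign = 1 << (src_bits - 1)
        return ((value ^ sign) - sign) & ((1 << dst_bits) - 1)
    # Same width: no-op.
    return value


print(hex(addrspacecast_proposed(0x1234FFF9, 32, 16)))  # 0xfff9
print(hex(addrspacecast_proposed(0xFFF9, 16, 64)))      # 0xfffffffffffffff9
print(hex(addrspacecast_proposed(0xABCD, 16, 16)))      # 0xabcd
```

The second line is the case the question above is about: ptrtoint/inttoptr would instead produce 0xfff9 in the 64-bit space.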

[Villmow, Micah] Take for example a pointer representing a negative
pointer offset into a 16 bit address space, if this is converted to a
64bit address space, the upper 48 bits would be zero and your
negative offset just became positive. The difference between these
two instruction types is that addrspacecast does not explicitly
convert to any size, only implicitly, so the bits would need to be
filled correctly.

I view a pointer as pointing to a location in memory and not as an
offset relative to some base register. I think the proper semantic
here is the same as inttoptr where it does a zero-extension.

[Villmow, Micah] Yeah, but the pointer won't point to the same
location if the conversion from a smaller pointer to a larger pointer is
zero extended.

Take two address spaces(1 and 2) that are 16 and 64 bits in size.
int(1) *a = 0xFFFFFFF9;
int(2) *b = *a;
Is b -10(SExt), or is it 4294967289(ZExt)?

I think you mean: is it -10 (SExt) or 65529 (ZExt from 16b to 64b)?

I would expect the same result if I wrote
int(1) *a = 0x0FFF9;
int(2) *b = *a;

In C, integer-to-pointer conversions are implementation defined and
depend on the addressing structure of the execution environment.
Given the current definition of ptrtoint and inttoptr, I feel that
the addressing structure is a flat memory model starting from 0
and the value "b" should be 65529. In your example where we know the
largest pointer is 64b, I would expect the final result to be the same
as doing a ptrtoint from int(1) to i64 and an inttoptr to int(2)*.

[Villmow, Micah] So then if there is already a way to do this, what really is the benefit of adding a new instruction?
Also there is a typo in my example, the second assignment should not have the '*'. I can add a new instruction if that
is the recommended behavior, but I think it would also be fine to force ptrtoint and inttoptr, although it does take one instruction more.

The problem with using ptrtoint and inttoptr is that one has to pick an intermediate integer type that is safe to convert to and from. Since the pointer size is target dependent, it seems unnatural to use ptrtoint and inttoptr for that.

-- Mon Ping

If we were to add first class support for this, we'd have to add three new instructions: trunc_ptr, sext_ptr, and zext_ptr, all of which would be used for pointer to pointer conversions.

Why is it so bad to use ptrtoint/inttoptr with some large-enough integer size? Neither solution avoids exposing target information into IR.

-Chris
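The ptrtoint/inttoptr route can be sketched as follows, again in Python with pointer values modeled as integers. Both conversions zero-extend or truncate to the target width, which is how the LLVM language reference defines them; the function names mirror the instructions but are just local helpers, and the widths are the 16-bit and 64-bit address spaces from the example earlier in the thread.

```python
def ptrtoint(value, ptr_bits, int_bits):
    """Pointer to integer: truncate or zero-extend to int_bits."""
    value &= (1 << ptr_bits) - 1
    return value & ((1 << int_bits) - 1)


def inttoptr(value, int_bits, ptr_bits):
    """Integer to pointer: truncate or zero-extend to ptr_bits."""
    value &= (1 << int_bits) - 1
    return value & ((1 << ptr_bits) - 1)


def cast_via_int(value, src_bits, dst_bits, int_bits):
    """Cast between pointer widths through an intermediate integer type."""
    return inttoptr(ptrtoint(value, src_bits, int_bits), int_bits, dst_bits)


# A large-enough intermediate (i64) preserves the 16-bit address.
print(hex(cast_via_int(0xFFF9, 16, 64, 64)))  # 0xfff9
# Too narrow an intermediate (i8) silently drops address bits.
print(hex(cast_via_int(0xFFF9, 16, 64, 8)))   # 0xf9
```

Any intermediate width at least as large as the wider of the two pointers is safe here; the disagreement in the thread is over who is responsible for knowing that width.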

From: Chris Lattner [mailto:clattner@apple.com]
Sent: Thursday, September 13, 2012 2:24 PM
To: Villmow, Micah
Cc: Mon Ping Wang; llvmdev@cs.uiuc.edu Mailing List
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

>>> In C, integer-to-pointer conversions are implementation defined and
>>> depend on the addressing structure of the execution environment.
>>> Given the current definition of ptrtoint and inttoptr, I feel that
>>> the addressing structure is a flat memory model starting from 0
>>> and the value "b" should be 65529. In your example where we know the
>>> largest pointer is 64b, I would expect the final result to be the
>>> same as doing a ptrtoint from int(1) to i64 and an inttoptr to int(2)*.
>> [Villmow, Micah] So then if there is already a way to do this, what
>> really is the benefit of adding a new instruction?
>> Also there is a typo in my example, the second assignment should not
>> have the '*'. I can add a new instruction if that
>> is the recommended behavior, but I think it would also be fine to
>> force ptrtoint and inttoptr, although it does take one instruction
>> more.
>>
>
> The problem with using ptrtoint and inttoptr is that one has to pick
> an intermediate integer type that is safe to convert to and from.
> Since the pointer size is target dependent, it seems unnatural to use
> ptrtoint and inttoptr for that.

If we were to add first class support for this, we'd have to add three
new instructions: trunc_ptr, sext_ptr, and zext_ptr, all of which would
be used for pointer to pointer conversions.

[Villmow, Micah] Would we really need to add pointer specific versions for these? Or could we as an alternative allow sext/zext/trunc to work on pointer types?

The pointer size is target dependent so it seems strange to choose an arbitrary size to convert to and from. Are you making a practical argument that 64b is sufficient on all machines so all targets can use that? In other words, pointers > 64b don't make any sense in terms of the address space? (A pointer could be > 64b if clients want to use some upper bits to track some state, I guess.)

In terms of the three new instructions, one could argue that ptrtoint and inttoptr have the same issue, or that those could also explode in a similar way. Using them seems target dependent, so unless we really want to support all the various addressing structures, I'd rather not have them.

  -- Mon Ping

>>>> In C, integer-to-pointer conversions are implementation defined and
>>>> depend on the addressing structure of the execution
>>>> environment. Given the current definition of ptrtoint and
>>>> inttoptr, I feel that the addressing structure is a flat
>>>> memory model starting from 0 and the value "b" should be 65529.
>>>> In your example where we know the largest pointer is 64b, I
>>>> would expect the final result to be the same as doing a ptrtoint
>>>> from int(1) to i64 and an inttoptr to int(2)*.
>>> [Villmow, Micah] So then if there is already a way to do this,
>>> what really is the benefit of adding a new instruction? Also
>>> there is a typo in my example, the second assignment should not
>>> have the '*'. I can add a new instruction if that is the
>>> recommended behavior, but I think it would also be fine to force
>>> ptrtoint and inttoptr, although it does take one instruction more.
>>>
>>
>> The problem with using ptrtoint and inttoptr is that one has to
>> pick an intermediate integer type that is safe to convert to and
>> from. Since the pointer size is target dependent, it seems
>> unnatural to use ptrtoint and inttoptr for that.
>
> If we were to add first class support for this, we'd have to add
> three new instructions: trunc_ptr, sext_ptr, and zext_ptr, all of
> which would be used for pointer to pointer conversions.
>
> Why is it so bad to use ptrtoint/inttoptr with some large-enough
> integer size? Neither solution avoids exposing target information
> into IR.
>

The pointer size is target dependent so it seems strange to choose an
arbitrary size to convert to and from. Are you making a practical
argument that 64b is sufficient on all machines so all targets can
use that?

If we do it this way, then I'd recommend putting the required integer
size in the TargetData string. Having alternate address spaces with a
pointer size > 64b might certainly be useful for implementing
distributed shared memory, for example.

-Hal
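One way to picture the TargetData idea is the sketch below. It does not implement the new field Hal suggests (no such field is assumed to exist); instead it derives a safe intermediate width from per-address-space pointer entries of the form p[n]:size:abi in a layout string. The layout string itself is a made-up example, and the helper names are hypothetical.

```python
import re


def pointer_sizes(layout):
    """Map address space number to pointer size in bits for each 'p' entry."""
    sizes = {}
    for spec in layout.split("-"):
        match = re.match(r"^p(\d*):(\d+)", spec)
        if match:
            addrspace = int(match.group(1) or 0)  # bare 'p' means address space 0
            sizes[addrspace] = int(match.group(2))
    return sizes


def safe_int_width(layout, src_as, dst_as):
    """Smallest integer width that holds pointers of both address spaces."""
    sizes = pointer_sizes(layout)
    return max(sizes[src_as], sizes[dst_as])


# Hypothetical target: 32-bit default pointers, 16-bit AS1, 64-bit AS2.
layout = "e-p:32:32-p1:16:16-p2:64:64-i64:64"
print(pointer_sizes(layout))         # {0: 32, 1: 16, 2: 64}
print(safe_int_width(layout, 1, 2))  # 64
```

A front end that already emits target-specific address space numbers could make the same lookup before choosing the integer type for a ptrtoint/inttoptr pair.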

The problem with using ptrtoint and inttoptr is that one has to pick
an intermediate integer type that is safe to convert to and from.
Since the pointer size is target dependent, it seems unnatural to use
ptrtoint and inttoptr for that.

If we were to add first class support for this, we'd have to add three
new instructions: trunc_ptr, sext_ptr, and zext_ptr, all of which would
be used for pointer to pointer conversions.

[Villmow, Micah] Would we really need to add pointer specific versions for these?

Yes.

Or could we as an alternative allow sext/zext/trunc to work on pointer types?

No.

-Chris

My point is that any producer of this sort of pointer cast is already necessarily target specific (it is generating target-specific address space numbers!). If the front-end knows the address space to use, it can know a safe integer size to use.

-Chris

It depends on what the address space is used for. If I'm logically partitioning an address space that overlaps, my pointers may all be the same size, so this issue doesn't come up other than that I know the pointer sizes are the same. My understanding is that this is becoming an issue since a pointer type's size could be different for different address spaces. I agree that, for the case where the pointer size is address space dependent, the client has to understand the size and the properties to decide whether they need to do truncation, sign extension or zero extension.

This is a problem for auto upgrade as well. Today, we have bit cast between same size pointers for different address space. We would need to do something special for auto upgrade here since the proposal is to not allow bit cast between pointers of different address spaces.

-- Mon Ping

The pointer size is target dependent so it seems strange to choose an arbitrary size to convert to and from. Are you making a practical argument that 64b is sufficient on all machines so all targets can use that? In other words, pointers > 64b don't make any sense in terms of the address space? (A pointer could be > 64b if clients want to use some upper bits to track some state, I guess.)

In terms of the three new instructions, one could argue that ptrtoint and inttoptr have the same issue, or that those could also explode in a similar way. Using them seems target dependent, so unless we really want to support all the various addressing structures, I'd rather not have them.

My point is that any producer of this sort of pointer cast is already necessarily target specific (it is generating target-specific address space numbers!). If the front-end knows the address space to use, it can know a safe integer size to use.

It depends on what the address space is used for. If I'm logically partitioning an address space that overlap my pointer size may all be the same size so this issue doesn't come up other than I know the pointer size are the same.

Sure, in that case, use bitcast.

My understanding is that is becoming an issue since a pointer type size could be different for different address space. I agree for the case where the pointer size is address space dependent that the client has to understand the size and the properties to decide if they need to do truncation, sign extension or zero extensions.

Right.

This is a problem for auto upgrade as well. Today, we have bit cast between same size pointers for different address space. We would need to do something special for auto upgrade here since the proposal is to not allow bit cast between pointers of different address spaces.

I haven't followed the details of the proposal, but I think it makes perfect sense to continue using bitcast for ptr/ptr casts within the same pointer size. If you do that, then there is no auto-upgrade issue: all existing bc files can just be assumed to have the same pointer size.

-Chris

Chris Lattner wrote:

The pointer size is target dependent so it seems strange to choose an arbitrary size to convert to and from. Are you making a practical argument that 64b is sufficient on all machines so all targets can use that? In other words, pointers > 64b don't make any sense in terms of the address space? (A pointer could be > 64b if clients want to use some upper bits to track some state, I guess.)

In terms of the three new instructions, one could argue that ptrtoint and inttoptr have the same issue, or that those could also explode in a similar way. Using them seems target dependent, so unless we really want to support all the various addressing structures, I'd rather not have them.

My point is that any producer of this sort of pointer cast is already necessarily target specific (it is generating target-specific address space numbers!). If the front-end knows the address space to use, it can know a safe integer size to use.

It depends on what the address space is used for. If I'm logically partitioning an address space that overlap my pointer size may all be the same size so this issue doesn't come up other than I know the pointer size are the same.

Sure, in that case, use bitcast.

My understanding is that is becoming an issue since a pointer type size could be different for different address space. I agree for the case where the pointer size is address space dependent that the client has to understand the size and the properties to decide if they need to do truncation, sign extension or zero extensions.

Right.

This is a problem for auto upgrade as well. Today, we have bit cast between same size pointers for different address space. We would need to do something special for auto upgrade here since the proposal is to not allow bit cast between pointers of different address spaces.

I haven't followed the details of the proposal, but I think it makes perfect sense to continue using bitcast for ptr/ptr casts within the same pointer size.

I don't want whether a module passes the verifier to depend on target data. I also don't want bitcasts that change width to pass the verifier. Hence the thinking was that bitcasts across address spaces would be rejected and replaced with something else (initially ptrtoint/inttoptr pairs were proposed, then a new instruction).

Nick

   If you do that, then there is no auto-upgrade issue: all existing bc files can just be assumed to have the same pointer size.

From: Chris Lattner [mailto:clattner@apple.com]
Sent: Thursday, September 13, 2012 11:53 PM
To: Mon Ping Wang
Cc: Villmow, Micah; llvmdev@cs.uiuc.edu Mailing List
Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
address spaces

>>> The pointer size is target dependent so it seems strange to choose
>>> an arbitrary size to convert to and from. Are you making a practical
>>> argument that 64b is sufficient on all machines so all targets can use
>>> that? In other words, pointers > 64 doesn't make any sense in terms of
>>> the address space? (A pointer to be > 64 if clients want to use some
>>> upper bits to track some state I guess).
>>>
>>> In terms of the three new instructions, one could argue that
>>> ptrtoint and inttoptr has the same issue or those can also explode in a
>>> similar way. To use them, this seems target dependent so unless we
>>> really want to support all the various addressing structures, I rather
>>> not have them.
>>
>> My point is that any producer of this sort of pointer cast is
>> already necessarily target specific (it is generating target-specific
>> address space numbers!). If the front-end knows the address space to
>> use, it can know a safe integer size to use.
>>
>
> It depends on what the address space is used for. If I'm logically
> partitioning an address space that overlap my pointer size may all be
> the same size so this issue doesn't come up other than I know the
> pointer size are the same.

Sure, in that case, use bitcast.

> My understanding is that is becoming an issue since a pointer type
> size could be different for different address space. I agree for the
> case where the pointer size is address space dependent that the client
> has to understand the size and the properties to decide if they need to
> do truncation, sign extension or zero extensions.

Right.

> This is a problem for auto upgrade as well. Today, we have bit cast
> between same size pointers for different address space. We would need
> to do something special for auto upgrade here since the proposal is to
> not allow bit cast between pointers of different address spaces.

I haven't followed the details of the proposal, but I think it makes
perfect sense to continue using bitcast for ptr/ptr casts within the
same pointer size. If you do that, then there is no auto-upgrade
issue: all existing bc files can just be assumed to have the same
pointer size.

[Villmow, Micah] So basically we don't need a new IR instruction, but instead
1) bitcasts between pointers of different sizes are illegal; the proper approach is inttoptr/ptrtoint.
2) bitcasts between pointers of the same size stay legal.
3) No new IR instruction is needed, as converting between pointers of different sizes requires inttoptr/ptrtoint.

The only issues are then to update the verifier to assert on bitcasts between pointers of different sizes and add in auto-upgrade of binaries to switch to inttoptr/ptrtoint. By doing this, I then can clear the way for allowing LLVM to support multiple pointer sizes.

Sound good?

Micah

Makes sense to me!

-Chris
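The outcome agreed above can be summarized as a small decision rule. The sketch below is only an illustration in Python, not verifier code: the function names are hypothetical, and pointer sizes are passed in as plain bit widths.

```python
def bitcast_is_legal(src_bits, dst_bits):
    """Rules 1 and 2: a pointer bitcast is legal only between equal sizes."""
    return src_bits == dst_bits


def lower_pointer_cast(src_bits, dst_bits):
    """Pick the instruction sequence a producer would emit for a pointer cast."""
    if bitcast_is_legal(src_bits, dst_bits):
        return ["bitcast"]
    # Rule 3: different sizes go through an integer, costing one extra instruction.
    return ["ptrtoint", "inttoptr"]


print(lower_pointer_cast(64, 64))  # ['bitcast']
print(lower_pointer_cast(16, 64))  # ['ptrtoint', 'inttoptr']
```

Because existing bitcode can be assumed to use a single pointer size, every bitcast already in such files falls under the first branch.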

Also, pointers on AS/400 are 16 bytes long (i.e. 128 bits), if I recall correctly.

-K