Casting between address spaces and address space semantics

Hi Mon Ping,

I don't have a problem having another class, TargetAddrSpace, to store this
information. However, I don't think it make sense being a standalone pass.
Address spaces seems to part of the TargetData and it seems more natural
to ask the TargetData to return the TargetAddrSpace object (similar to
struct layout) to describe the relationships between address spaces.

This is pretty much what I did in my first patch, but Chris didn't like it.
Currently, the TargetData class can be completely described using it's string
representation (which is also stored inside a module). Adding address space
information to target data breaks this, unless we also make a string
representation of the address spaces that we can put in a module.

I think that having a seperate class is somewhat cleaner, since it also makes
sure that this info is only made available to the passes that actually use it.

BTW, there is a comment in TargetAddrspaces.h that indicate the default is
that all address spaces are equivalent. I assume you meant disjoint here.

Uh, yeah :slight_smile:

The last part of this patch is an addition to InstCombine to make use of
this information: It removes any bitcasts from a subset to a superset
address space. It gets at the address space information by requiring the
TargetAddrspaces analysis, which will give it the default implementation in
all current tools.

For the case of a GetElementPointer, we are replacing a bitcast to a
pointer, getelem with a getelem bitcast. The assumption is the latter
bitcast will hopefully go away when we iterate through those uses.

Uh? Is this a comment about what the current code or my patch does, or what it
should do? I don't understand what you mean here.

Mon Ping suggests using address space information for alias analysis as
well, which seems to make sense. In effect this is a form of type-based
alias analysis, but different address spaces don't preclude pointers from
being equal. A problem here is that pointers in disjoint address spaces
would be marked as not aliasing, but when the default relation is disjoint
this is not so conservative. This might require an extra option "Unknown",
which can be used as the default instead of "Disjoint". For Unknown, any
pass can do the conservative thing.

What I'm suggesting is that Alias Analysis can be a client to where we
store address space information. In the example you gave, alias analysis
will examine two memory locations, ask the TargetAddressSpace what the
relationship is and if it is disjoint, it will return no alias. If the
address spaces are in subset relationship, the alias analysis returns maybe
unless it has more information. If a client doesn't tell the compiler the
correct address space information, the client shouldn't expect correct
answers from coming out of the compiler.

True, anyone actually using address space should make sure that this info is
correct anyway. So, no need for an unknown default?

Lastly, I'm still not so sure if InstCombine is the right place for this
simplification. This needs some more thought, but currently it is a problem
that instcombine does not process BitCastConstantExprs. I might end up
writing a seperate pass for just this.

I'm not sure either. At some level, what we want is to propagate the most
precise address space (or restrict) information to its use.

Exactly.

This means that ideally we would want to be able to handle copies of the
value stored in some temporary and track it all the way through to it use.
InstCombine will not handle this case, e.g, address space 1 is a subset of 2
   int<1>* ptr = ...
   int<2>* ptr2 = ptr1+4
   *ptr2 = ...

Won't this code produce a bitcast in the IR, which can be propagated? My
current patch doesn't do this, but it should be easy to extend it to also
propagate a bitcast past pointer arithmetic.

Ie, it should change

%tmp = bitcast i32 addrspace(1)* %ptr1 to i32 addrspace(2)*
%ptr2 = add i32 addrspace(2)* %tmp, 4
store i32 0 i32 addrspace(2)* %ptr2

to

%tmp = add i32 addrspace(1)* %ptr, 4
%ptr2 = bitcast %tmp to i32 addrspace(2)
store i32 0 i32 addrspace(2)* %ptr2

and then to

%ptr2 = add i32 addrspace(1)* %ptr, 4
store i32 0 i32 addrspace(1)* %ptr2

Which shouldn't be too hard?

Coming to think of it, the above won't even use the add instruction, but will
probably use a GEP to do the +4. The current code already propagates bitcasts
past GEP instructions. Also, I suspect that our C frontends will already
produce the second version of the IR for the C code you give.

Or am I totally missing the point you are making here?

Gr.

Matthijs

Hi Matthijs,

I don’t have a problem having another class, TargetAddrSpace, to store this

information. However, I don’t think it make sense being a standalone pass.

Address spaces seems to part of the TargetData and it seems more natural

to ask the TargetData to return the TargetAddrSpace object (similar to

struct layout) to describe the relationships between address spaces.

This is pretty much what I did in my first patch, but Chris didn’t like it.
Currently, the TargetData class can be completely described using it’s string
representation (which is also stored inside a module). Adding address space
information to target data breaks this, unless we also make a string
representation of the address spaces that we can put in a module.

I think that having a seperate class is somewhat cleaner, since it also makes
sure that this info is only made available to the passes that actually use it.

If I remember correctly, I was also not fond of passing another TargetAddrSpace reference to the TargetData object. I was hoping that we could encode the information as a target description string like we do for ABI information. I just don’t want to end up with too many objects that describe the machine. One can argue that we shouldn’t pollute the TargetData since it describes the ABI alignment, size, and object layoutbut I feel that this data fits naturally there. If you and other people feel it is cleaner with a separate pass, I’m fine with it.

I want to treat my next point with some delicacy as I don’t want to start a religious war. I just want to get clarification from the community on the use of multiple inheritance for the case of Phases like AllDisjointAddrspaces. From what I can gather, the use of multiple inheritance is to separate the interface (TargetAddrSpace) to access data from the interface of the phase (ImmutablePhase). In this case, will we ever create a concrete class from TargetAddrSpace that doesn’t also derive from ImmutablePass? If not, I don’t think is worth using multiple inheritance in this case.

[Deleted text]

The last part of this patch is an addition to InstCombine to make use of

this information: It removes any bitcasts from a subset to a superset

address space. It gets at the address space information by requiring the

TargetAddrspaces analysis, which will give it the default implementation in

all current tools.

For the case of a GetElementPointer, we are replacing a bitcast to a

pointer, getelem with a getelem bitcast. The assumption is the latter

bitcast will hopefully go away when we iterate through those uses.

Uh? Is this a comment about what the current code or my patch does, or what it
should do? I don’t understand what you mean here.

My comment was more on what I thought the patch did and I wanted to confirm that it will cleanup newly generated bit cast that are created.

[Deleted Text]

What I’m suggesting is that Alias Analysis can be a client to where we

store address space information. In the example you gave, alias analysis

will examine two memory locations, ask the TargetAddressSpace what the

relationship is and if it is disjoint, it will return no alias. If the

address spaces are in subset relationship, the alias analysis returns maybe

unless it has more information. If a client doesn’t tell the compiler the

correct address space information, the client shouldn’t expect correct

answers from coming out of the compiler.

True, anyone actually using address space should make sure that this info is
correct anyway. So, no need for an unknown default?

That is my feeling.

Lastly, I’m still not so sure if InstCombine is the right place for this

simplification. This needs some more thought, but currently it is a problem

that instcombine does not process BitCastConstantExprs. I might end up

writing a seperate pass for just this.

I’m not sure either. At some level, what we want is to propagate the most

precise address space (or restrict) information to its use.

Exactly.

This means that ideally we would want to be able to handle copies of the

value stored in some temporary and track it all the way through to it use.

InstCombine will not handle this case, e.g, address space 1 is a subset of 2

int<1>* ptr = …

int<2>* ptr2 = ptr1+4

*ptr2 = …

Won’t this code produce a bitcast in the IR, which can be propagated? My
current patch doesn’t do this, but it should be easy to extend it to also
propagate a bitcast past pointer arithmetic.

Ie, it should change

%tmp = bitcast i32 addrspace(1)* %ptr1 to i32 addrspace(2)*
%ptr2 = add i32 addrspace(2)* %tmp, 4
store i32 0 i32 addrspace(2)* %ptr2

to

%tmp = add i32 addrspace(1)* %ptr, 4
%ptr2 = bitcast %tmp to i32 addrspace(2)
store i32 0 i32 addrspace(2)* %ptr2

and then to

%ptr2 = add i32 addrspace(1)* %ptr, 4
store i32 0 i32 addrspace(1)* %ptr2

Which shouldn’t be too hard?

Coming to think of it, the above won’t even use the add instruction, but will
probably use a GEP to do the +4. The current code already propagates bitcasts
past GEP instructions. Also, I suspect that our C frontends will already
produce the second version of the IR for the C code you give.

Or am I totally missing the point you are making here?

No, you got my point even though my example is not a good one. If the address calculation was using a variable, I don’t think we can fold it into the GEP and we might lose this information. The point I was trying to make is that the information needs to be propagated through any address calculation when possible.

Cheers,
– Mon Ping

Hi Mon Ping,

If I remember correctly, I was also not fond of passing another
TargetAddrSpace reference to the TargetData object. I was hoping that we
could encode the information as a target description string like we do for
ABI information. I just don't want to end up with too many objects that
describe the machine. One can argue that we shouldn't pollute the
TargetData since it describes the ABI alignment, size, and object
layoutbut I feel that this data fits naturally there. If you and other
people feel it is cleaner with a separate pass, I'm fine with it.

Perhaps encoding it in TargetData makes sense. However, I was avoiding this
for now, since Chris commented a while back that he wanted to have it in
TargetData "only if absolutely required".

However, thinking of this a bit more I do see your point about TargetData.
Another interesting advantage is that it would be a lot easier to make things
consistent between clang and LLVM, by simply using the TargetData string
that gets embedded in the module.

So, I guess embedding this info in TargetData makes more sense. How would this
look like? I would think of something like:

as1:<2:=3:>4:!3

This would mean address space 1 is a subset of 2, equivalent to 3, a superset
of 4 and disjoint with 3. A number of these could be present in a TargetData
string, to fully describe the situation. Any relations not described mean
disjoint. Relations can also be implicitely defined, ie, as1:<2-as2:<3 also
implies as1:<3.

I'm not sure if the > should be present, since that's always reversible. Also,
! is probably not so nice for disjoint, any other suggestions?

As to implementing this, I'm thinking of a equivalency table (for every
address space, store the equivalent address space with the lowest id) and for
each of those lowest id spaces in each equivalency group store a set of
address spaces that are a subset. This set should be complete, so when the
string says A > B > C, the set should store both B and C as subsets of A.

This allows for resolving the relation between two address spaces by two
lookups in the equivalency table and (one or) two lookups in the subset table.
No results in the subset table means the relation is disjoint, then.

Any comments on this? Chris, would this be acceptable?

I want to treat my next point with some delicacy as I don't want to start a
religious war. I just want to get clarification from the community on the
use of multiple inheritance for the case of Phases like
AllDisjointAddrspaces. From what I can gather, the use of multiple
inheritance is to separate the interface (TargetAddrSpace) to access data
from the interface of the phase (ImmutablePhase). In this case, will we
ever create a concrete class from TargetAddrSpace that doesn't also derive
from ImmutablePass? If not, I don't think is worth using multiple
inheritance in this case.

I think you are right here, changing the inheritance in this way also works
fine.

For the case of a GetElementPointer, we are replacing a bitcast to a
pointer, getelem with a getelem bitcast. The assumption is the latter
bitcast will hopefully go away when we iterate through those uses.

Uh? Is this a comment about what the current code or my patch does, or what
it should do? I don't understand what you mean here.

My comment was more on what I thought the patch did and I wanted to confirm
that it will cleanup newly generated bit cast that are created.

In that case, yes, the newly generated bitcasts should be iteratively cleaned
up whenever possible.

True, anyone actually using address space should make sure that this info
is correct anyway. So, no need for an unknown default?

That is my feeling.

Ok.

No, you got my point even though my example is not a good one. If the
address calculation was using a variable, I don't think we can fold it into
the GEP and we might lose this information.

Ie, a variable that is stored to, you mean? In that case, the address space is
probably propagated until the store instruction. Perhaps it can even be
propagated through the store instruction, so it stores to a bitcasted pointer
(ie, bitcast i32 addrspace(2)* * to i32 addrspace(1) * *). I suspect other
parts of instcombine might handle this from here and change the variable's
type to i32 addrspace(1), if possible. I guess this is something for later on.

The point I was trying to make is that the information needs to be
propagated through any address calculation when possible.

Yes, but that shouldn't be too hard to add to the existing code.

Gr.

Matthijs

Hi Matthijs,

Hi Mon Ping,

If I remember correctly, I was also not fond of passing another
TargetAddrSpace reference to the TargetData object. I was hoping that we
could encode the information as a target description string like we do for
ABI information. I just don't want to end up with too many objects that
describe the machine. One can argue that we shouldn't pollute the
TargetData since it describes the ABI alignment, size, and object
layoutbut I feel that this data fits naturally there. If you and other
people feel it is cleaner with a separate pass, I'm fine with it.

Perhaps encoding it in TargetData makes sense. However, I was avoiding this
for now, since Chris commented a while back that he wanted to have it in
TargetData "only if absolutely required".

However, thinking of this a bit more I do see your point about TargetData.
Another interesting advantage is that it would be a lot easier to make things
consistent between clang and LLVM, by simply using the TargetData string
that gets embedded in the module.

So, I guess embedding this info in TargetData makes more sense. How would this
look like? I would think of something like:

as1:<2:=3:>4:!3

This would mean address space 1 is a subset of 2, equivalent to 3, a superset
of 4 and disjoint with 3. A number of these could be present in a TargetData
string, to fully describe the situation. Any relations not described mean
disjoint. Relations can also be implicitely defined, ie, as1:<2-as2:<3 also
implies as1:<3.

I'm not sure if the > should be present, since that's always reversible. Also,
! is probably not so nice for disjoint, any other suggestions?

I'm thinking it is sufficient to just have subset and equivalence. If superset makes the description more concise, it might be worth to put in. Since the default is disjoint, I don't think we need it. BTW, in the example, I assume you are just showing syntax as the system would flag an error if 1 is equivalent to 3 and disjoint to 3.

As to implementing this, I'm thinking of a equivalency table (for every
address space, store the equivalent address space with the lowest id) and for
each of those lowest id spaces in each equivalency group store a set of
address spaces that are a subset. This set should be complete, so when the
string says A > B > C, the set should store both B and C as subsets of A.

This allows for resolving the relation between two address spaces by two
lookups in the equivalency table and (one or) two lookups in the subset table.
No results in the subset table means the relation is disjoint, then.

I think this is a reasonable way of going here.

Any comments on this? Chris, would this be acceptable?

I want to treat my next point with some delicacy as I don't want to start a
religious war. I just want to get clarification from the community on the
use of multiple inheritance for the case of Phases like
AllDisjointAddrspaces. From what I can gather, the use of multiple
inheritance is to separate the interface (TargetAddrSpace) to access data
from the interface of the phase (ImmutablePhase). In this case, will we
ever create a concrete class from TargetAddrSpace that doesn't also derive
from ImmutablePass? If not, I don't think is worth using multiple
inheritance in this case.

I think you are right here, changing the inheritance in this way also works
fine.

For the case of a GetElementPointer, we are replacing a bitcast to a
pointer, getelem with a getelem bitcast. The assumption is the latter
bitcast will hopefully go away when we iterate through those uses.

Uh? Is this a comment about what the current code or my patch does, or what
it should do? I don't understand what you mean here.

My comment was more on what I thought the patch did and I wanted to confirm
that it will cleanup newly generated bit cast that are created.

In that case, yes, the newly generated bitcasts should be iteratively cleaned
up whenever possible.

True, anyone actually using address space should make sure that this info
is correct anyway. So, no need for an unknown default?

That is my feeling.

Ok.

No, you got my point even though my example is not a good one. If the
address calculation was using a variable, I don't think we can fold it into
the GEP and we might lose this information.

Ie, a variable that is stored to, you mean? In that case, the address space is
probably propagated until the store instruction. Perhaps it can even be
propagated through the store instruction, so it stores to a bitcasted pointer
(ie, bitcast i32 addrspace(2)* * to i32 addrspace(1) * *). I suspect other
parts of instcombine might handle this from here and change the variable's
type to i32 addrspace(1), if possible. I guess this is something for later on.

I think I meant that the address space information needs to be propagated through any temporary that is used to hold the value that might be using a different address spaces. In general, this isn't difficult but one just have to be aware of it when manipulating pointers.

The point I was trying to make is that the information needs to be
propagated through any address calculation when possible.

Yes, but that shouldn't be too hard to add to the existing code.

I agree,

Cheers,
   -- Mon Ping