[RFC] Introducing the maxobjsize attribute

Hi All,

We’ve prepared a new attribute maxobjsize(<n>) that tracks the maximum size of the object that a pointer points to. The attribute will be deduced through the Attributor framework and is used for aliasing queries. The maxobjsize of an object and its number of dereferenceable bytes can be used as upper and lower bounds on the object size; if the bounds for two pointers do not overlap, we can conclude that the underlying objects cannot alias. Basically, an object that is at most N bytes long cannot alias one that is at least N+1 bytes long.
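
The bounds argument above can be sketched as a small interval check. This is only an illustration, not the code in the patches: `SizeBounds` and `sizesProveDistinctObjects` are hypothetical names, and `MinSize` stands in for whatever lower bound is available (the proposal uses the dereferenceable bytes).

```cpp
#include <cstdint>

// Hypothetical summary of what is known about the object a pointer is based on.
struct SizeBounds {
  std::uint64_t MinSize; // lower bound on the object size, in bytes
  std::uint64_t MaxSize; // upper bound on the object size, in bytes
};

// True if no single object size can satisfy both sets of bounds, in which
// case the two underlying objects must be different objects.
constexpr bool sizesProveDistinctObjects(SizeBounds A, SizeBounds B) {
  return A.MaxSize < B.MinSize || B.MaxSize < A.MinSize;
}

// "At most 16 bytes" versus "at least 17 bytes": the ranges are disjoint.
static_assert(sizesProveDistinctObjects(SizeBounds{0, 16},
                                        SizeBounds{17, UINT64_MAX}),
              "disjoint size ranges");
// "At most 16 bytes" versus "at least 16 bytes": size 16 fits both.
static_assert(!sizesProveDistinctObjects(SizeBounds{0, 16},
                                         SizeBounds{16, UINT64_MAX}),
              "overlapping size ranges prove nothing");
```

The check only says something about the underlying objects; when the ranges overlap, the answer is simply "unknown".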

These changes are in:

  • D87975 - [IR] Introduce MaxObjSize Attribute
  • D87978 - [Attributor] Adds deduction for the MaxObjSize Attribute
  • D88353 - [BasicAA] Integrate MaxobjSize for NoAlias

These are the Statistics changes for CTMark without the actual deduction (https://reviews.llvm.org/D88353#2301597):
CHANGED: branch-folder NumHoist 438 → 431 ( -1.598%)
CHANGED: codegenprepare NumBlocksElim 16093 → 15885 ( -1.292%)
CHANGED: codegenprepare NumExtsMoved 6373 → 6439 ( +1.036%)
CHANGED: gvn IsValueFullyAvailableInBlockNumSpeculationsMax 6746 → 6858 ( +1.660%)
CHANGED: gvn NumGVNInstr 78434 → 79330 ( +1.142%)
CHANGED: instcombine NumReassoc 22830 → 23213 ( +1.678%)
CHANGED: instsimplify NumSimplified 21278 → 21495 ( +1.020%)
CHANGED: licm NumPromoted 407 → 497 ( +22.113%)
CHANGED: loop-rotate NumNotRotatedDueToHeaderSize 37 → 35 ( -5.405%)
CHANGED: loop-simplify NumNested 126 → 128 ( +1.587%)
CHANGED: machinelicm NumPostRAHoisted 131 → 134 ( +2.290%)
CHANGED: memory-builtins ObjectVisitorLoad 96077 → 97496 ( +1.477%)
CHANGED: regalloc NumDCEFoldedLoads 38 → 37 ( -2.632%)
CHANGED: regalloc NumLaneConflicts 4408 → 4332 ( -1.724%)
CHANGED: regalloc NumReloadsRemoved 1062 → 1050 ( -1.130%)
CHANGED: regalloc NumSnippets 1168 → 1152 ( -1.370%)
CHANGED: regalloc NumSpillsRemoved 672 → 665 ( -1.042%)
CHANGED: stack-slot-coloring NumDead 14 → 18 ( +28.571%)
CHANGED: twoaddressinstruction NumConvertedTo3Addr 27054 → 26695 ( -1.327%)

These are the Statistic Changes in CTMark w/O3 before/after these patches (https://reviews.llvm.org/D87978#2307622):
CHANGED: codegenprepare NumExtsMoved 3631 → 3699 ( +1.873%)
CHANGED: dse NumFastOther 192 → 194 ( +1.042%)
CHANGED: gvn IsValueFullyAvailableInBlockNumSpeculationsMax 4958 → 5060 ( +2.057%)
CHANGED: gvn NumGVNInstr 46657 → 47534 ( +1.880%)
CHANGED: jump-threading NumDupes 91 → 92 ( +1.099%)
CHANGED: licm NumMovedLoads 6272 → 6344 ( +1.148%)
CHANGED: licm NumPromoted 381 → 438 ( +14.961%)
CHANGED: loop-rotate NumNotRotatedDueToHeaderSize 31 → 29 ( -6.452%)
CHANGED: machinelicm NumPostRAHoisted 88 → 89 ( +1.136%)
CHANGED: memdep NumCacheNonLocalPtr 1005887 → 1016671 ( +1.072%)
CHANGED: memory-builtins ObjectVisitorLoad 62048 → 63473 ( +2.297%)
CHANGED: peephole-opt NumCmps 532 → 526 ( -1.128%)
CHANGED: regalloc NumDCEFoldedLoads 27 → 26 ( -3.704%)
CHANGED: regalloc NumLocalSplits 1891 → 1870 ( -1.111%)

Feedback Welcome.

Atmn and Johannes

This is commingling two separate concerns. At minimum, a wording clarification is needed; it's possible the proposed use case does not work.

Dereferenceability is the amount of space which can be accessed without a runtime fault. Of key importance is that dereferenceability is disconnected from object size. There may be space beyond an object which is dereferenceable, but outside the object.

As a simple example, imagine an allocator which allocates 32 byte blocks of memory, 32 byte aligned. If the actual object allocated is only 16 bytes, the pointer is still known to be 32 byte aligned and dereferenceable for 32 bytes. The contents past the object are simply unspecified.

Saying that a 32 byte dereferenceable pointer doesn't alias one with a maximum object size of 16 bytes would be wrong and lead to miscompiles in practice.
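
A toy model of this allocator scenario, under the reading that dereferenceability is independent of object size, might look as follows. Everything here is hypothetical: `PtrFacts`, `naiveNoAlias`, and the concrete address are made up for illustration and do not correspond to BasicAA code.

```cpp
#include <cstdint>

// Hypothetical per-pointer facts; the field names are illustrative only.
struct PtrFacts {
  std::uint64_t Address;    // concrete address, known only in this toy model
  std::uint64_t DerefBytes; // bytes that can be accessed without a fault
  std::uint64_t MaxObjSize; // upper bound on the size of the pointed-to object
};

// The rule under discussion: treat DerefBytes as a lower bound on object size.
constexpr bool naiveNoAlias(PtrFacts A, PtrFacts B) {
  return A.MaxObjSize < B.DerefBytes || B.MaxObjSize < A.DerefBytes;
}

// One 32-byte block holding a 16-byte object, seen through two sets of facts.
constexpr PtrFacts FromAllocator{0x1000, 32, UINT64_MAX}; // block: deref 32
constexpr PtrFacts FromObject{0x1000, 16, 16};            // object: 16 bytes

// Both descriptions refer to the same address ...
static_assert(FromAllocator.Address == FromObject.Address, "same pointer");
// ... yet the rule concludes that they cannot alias.
static_assert(naiveNoAlias(FromAllocator, FromObject),
              "rule claims no-alias for identical addresses");
```

Whether both sets of facts may legally be attached to the same pointer is exactly the point in dispute in the rest of the thread.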

We already perform exactly this deduction in BasicAA right now, except that max object size is not made explicit.
You just assumed the allocated object is 16 bytes and therefore it will imply `maxobjsize(16)` while the underlying memory region is `dereferenceable(32)`.
You cannot have both in our object-driven model.

~ Johannes

Then, as I stated, at minimum, you have a wording problem. In my example, I can reasonably state the object size is 16 bytes. If it's not legal to also state "maxobjsize(16)" you need to either a) pick another attribute name, or b) be very very pedantic about defining the terminology. I'll note that neither the original email nor your response does the latter.

I'll also note that the existing LangRef wording for dereferenceability says nothing about object sizes. You seem to be implying the opposite, but I don't follow your claim as it doesn't seem to match the actual wording.

The problem you describe, at least the way I understand it, is not with maxobjsize but with the current (and here proposed) use of dereferenceable.

We do (for a while now) use dereferenceable as a lower bound of an object without specifying so in the LangRef. (I don't oppose that we do though.)
At the same time we would (conceptually) extend dereferenceable for a pointer if we know a second object is placed right behind the first.
I don't think that can ever happen if the addresses are not given by the user though.

Going back to the allocator example, the allocator should not say it allocated 32 bytes or it should say the object it allocated is 32 bytes.
Having only one of the two breaks our model. I'm unsure what to change for this though because I think maxobjsize in this example is 32 if the
allocator returns 32 usable bytes, or the dereferenceable is and will be 16 if the bytes are not usable.

Maybe it's just late and I miss your point :(

~ Johannes

The problem you describe, at least the way I understand it, is not with maxobjsize but with the current (and here proposed) use of dereferenceable.

We do (for a while now) use dereferenceable as a lower bound of an object without specifying so in the LangRef. (I don't oppose that we do though.)

If the implementation of the optimizer disagrees with the specification (LangRef), we should fix one or the other. In particular, we should do that *before* extending the problematic implementation. In this particular case, I believe the specification is correct and we may need to revisit problematic optimizations. (Well, given current knowledge; I may change my mind with more info.)

Can you give a pointer to the specific bit of optimizer reasoning you're concerned by? I think it would help to discuss this concretely instead of abstractly.

At the same time we would (conceptually) extend dereferenceable for a pointer if we know a second object is placed right behind the first.
I don't think that can ever happen if the addresses are not given by the user though.

It can. Consider the vectorizer deciding to version a loop for alignment purposes. I'm not 100% sure we actually do this today (haven't checked), but it's definitely a common enough transform to be considered. Now, this is a slightly different case, as we'd have a contextual fact about the larger dereferenceable range rather than a base fact on the object, but I'd really not want to split that hair. Combining that hair splitting with (say) the returned attribute on some call down the dominated path would get error prone real fast.

Going back to the allocator example, the allocator should not say it allocated 32 bytes or it should say the object it allocated is 32 bytes.
Having only one of the two breaks our model. I'm unsure what to change for this though because I think maxobjsize in this example is 32 if the
allocator returns 32 usable bytes, or the dereferenceable is and will be 16 if the bytes are not usable.

I disagree with this. The size of the object is the size of the object. This may for instance be a call to malloc with a runtime computable size. It's entirely reasonable and expected to support both "minimum dereferenceable span is 32 bytes" and "can allocate objects of arbitrary size".

I honestly do not understand your points. I don't say allocators cannot allocate objects of arbitrary size. I say that if you allocate an object of size X, it has X dereferenceable bytes and maxobjsize is X. It is wrong to mark your allocator (manually, since we don't do it for you) the way you described it earlier, e.g., that the "object" is smaller than the "allocated memory that is visible/usable". That doesn't mean you cannot allocate more memory, but it cannot be exposed outside the allocator. (FWIW, similarly, `noalias` for allocator return values only works because we cannot look into them.)

I also did not follow the vectorizer comment. Neither the vectorizer nor any other pass can assume relative ordering between objects. That would break everything. It can extend or merge objects/allocations, but that is going to impact both dereferenceable and maxobjsize.

Can you provide a small example where you think things go wrong, with or without this attribute? I know you asked me for an example but as I said, the issue I described should not occur in practice as we don't allow users to specify the addresses of objects in IR (only later).

~ Johannes

Johannes,

I'm going to have to reverse this on you. I have expressed concern with the design. Please treat those concerns as blocking review comments. (i.e. please do not land until reconciled) I have tried to explain in writing, and that doesn't seem to be working. Let's talk offline.

Philip

Slight change of plans.

After talking to Philip, we agreed that it's unclear whether we want to
use `dereferenceable` for the object-based property or the
"hardware-based" property, that is, whether it describes the remaining
extent of the allocated object or of the memory region that can be
accessed without causing observable behavior.

Since `dereferenceable` was used only out of convenience (it already
exists), there is a good argument for going with the other option
instead: create a "min-object-size" attribute to complement the
`maxobjsize` attribute proposed at the start of this thread. Rather than
introducing two attributes, we'll go with a single one that takes two
values, a lower and an upper bound on the object size; both can always
be provided. Given the way attributes are implemented right now, the min
and max size will probably be limited to ~2^32, though that seems OK to
me. (Bigger objects just don't get an attribute.)

The code will be updated shortly and the logic in BasicAA will be adjusted
to use the minimal object size not the dereferenceable bytes. We'll also
teach the ObjectSize deduction about the new attribute.
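
As a rough sketch of what a two-value attribute with a ~2^32 limit could look like, here is one possible encoding. It assumes, without confirmation from the patches, that both bounds share a single 64-bit payload with 32 bits each; `ObjSizeRange`, `isEncodable`, `pack`, and `unpack` are hypothetical names.

```cpp
#include <cstdint>

// Assumption: each bound gets 32 bits of a single 64-bit attribute payload.
constexpr std::uint64_t MaxEncodableSize = 0xFFFFFFFFull;

// Hypothetical lower and upper bound on an object's size, in bytes.
struct ObjSizeRange {
  std::uint64_t Min;
  std::uint64_t Max;
};

// Ranges that do not fit are not encoded; such objects get no attribute.
constexpr bool isEncodable(ObjSizeRange R) {
  return R.Min <= R.Max && R.Max <= MaxEncodableSize;
}

// Store Min in the high 32 bits and Max in the low 32 bits.
constexpr std::uint64_t pack(ObjSizeRange R) {
  return (R.Min << 32) | R.Max;
}

// Recover the two bounds from a packed payload.
constexpr ObjSizeRange unpack(std::uint64_t Raw) {
  return ObjSizeRange{Raw >> 32, Raw & MaxEncodableSize};
}

static_assert(isEncodable(ObjSizeRange{16, 32}), "small range fits");
static_assert(!isEncodable(ObjSizeRange{0, MaxEncodableSize + 1}),
              "too large: no attribute");
static_assert(unpack(pack(ObjSizeRange{16, 32})).Min == 16, "round trip");
static_assert(unpack(pack(ObjSizeRange{16, 32})).Max == 32, "round trip");
```

A trivial range of `{0, MaxEncodableSize}` is always a valid, if uninformative, pair of bounds, which matches the remark that both values can always be provided.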

Please let us know if you have concerns with this, any feedback is welcome.

~ Johannes