[RFC] dereferenceable metadata

Hi,
While working on PR21780, I used "dereferenceable_or_null" metadata
and I realized now that it is not correct for my solution to use this
metadata type since it might point to an address that it is not
dereferenceable but null. I think that we need another new metadata
type, something like "dereferenceable" with that we could annotate
any load (not just pointer type like with dereferenceable_or_null).
For example, we could annotate this load : "%ld0 = load double,
double* %ptr, align 8" with dereferenceable<2> and that means that for
%ptr address memory with length 16-bytes is known to be
accessible(dereferenceable). Originally in PR21780, InstCombine pass
removed some loads as dead(in a Scalar IR form) and later that
prevented us to vectorize the operation, but before removing such
loads we could annotate the remaining loads in the series with
"dereferenceable" and later restore IR in a way that is suitable for
vectorization(see ⚙ D35139 [SLPVectorizer][InstCombine] Fix PR21780 Expansion of 256 bit vector loads fails to fold into shuffles ). Also, please note
that’s the information that is lost when InstCombine kills the last
load in the series and there is no way to restore this information
later in following passes.
                                         Thanks, Dinar.

When a pointer is passed to load/store, it's already implicit that it is dereferenceable for the size of the access (otherwise it would be undefined behavior).
Isn't that information sufficient for your use case? (sorry, didn't read the whole thread carefully).

Nuno

Indeed. But the problem here is that Dinar is trying to keep information after a load/store is removed by instcombine

For example:

v4sf v = {p[0], p[1], p[2], p[3]};
v4sf v2 = shuffle(v, 0, 0, 2, 2);

Some pass comes in and removes the p[3] and p[1].

Now you have smaller code, but lost the ability to use a vector load for all those values + shuffle. The code got scalarized because we lost the information that p[3] is valid.

The attribute on a value/load/something might not be ideal (especially since we’d be changing other loads (ones we didn’t remove)). But I think Dinar wanted to start a conversation about how we can keep this information around even after we delete some loads.

Thank you,

Filipe

it's already implicit that it is dereferenceable for the size of the access (otherwise it would be undefined behavior).

The values we are loading after restoring those loads are not matter
for us and it is just that chain of loads for vectorization that
important for us. And those load would never get rematerialization if
vectorization is not profitable for the shuffle operation. Once we
remove the load in instcombine and there is no way to know that memory
is dereferenceable from the previous load in the chain of loads.
                                Thanks, Dinar.

it's already implicit that it is dereferenceable for the size of the access (otherwise it would be undefined behavior).

The values we are loading after restoring those loads are not matter
for us and it is just that chain of loads for vectorization that
important for us. And those load would never get rematerialization if
vectorization is not profitable for the shuffle operation. Once we
remove the load in instcombine and there is no way to know that memory
is dereferenceable from the previous load in the chain of loads.

I think you're looking for some kind of "dereferenceable assumption" or "phantom load" that we could put in the IR when we remove an actual load/store to preserve dereferencability information. I think this would need to be an intrinsic, not metadata, because you're interested in putting it at some point in the control flow (e.g. above a loop). There are other use cases for this kind of thing as well, but there are costs to keeping this kind of information, and we'd need to consider this carefully.

  -Hal

I think you're looking for some kind of "dereferenceable assumption" or "phantom load" that we could put in the IR when we remove an actual load/store to preserve dereferencability information. I think this would need to be an intrinsic, >not metadata, because you're interested in putting it at some point in the control flow (e.g. above a loop). There are other use cases for this kind of thing as well, but there are costs to keeping this kind of information, and we'd need to >consider this carefully.

Thanks, Hal. Looks good for now.
               Thanks, Dinar.

it’s already implicit that it is dereferenceable for the size of the access (otherwise it would be undefined behavior).

The values we are loading after restoring those loads are not matter
for us and it is just that chain of loads for vectorization that
important for us. And those load would never get rematerialization if
vectorization is not profitable for the shuffle operation. Once we
remove the load in instcombine and there is no way to know that memory
is dereferenceable from the previous load in the chain of loads.

I think you’re looking for some kind of “dereferenceable assumption” or “phantom load” that we could put in the IR when we remove an actual load/store to preserve dereferencability information. I think this would need to be an intrinsic, not metadata, because you’re interested in putting it at some point in the control flow (e.g. above a loop). There are other use cases for this kind of thing as well, but there are costs to keeping this kind of information, and we’d need to consider this carefully.

Here is another use case I encountered in the past. InstCombine sinks a load from a block that executed unconditionally in a loop to a conditional block because the load is only used there. Then we fail to vectorize because we think that that would make the load unconditional (which it originally was).

Adam