Expose aliasing information in getModRefInfo (or viceversa?)

Hi,

This came up in https://reviews.llvm.org/D38569, and I’d like some input on what’s the best way to get alias and mod-ref info without having two alias calls.

A couple of ideas:
(a) Extend the getModRefInfo interface (+getModRefBehavior, +gerArgModRefInfo) to return a pair {ModRefInfo, AliasResult}.

The AliasResult can be optional based on an argument
e.g.:
struct MRI_AR { ModRefInfo MRI, AliasResult AR };
MRI_AR getModRefInfoAlias (LoadInst *LI, MemoryLocation Loc, bool SetAliasResultField);

Add wrapper APIs to preserve current calls.
e.g.:
ModRefInfo getModRefInfo (LoadInst *LI, MemoryLocation Loc) {
return getModRefInfoAlias (LI, Loc, false).MRI;
}

(b) From talking offline with George, introducing a MRI_MustMod in ModRefInfo.

Open question: How to handle callsites.

In terms of whether this is worth doing, as a preliminary timing test I timed the llvm bootstrap build with 1 vs 2 alias calls in D38569: instructionClobbersQuery:296, and got the following:

2 alias calls:
real 62m52.627s
user 2769m46.964s
sys 17m48.072s
1 alias call:
real 62m56.659s
user 2766m40.452s
sys 17m46.312s

Thoughts?

Thanks,
Alina

FWIW: Bootstrap is probably not a good test of this, there are bugs filed where we end up with tons of loads and stores to test against each other. That’s actually fairly rare in bootstrap, as you can see.
Let me get you some test cases.

My guess is that we should go with mustmod.

As for callsites, adding mustmod works for call, memloc and call, call testing.

FWIW: Bootstrap is probably not a good test of this, there are bugs filed
where we end up with tons of loads and stores to test against each other.
That's actually fairly rare in bootstrap, as you can see.
Let me get you some test cases.

SG, thanks!

I agree. -Hal

FWIW: Bootstrap is probably not a good test of this, there are bugs filed
where we end up with tons of loads and stores to test against each other.
That's actually fairly rare in bootstrap, as you can see.
Let me get you some test cases.

My guess is that we should go with mustmod.

I agree.

Ok, will give this approach a shot.

FWIW: Bootstrap is probably not a good test of this, there are bugs
filed where we end up with tons of loads and stores to test against each
other. That's actually fairly rare in bootstrap, as you can see.
Let me get you some test cases.

SG, thanks!

I ran a few quick timings of "opt -memoryssa" for gvn_hoist.small.bc in
PR28670/PR28832, and a larger version extracted then.

Reporting:
mssalimit / single call getModRef / call getModRef followed by
alias().

Smaller test hits the case with 2 alias calls 1282 times. Timings, average
over 5-10 runs (s):
100 / 8.99 / 8.87
200 / 9.24 / 9.113
500 / 48.228 / 48.453

Larger case hits it 1872 times. Timings, average over 5 runs (s):
100 / 23.575 / 23.962
200 / 23.874 / 23.848

My guess is that we should go with mustmod.

Yes, this is odd.

On my clang.bc

Without:
2.2967 ( 53.8%) 0.0242 ( 26.4%) 2.3210 ( 53.2%) 2.3227 ( 53.2%) Memory SSA

2.3364 ( 53.7%) 0.0246 ( 25.7%) 2.3610 ( 53.1%) 2.3636 ( 53.1%) Memory SSA

2.3353 ( 54.0%) 0.0258 ( 27.0%) 2.3611 ( 53.4%) 2.3632 ( 53.3%) Memory SSA

With two getModRefInfo calls:
3.0302 ( 58.8%) 0.0328 ( 29.9%) 3.0630 ( 58.2%) 3.0858 ( 58.2%) Memory SSA

3.0097 ( 58.9%) 0.0325 ( 30.0%) 3.0422 ( 58.3%) 3.0590 ( 58.3%) Memory SSA
3.0486 ( 58.8%) 0.0317 ( 29.4%) 3.0804 ( 58.2%) 3.1331 ( 58.3%) Memory SSA

with alias followed by getModRefInfo
2.2487 ( 52.9%) 0.0259 ( 27.1%) 2.2746 ( 52.4%) 2.2820 ( 52.4%) Memory SSA

etc

Not entirely sure what is going on.

But if alias is free, problem solved!

I should have clarified, by 2 calls meant getModRef followed by alias, not two getModRef calls.
The timings you got seem to concur, but I’m quite skeptical about the alias call being free…

I’m trying to understand what is the result we’d seek in the example in D38569 (pasting here for quick access)

double f(double a)
 {
   double b;
   double c,d;
   double (*fp) (double) __attribute__ ((const));

   /* Partially redundant call */
   if (a < 2.0)
     {
       fp = sin;
       c = fp (a);
     }
   else
     {
       c = 1.0;
       fp = cos;
     }
   d = fp (a);
   return d + c;
 }

the fp call will get pushed into the else branch.
In the equivalent loop, we could promote it (PRE could promote it too if you let it speculate)
If we determine it is a must-alias of a single thing, we could do the same.
If we determine it’s a must-alias of a set of things, we could do the same
If we determine it’s a must-alias of the live on entry def, etc

If I understand correctly, the necessary info for fp in this example is:
isMustAlias (CallSite argument a, Value a) && getModRefBehavior(CallSite)==onlyReadsMemory.
So adding a MRI_MustMod is insufficient?

Are we instead looking to set a MRI_Must bit, disjunct of MRI_Mod, and test for MRI_Ref&MRI_Must or MRI_Mod&MRI_Must?
In getModRefInfo(CS, Loc), the MRI_Must bit would then be set if doesAccessArgPointees and ArgAlias == MustAlias for all Args, which seems correct.

I may be missing something here, I don’t want to dive into the wrong implementation.

I'm trying to understand what is the result we'd seek in the example
in D38569 (pasting here for quick access)

double f(double a)
{
   double b;
   double c,d;
   double (*fp) (double) __attribute__ ((const));

   /* Partially redundant call */
   if (a < 2.0)
     {
       fp = sin;
       c = fp (a);
     }
   else
     {
       c = 1.0;
       fp = cos;
     }
   d = fp (a);
   return d + c;
}

the fp call will get pushed into the else branch.
In the equivalent loop, we could promote it (PRE could promote it too if
you let it speculate)
If we determine it is a must-alias of a single thing, we could do the same.
If we determine it's a must-alias of a set of things, we could do the same
If we determine it's a must-alias of the live on entry def, etc

If I understand correctly, the necessary info for fp in this example is:
isMustAlias (CallSite argument a, Value a) &&
getModRefBehavior(CallSite)==onlyReadsMemory.
So adding a MRI_MustMod is insufficient?

Sigh
I should have taken the time to give a better example.
The must-alias part is irrelevant to an example (it only requires
read-onlyness)

You said "LICM doesn't move calls, so we'd never really care about
must-alias for promotion". I was just pointing out other things move calls
any may want to know.

If you want an example where the must-alias part would matter:

*a = something
foo(a)
b = *a

If foo mustalias a (and only a) not only can you move foo with a, you can
actually clone foo here, change it to be pass-by-value, and promote the
argument inside of it (if you wanted to).

So you can use this info to, for example, do interprocedural promotion.

Are we instead looking to set a MRI_Must bit, disjunct of MRI_Mod, and
test for MRI_Ref&MRI_Must or MRI_Mod&MRI_Must?

Yes.

In getModRefInfo(CS, Loc), the MRI_Must bit would then be set if

doesAccessArgPointees and ArgAlias == MustAlias for all Args, which seems
correct.

alias == MustAlias for Loc, not for all args.
(IE It it returns a singular result about Loc, not a result about all args)

To get the set answer for all args, we'd have to query further.

Sigh
I should have taken the time to give a better example.
The must-alias part is irrelevant to an example (it only requires
read-onlyness)

You said "LICM doesn't move calls, so we'd never really care about
must-alias for promotion". I was just pointing out other things move calls
any may want to know.

If you want an example where the must-alias part would matter:

*a = something
foo(a)
b = *a

If foo mustalias a (and only a) not only can you move foo with a, you can
actually clone foo here, change it to be pass-by-value, and promote the
argument inside of it (if you wanted to).

So you can use this info to, for example, do interprocedural promotion.

Are we instead looking to set a MRI_Must bit, disjunct of MRI_Mod, and
test for MRI_Ref&MRI_Must or MRI_Mod&MRI_Must?

Yes.

I didn't mean to pick on the example, sorry if that's how it came through.

Since the consensus is to expose the Must info in ModRefInfo, I'm trying to
figure out how to add it in a way that makes sense to me.
The way I see ModRefInfo is designed right now is to lower the lattice to
NoModRef as fast as possible (start with ModRef as top, get to NoModRef as
bottom). The implementation is based on having either Mod or Ref and
masking out everything else.
Introducing a Must bit, means setting it occasionally (since May is
conservative) and then preserving it, so the opposite: start lattice at
bottom, set to top.

What I was trying, that *somewhat* made sense:
enum ModRefInfo {
  MRI_NoModRef = 0,
  MRI_Ref = 1,
  MRI_Mod = 2,
  MRI_ModRef = MRI_Ref | MRI_Mod,
  MRI_Must = 4,
  MRI_MustRef = MRI_Ref | MRI_Must,
  MRI_MustMod = MRI_Mod | MRI_Must,
  MRI_MustModRef = MRI_ModRef | MRI_Must
};
// + shift values in FunctionModRefLocation to 8, 16, 32.

Recursive masking of MRI_Ref/MRI_Mod would get replaced by
MRI_MustRef/MRI_MustMod.
But the top of the lattice is still MRI_ModRef.
While the implementation details *may* be ok to resolve, there are calls
checking for equality to MRI_Ref or MRI_Mod (not &), so adding the
occasional Must bit seems bad.
So I guess my question is, what's the right approach here? I feel like I'm
not on the right path.

In getModRefInfo(CS, Loc), the MRI_Must bit would then be set if

doesAccessArgPointees and ArgAlias == MustAlias for all Args, which seems
correct.

alias == MustAlias for Loc, not for all args.
(IE It it returns a singular result about Loc, not a result about all args)

To get the set answer for all args, we'd have to query further.

Yes, that's what I meant. In getModRefInfo(CS, Loc) there is a loop over
all args, it checks alias() for each one.

I don’t see this as a major problem. Please feel free to fix these places by replacing the equality checks with mask checks. -Hal

Sigh

I should have taken the time to give a better example.
The must-alias part is irrelevant to an example (it only requires
read-onlyness)

You said "LICM doesn't move calls, so we'd never really care about
must-alias for promotion". I was just pointing out other things move calls
any may want to know.

If you want an example where the must-alias part would matter:

*a = something
foo(a)
b = *a

If foo mustalias a (and only a) not only can you move foo with a, you can
actually clone foo here, change it to be pass-by-value, and promote the
argument inside of it (if you wanted to).

So you can use this info to, for example, do interprocedural promotion.

Are we instead looking to set a MRI_Must bit, disjunct of MRI_Mod, and
test for MRI_Ref&MRI_Must or MRI_Mod&MRI_Must?

Yes.

I didn't mean to pick on the example, sorry if that's how it came through.

Since the consensus is to expose the Must info in ModRefInfo, I'm trying
to figure out how to add it in a way that makes sense to me.
The way I see ModRefInfo is designed right now is to lower the lattice to
NoModRef as fast as possible (start with ModRef as top, get to NoModRef as
bottom). The implementation is based on having either Mod or Ref and
masking out everything else.
Introducing a Must bit, means setting it occasionally (since May is
conservative) and then preserving it, so the opposite: start lattice at
bottom, set to top.

What I was trying, that *somewhat* made sense:
enum ModRefInfo {
  MRI_NoModRef = 0,
  MRI_Ref = 1,
  MRI_Mod = 2,
  MRI_ModRef = MRI_Ref | MRI_Mod,
  MRI_Must = 4,
  MRI_MustRef = MRI_Ref | MRI_Must,
  MRI_MustMod = MRI_Mod | MRI_Must,
  MRI_MustModRef = MRI_ModRef | MRI_Must
};
// + shift values in FunctionModRefLocation to 8, 16, 32.

Recursive masking of MRI_Ref/MRI_Mod would get replaced by
MRI_MustRef/MRI_MustMod.
But the top of the lattice is still MRI_ModRef.
While the implementation details *may* be ok to resolve, there are calls
checking for equality to MRI_Ref or MRI_Mod (not &), so adding the
occasional Must bit seems bad.

I don't see this as a major problem. Please feel free to fix these places
by replacing the equality checks with mask checks.

Ok, thank you!

Re-opening this discussion, to get some feedback.

Adding alias info is under review in https://reviews.llvm.org/D38862.

An issue that came up, that was bugging me before is how to reason of what is top/bottom of the lattice, and what is the default to test against.
So talking offline is Sanjoy, we reached a slightly different conclusion which makes more sense to me.

Current patch has:

enum ModRefInfo {
MRI_NoModRef = 0,
MRI_Ref = 1,
MRI_Mod = 2,
MRI_ModRef = MRI_Ref | MRI_Mod,
MRI_Must = 4,
MRI_MustRef = MRI_Ref | MRI_Must,
MRI_MustMod = MRI_Mod | MRI_Must,
MRI_MustModRef = MRI_ModRef | MRI_Must
};

Proposed change:

enum ModRefInfo {

MRI_Must = 0,
MRI_MustRef = 1,
MRI_MustMod = 2,
MRI_MustModRef = MRI_MustRef | MRI_MustMod
MRI_NoModRef = 4,
MRI_Ref = MRI_NoModRef | MRI_MustRef , /* Same semantics as right now, i.e. MayRef /
MRI_Mod = MRI_NoModRef | MRI_MustMod , /
Same semantics as right now, i.e. MayMod /
MRI_ModRef = MRI_Ref | MRI_Mod, /
Same semantics as right now, i.e. May Ref or Mod */

};

With this change:

  • the same approach of “set a bit to 0 when additional info is available” will apply to the Must bit, as it does to Ref and Mod.

  • we could keep the same checks with MRI_NoModRef

  • MRI_ModRef remains the most conservative answer (top).

  • finding MRI_Must gives more info than MRI_NoModRef, so it makes sense to be bottom.

  • MRI_NoModRef means “no mod or ref, and no must alias”.

The only obvious change I see right now will be to to add " | MRI_NoModRef", essentially setting the default to “not must alias”.
For GlobalsModRef, we can also always set MRI_NoModRef bit.

I may be missing details here, happy to elaborate.

Happy Thanksgiving!

Alina

Hi Alina,

My only concern with that design is that it seems that you can go from MustModRef into Ref or Mod, and that is not true.

Assuming my understanding of what ModRef & friends mean is correct, this is the lattice (where green are the official names, and black are my comments):

AFAIU, MustModRef means that an operation must read and write to the given location. Moreover, it must alias with that allocation.

Therefore, we cannot go from MustModRef into MayRef, because MayRef implies there’s no write; there’s at most a read.

What confused me first is that Mod has 2 “mays”: may read, and if it does it may be to the given location.

While MustMod has 2 “musts”: must read, and it must read exactly from the given location.

Your lattice doesn’t have the intermediate values (1 may + 1 must, like MustModMayAlias), but that would increase significantly the size of the lattice. I don’t know which clients would benefit of that precision increase (if any) – didn’t think about that.

Nuno

In your new proposal, doing & on the result of getModRef() may yield unexpected results.
Agreed. I made the change I proposed locally, and, while it simplifies some cases, it makes other bit-wise operations look unintuitive.

Maybe we should just hide all that in inline functions or something and make it an enum class
Noted, and looking into this option. Hoping a couple of static functions (e.g. mods(), refs(), isMust(), addMust()) will be more intuitive than the bit-wise ops.
Should also make it easier to understand and prove.

Alina

> In your new proposal, doing & on the result of getModRef() may yield
unexpected results.
Agreed. I made the change I proposed locally, and, while it
simplifies some cases, it makes other bit-wise operations look unintuitive.

> Maybe we should just hide all that in inline functions or something and
make it an enum class
Noted, and looking into this option. Hoping a couple of static functions
(e.g. mods(), refs(), isMust(), addMust()) will be more intuitive than the
bit-wise ops.
Should also make it easier to understand and prove.

All,

I put D40749 <https://reviews.llvm.org/D40749&gt; up for review. It's only a
NFC and adding the Must bit can be rebased on top of it.

It does not do full replacement of bit-wise operations across all AAs, but
I'm hoping to get some feedback on the approach, and I can update the patch
to add the changes for the other AAs.

Thanks,
Alina

> In your new proposal, doing & on the result of getModRef() may yield
unexpected results.
Agreed. I made the change I proposed locally, and, while it
simplifies some cases, it makes other bit-wise operations look unintuitive.

> Maybe we should just hide all that in inline functions or something
and make it an enum class
Noted, and looking into this option. Hoping a couple of static functions
(e.g. mods(), refs(), isMust(), addMust()) will be more intuitive than the
bit-wise ops.
Should also make it easier to understand and prove.

All,

I put D40749 <https://reviews.llvm.org/D40749&gt; up for review. It's only a
NFC and adding the Must bit can be rebased on top of it.

It does not do full replacement of bit-wise operations across all AAs, but
I'm hoping to get some feedback on the approach, and I can update the patch
to add the changes for the other AAs.

Thanks,
Alina

Following D40933 <https://reviews.llvm.org/D40933&gt;, ModRefInfo is an enum
class.
Since this change is potentially disruptive, I will wait pushing the patch
adding the Must bit until this change is sure to stick.