[LICM][MemorySSA] Converting LICM pass to use MemorySSA to avoid AliasSet collapse issue

Hi All,

I’m looking into converting LICM to use MemorySSA instead of AliasSets to determine when it is safe to hoist/sink/promote loads and stores to get around the issue of alias set collapse (see discussion [1]). I have a prototype implementation, but have run into two issues that I could use input from the designers of MemorySSA to resolve:

  1. Is MemorySSA intended to be maintained across passes? More specifically how is it intended to interact with the Pass Manager? In my current prototype I’m just building it at the start of LICM.

  2. What is the intended method for dealing with changes to the function that invalidate the MemorySSA form? I see that there is a way to remove nodes from the MemorySSA graph, but for other transformations (e.g. hoisting a load) am I supposed to just update the MemorySSA graph myself?

[1] http://lists.llvm.org/pipermail/llvm-dev/2015-April/084881.html

Hi All,

I’m looking into converting LICM to use MemorySSA instead of AliasSets to
determine when it is safe to hoist/sink/promote loads and stores to get
around the issue of alias set collapse (see discussion [1]).

Thank you

I have a prototype implementation, but have run into two issues that I
could use input from the designers of MemorySSA to resolve:

1) Is MemorySSA intended to be maintained across passes? More
specifically how is it intended to interact with the Pass Manager? In my
current prototype I’m just building it at the start of LICM.

I am working on it, it will be preserved/destroyed like any analysis pass.

2) What is the intended method for dealing with changes to the
function that invalidate the MemorySSA form? I see that there is a way to
remove nodes from the MemorySSA graph, but for other transformations (e.g.
hoisting a load) am I supposed to just update the MemorySSA graph myself?

For hoisting loads, i have an API that will work abotu to be under review
over here:
http://reviews.llvm.org/D18090
a load hoist will look something like this to update:

  // We also hoist operands of loads using this function, so check to see if
  // this is really a memory access before we try to update MemorySSA for
it.
  MemoryAccess *HoistCandAccess = MSSA->getMemoryAccess(HoistCand);
  if (HoistCandAccess) {
    MemoryUseOrDef *MUD = cast<MemoryUseOrDef>(HoistCandAccess);
    // What is happening here is that we are creating the hoisted access
    // and destroying the old accesses.
    MSSA->createMemoryAccess(HoistedInst, MUD->getDefiningAccess(), BB,
                             MemorySSA::End);
    MSSA->removeMemoryAccess(HoistCandAccess);
    MSSA->removeMemoryAccess(MSSA->getMemoryAccess(ElseInst));
  }

(this is "create a new access with this memorydef as it's definition, kill
these two old loads").

A store sink is slightly harder, but it will be similar.

Comments welcome!

Hi Daniel,

Thanks for the info. I’ve started looking into converting EarlyCSE to use MemorySSA first since 1) I don’t think it needs any additional MemorySSA update API and 2) the particular case I’m looking at needs EarlyCSE to catch more load cases before LICM to be profitable.

I have a prototype working, but have run into two issues:

  1. readonly calls are treated as clobbers by MemorySSA which leads to extra walking of MemoryDefs to not regress some EarlyCSE test cases. This isn’t a huge deal, I’m just wondering if it is intentional or something that just hasn’t been gotten to yet.

  2. There seems to be a bug in the CachingMemorySSAWalker invalidation causing it to return MemoryAccess nodes that have been removed. In the case I’m seeing, a call node is removed from MemorySSA which causes CachingMemorySSAWalker::invalidateInfo() to clear the CachedUpwardsClobberingCall map. However, this same call node is present as a value in the CachedUpwardsClobberingAccess map, so when a later query is made with the load whose immediate def used to be the call, the removed call node is returned. The simple fix below resolves the issue, but may be too big of a hammer?

@@ -802,17 +836,8 @@ void CachingMemorySSAWalker::invalidateInfo(MemoryAccess *MA) {

doCacheRemove(MA, Q, Q.StartingLoc);

return;

}

  • // If it is not a use, the best we can do right now is destroy the cache.

  • bool IsCall = false;

Hi Daniel,

Thanks for the info. I’ve started looking into converting EarlyCSE to use
MemorySSA first since 1) I don’t think it needs any additional MemorySSA
update API and 2) the particular case I’m looking at needs EarlyCSE to
catch more load cases before LICM to be profitable.

I have a prototype working, but have run into two issues:

1) readonly calls are treated as clobbers by MemorySSA which leads
to extra walking of MemoryDefs to not regress some EarlyCSE test cases.
This isn’t a huge deal, I’m just wondering if it is intentional or
something that just hasn’t been gotten to yet.

George is working on the optimizations, of which this is one.
I think this is one of the ones his current patch (under review) addresses.

2) There seems to be a bug in the CachingMemorySSAWalker
invalidation causing it to return MemoryAccess nodes that have been
removed. In the case I’m seeing, a call node is removed from MemorySSA
which causes CachingMemorySSAWalker::invalidateInfo() to clear the
CachedUpwardsClobberingCall map. However, this same call node is present
as a value in the CachedUpwardsClobberingAccess map,

Unless i'm missing something, this should not have happened, and we should
assert they are not being added to the cache.

The truth is the caching parts are complicated and ugly. It was meant to be
a pretty simple cache, but it's known to be inefficient (memory wise) and
it's on the list of things to clean up and make sane.

Do you have a testcase where this happens?

A quick glance says we check whether it's a call in all the right places,
which means there must be a place we are not *setting* isCall properly.

  1. Sounds good. This isn’t holding me up so I’ll just try to keep an eye out for these changes.

  2. I’ve attached an example IR file and debug log of where the caching is going bad. It depends on my changes to EarlyCSE, but hopefully it is clear from the debug output what is going on. Let me know if there is a better way to get this repro case to you. Also, I’ll be on IRC for the next couple of hours if you would like to have a quicker discussion.

memssa-remove-cache-bug.ll (263 Bytes)

memssa-remove-cache-bug.log (2.66 KB)

Hi!

readonly calls are treated as clobbers by MemorySSA which leads to extra walking of MemoryDefs to not regress some EarlyCSE test cases

Can you share a testcase for this too, please? Specifically, the first test in test/Transforms/Util/MemorySSA/function-mem-attrs.ll shows that we should treat readonly calls + calls to readonly functions as MemoryUses, so if we’re thinking they’re clobbers somehow, that sounds like a bug…

Thank you!
George

Oh, crap.
I wasn’t thinking hard enough about the case you described.

Okay. So your patch is right as the conservative thing to do.
The real problem here is that we don’t know what cache entries point to other cache entries.

We do in some special cases (If the use list contains only memoryuses, we know all the cache entries that need invalidation), but not in general.

Given most queries are about loads and not stores, and loads don’t even need to be cached after memoryssa is built anyway (they will already directly point to the nearest clobbering definition), i think we should just apply your patch.

Okay, do you think this case needs a unittest? I think I can construct one by comparing the results from getClobberingMemoryAccess before and after a call to removeMemoryAccess to make sure they’re different, but I don’t know how much of a pain it will be to construct the test IR programmatically.

Okay, do you think this case needs a unittest?

I would think the right answer is to have a verifyRemoved(X), and call it
with expensive checks on (from invalidateInfo)

Have it verify that, after invalidation, no removed memoryAccesses appear
in the cache.

This would be a quite expensive check with the current caching scheme, but
probably helpful for debugging.

I think I can construct one by comparing the results from
getClobberingMemoryAccess before and after a call to removeMemoryAccess to
make sure they’re different, but I don’t know how much of a pain it will be
to construct the test IR programmatically.

Constructing the testIR programmatically is pretty easy, but i don't know
that a unittest without access to the internals will be able to verify the
cache state.

Hi George,

After digging a little deeper, it appears that readonly calls showing up as MemoryDefs is only happening on an EarlyCSE test that is using the new pass manager (test/Transforms/EarlyCSE/basic.ll test5 if you’re curious), so I suspect it is an issue with the new pass manager setup code for either MemorySSA, my changes to EarlyCSE, the test run command line or something else not related to the MemorySSA code proper.

It sounds like basicaa is not getting added somehow.

Yes, that was it, the problem goes away when I add -aa-pipeline=basic-aa to the test run line. I don’t know enough about the new pass manager yet to say whether that argument should need to be passed or not, but I’ll just add this to my list of pass manager interaction issues to get resolved with this change.

Thanks,