Can someone give me some pointers on alias analysis?

I’m trying to fix this bug: https://llvm.org/bugs/show_bug.cgi?id=20049

This is exactly the kind of optimization I need: when it isn’t done, all kinds of other optimization opportunities down the road are never exposed.

I have no idea where to start digging. I assume some interaction between memory dependence analysis and alias analysis is overly conservative. Can someone give me some information on how the two interact? What code is supposed to remove these kinds of loads?

First, it looks like you’re using an older version of LLVM (the load syntax has changed). That’s definitely not recommended since it will greatly limit your choices when encountering an issue like this.

Second, the aliasing problem has to do with the effects of the “allocmemory” callsite on a memory location associated with an earlier alloca. There’s nothing that prevents your allocmemory function from writing to global state. You have to teach the alias analysis that an unescaped noalias pointer can’t alias the global state allocmemory might access. Slightly surprised we don’t get this today, but oh well. Take a look at the isNonEscapingLocalObject predicate in BasicAA. Then look at “getModRefInfo(ImmutableCallSite CS, const MemoryLocation &Loc)”. Double check to make sure this is the one that MDA actually calls.

Third, you might want to take a look at the new inaccessiblememonly attribute. Depending on how you’re modelling your allocation, this might help resolve the aliasing problem in a different way. However, be aware that this is *very* new. In particular, I triggered an optimizer bug by adding the new attribute to your particular example. :frowning:

Fourth, have you considered implementing a simple escape analysis pass? I notice that the store-load forwarding would just fall out once you removed the last allocation. I believe the fixed point would then become a function which returns the constant answer and does nothing else.

Fifth, some pointers on debugging this yourself. GVN (which is the one which does complicated *-load forwarding) calls MDA (MemoryDependenceAnalysis). Using the appropriate -debug-only options will likely get you to the right area. You can also consider using the MemDepPrinter pass to bypass GVN. I don’t know of a way to issue raw AA queries for testing. That would be useful, but I don’t know that we have it.

Hope that helps.

First, it looks like you're using an older version of LLVM (the load
syntax has changed). That's definitely not recommended since it will
greatly limit your choices when encountering an issue like this.

I just reused an old code sample. The problem still exists in recent
versions of LLVM.

Second, the aliasing problem has to do with the effects of the
"allocmemory" callsite on a memory location associated with an earlier
alloca. There's nothing that prevents your allocmemory function from
writing to global state. You have to teach the alias analysis that an
unescaped noalias pointer can't alias the global state allocmemory might
access. Slightly surprised we don't get this today, but oh well. Take a
look at the isNonEscapingLocalObject predicate in BasicAA. Then look at
"getModRefInfo(ImmutableCallSite CS, const MemoryLocation &Loc)". Double
check to make sure this is the one that MDA actually calls.

I don't think this is the problem. When there are only two calls to
allocmemory, the loads are optimized away as expected. The analysis only
seems to get confused with three or more calls.

Third, you might want to take a look at the new inaccessiblememonly
attribute. Depending on how you're modelling your allocation, this might
help resolve the aliasing problem in a different way. However, be aware
that this is *very* new. In particular, I triggered an optimizer bug by
adding the new attribute to your particular example. :frowning:

Fourth, have you considered implementing a simple escape analysis pass? I
notice that the store-load forwarding would just fall out once you removed
the last allocation. I believe the fixed point would then become a
function which returns the constant answer and does nothing else.

Yes, I did. More specifically, there is a pass that tries to recognize
memory-allocation-like calls and optimize them. Right now it has libc and
some overloads of operator new hardcoded in it, but not much more. This
could be greatly improved IMO, to be language agnostic (and to better
support C++ in the process). But really, I do think that relying on this to
get the load eliminated, when noalias already provides this information to
the optimizer, is overkill.

Fifth, some pointers on debugging this yourself. GVN (which is the one
which does complicated *-load forwarding) calls MDA
(MemoryDependenceAnalysis). Using the appropriate -debug-only options will
likely get you to the right area. You can also consider using the
MemDepPrinter pass to bypass GVN. I don't know of a way to issue raw AA
queries for testing. That would be useful, but I don't know that we have
it.

Hope that helps.

Yes, thanks. I'll probably come back with more questions once I've played
with these options a bit. Sorry for the late answer, I was mostly off the
grid for the past two weeks.

After a bit more investigation, it turns out that because %0 is stored into %1 (after a bitcast), %3 may have access to it and clobber it.

After a bit of thought, this is correct in the general case, but something stricter is definitely needed here. Looking at inaccessiblememonly, I’m not sure it is what is needed. What if the memory allocator is defined in the current module?

This leads me to conclude that this is far more tied to the memory allocation pass than I expected at first. Can I ask what you plan to use inaccessiblememonly for? Should the semantics be refined to fit the bill better?

It turns out we already have something along these lines in InstCombine. Take a look at visitAllocSite in InstructionCombining.cpp. It’s rough and could use some refinement, but it’s a good starting place for a generalized escape analysis.

Can you give a bit more context? I’m not sure which of the examples you’re talking about.

At the moment, inaccessiblememonly would require separate compilation of the allocation function.

Well, I didn’t introduce the attribute, so I can’t speak for the original intent. For me, I plan on applying it to some of our out-of-line allocation functions and other helper routines which modify runtime state, but not Java-visible state.

If you have specific suggestions for how to refine the semantics, please make them. Getting the details right is always the hard part. :slight_smile:

You might also consider using a variant of your allocation function which takes a pointer to the global state it needs to modify. Doing this would allow you to use argmemonly to restrict the aliasing while still allowing whole program optimization. I haven’t tried this in practice, but it seems like it would probably work…

After a bit more investigation, it turns out that because %0 is stored
into %1 (after a bitcast), %3 may have access to it and clobber it.

Can you give a bit more context? I'm not sure which of the examples
you're talking about.

Sure. Let's look at this IR on Pastebin.com, which starts with:
define i32 @_D8test01824mainFMZi() { body: %0 = tail call noalias i32* @allo...

Because of the store on line 7, it is assumed that the call on line 8 may
see %0 and even modify the memory it points to. As a result, it is assumed
that the load on line 11 may not be eliminated.

Which actually seems correct in the general case.
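
To make the situation concrete, here is a rough C++ rendering of the pattern as I understand it from the description above. It is only a sketch, not the IR from the Pastebin: `allocmemory` is a hypothetical stand-in built on `malloc`, the name `forwarding_candidate` is made up, and the mapping of variables to %0, %1 and %3 is my assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the allocator in the thread. In the real IR the
// call returns a noalias pointer; here it simply forwards to malloc.
void* allocmemory(std::size_t size) {
    void* p = std::malloc(size);
    assert(p != nullptr && "allocation failed");
    return p;
}

int forwarding_candidate() {
    // Plays the role of %0.
    int* a = static_cast<int*>(allocmemory(sizeof(int)));
    *a = 42;  // the store we would like to see forwarded to the load below

    // Plays the role of %1 (assumed to be a second allocation).
    int** b = static_cast<int**>(allocmemory(sizeof(int*)));
    *b = a;   // %0 is stored into %1, which is treated as a capture of %0

    // Plays the role of %3. Once `a` counts as captured, an opaque call has
    // to be assumed able to reach *a and modify it.
    int* c = static_cast<int*>(allocmemory(sizeof(int)));

    int result = *a;  // the load that does not get eliminated

    std::free(c);
    std::free(b);
    std::free(a);
    return result;
}
```

At the source level the answer is obviously 42. The question is only whether the optimizer can prove that without looking inside the allocator.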

After a bit of thought, this is correct in the general case, but
something stricter is definitely needed here. Looking at
inaccessiblememonly, I'm not sure it is what is needed. What if the
memory allocator is defined in the current module?

At the moment, inaccessiblememonly would require separate compilation of
the allocation function.

This leads me to conclude that this is far more tied to the memory
allocation pass than I expected at first. Can I ask what you plan to use
inaccessiblememonly for? Should the semantics be refined to fit the bill
better?

Well, I didn't introduce the attribute, so I can't speak for the original
intent. For me, I plan on applying it to some of our out-of-line
allocation functions and other helper routines which modify runtime state,
but not Java-visible state.

If you have specific suggestions for how to refine the semantics, please
make them. Getting the details right is always the hard part. :slight_smile:

You might also consider using a variant of your allocation function which
takes a pointer to the global state it needs to modify. Doing this would
allow you to use argmemonly to restrict the aliasing while still allowing
whole program optimization. I haven't tried this in practice, but it seems
like it would probably work...
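
For illustration, a minimal sketch of what such a variant could look like at the source level. Everything here is hypothetical (the `AllocState` struct, the bump-allocation strategy, the two-argument `allocmemory`), and the attribute itself would still have to be attached to the function in the IR. The sketch only shows the shape: all the state the function reads or writes is reachable through its pointer argument.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical allocator state that would otherwise live in globals.
struct AllocState {
    unsigned char* cursor;  // next free byte
    unsigned char* end;     // one past the last usable byte
};

// Variant of the allocation function that receives its state explicitly.
// The only memory it reads or writes is *state, which is the property that
// argmemonly is meant to describe. Alignment is ignored to keep this short.
void* allocmemory(AllocState* state, std::size_t size) {
    std::size_t available = static_cast<std::size_t>(state->end - state->cursor);
    if (size > available) {
        return nullptr;  // out of space in this toy bump allocator
    }
    void* result = state->cursor;
    state->cursor += size;
    return result;
}
```

A caller would own an `AllocState` and pass its address on every call, so no hidden global is involved.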

I do not wish to make suggestions before I understand where this is coming
from. So far, from what I've collected, the use cases are:
- Memory allocation
- Runtime isolation for managed languages

I have some more thought to put into this, but to start, would it be
possible to use this attribute only on functions that are declared but not
defined, and to remove it when merging modules? It doesn't look like it is
necessary to have it when the function may be exposed depending on the way
the software is built.

We can imagine a function defined in the current module that does not modify any global, but calls malloc. Could argmemonly be inferred for it?

I meant inaccessiblememonly instead of argmemonly…
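
As a concrete picture of the kind of function meant here, a hypothetical wrapper (the name `alloc_rounded` and the rounding to 16 bytes are invented for the example). It is defined in the current module, touches no global directly, and its only memory effect comes from the call to malloc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper around malloc. It reads and writes no globals itself;
// whatever state gets modified is modified inside malloc.
// Overflow for sizes close to SIZE_MAX is ignored to keep the sketch short.
void* alloc_rounded(std::size_t size) {
    std::size_t rounded = (size + 15) & ~static_cast<std::size_t>(15);
    return std::malloc(rounded);
}
```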

Can’t the GlobalsAA pass figure that out without requiring the annotation?

I believe it should be able to, yes. (If we had malloc marked as inaccessiblememonly, which we currently don’t.)

This seems like a restatement of what I said in my original response: “You have to teach the alias analysis that an unescaped noalias pointer can’t alias the global state allocmemory might access. Slightly surprised we don’t get this today, but oh well.” Or am I missing something?

This seems semi reasonable. I haven’t thought through the implications, but it might be worth considering.

Er, not sure what you meant here.

Philip

This seems like a restatement of what I said in my original response:
"You have to teach the alias analysis that an unescaped noalias pointer
can't alias the global state allocmemory might access. Slightly surprised
we don't get this today, but oh well. "

Or am I missing something?

No, you are not missing anything. I was simply able to confirm this by
dumping the guts of GVN.

It doesn't look like it is necessary to have it when the function may be
exposed depending on the way the software is built.

Er, not sure what you meant here.

I'm afraid that, depending on the way the software is built, a function
marked as not accessing any visible memory can end up modifying visible
memory after modules are merged. I think this is the kind of thing that may
be exposed by LTO (I may be wrong on that one). It can also be exposed by
languages like D, where people compile by D module, by D package, or the
whole program at once.
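
A small sketch of the hazard I have in mind, written as a single C++ file so it can be run. The split into two modules exists only in the comments, and the names `bytes_allocated` and `allocation_delta` are made up for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// --- Imagine this part lives in a separately compiled runtime module. ---
// Hypothetical bookkeeping that user code cannot see while the runtime is
// compiled on its own.
std::size_t bytes_allocated = 0;

void* allocmemory(std::size_t size) {
    bytes_allocated += size;  // write to state private to the runtime
    return std::malloc(size);
}

// --- Imagine this part lives in the user module. ---
// Once both modules are merged, bytes_allocated is ordinary visible memory
// for this function, so the call has to be treated as writing to it.
std::size_t allocation_delta() {
    std::size_t before = bytes_allocated;
    void* p = allocmemory(32);
    std::size_t after = bytes_allocated;  // must be reloaded after the call
    std::free(p);
    return after - before;
}
```

If the declaration of `allocmemory` kept an attribute saying it touches no visible memory after the merge, an optimizer would be entitled to reuse `before` for `after` and return 0 here.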