Unused malloc/free don't get optimized

Hi, i have some code that allocate some memory, store the pointer to a
variable, read it back and deallocates it, like this:

int %main(int %argc, ubyte** %argv)
{
%c_19 = alloca ubyte*
%malloc_206 = malloc ubyte, uint 10
store ubyte* %malloc_206, ubyte** %c_19
%tmp_207 = load ubyte** %c_19
free ubyte* %tmp_207

ret int 0
}

i expected the optimized to remove everything, but after running it
the code i get is:

int %main(int %argc, ubyte** %argv) {
  %malloc_206 = malloc [10 x ubyte]
  %malloc_206.sub = getelementptr [10 x ubyte]* %malloc_206, int 0, int 0
  free ubyte* %malloc_206.sub
  ret int 0
}

Why didn't he optimized it out? and where did that getelementptr came from?
I have the feeling that i'm missing something.

Hi, i have some code that allocate some memory, store the pointer to a
variable, read it back and deallocates it, like this:

ok

i expected the optimized to remove everything, but after running it
the code i get is:

int %main(int %argc, ubyte** %argv) {
  %malloc_206 = malloc [10 x ubyte]
  %malloc_206.sub = getelementptr [10 x ubyte]* %malloc_206, int 0, int 0
  free ubyte* %malloc_206.sub
  ret int 0
}

Why didn't he optimized it out?

This looks like a simple missed optimization.

and where did that getelementptr came from?

This is just passing a pointer to the first byte of the array into the free instruction. This has the same semantics as freeing the pointer to the array, since they are the same value.

I have the feeling that i'm missing something.

Nope, I don't think so. Please feel free to file an enhancement request for us to catch this case!

-Chris

I've made some other test and it looks like it don't remove even
simple malloc/free couple. Maybe there's something wrong in the way i
use the opt command. How can i tell him to optimize at best, whitout
specifing each pass? I'm not using c/c++ frontend, just the assembler.

Thanks in advance

Hi Nicola,

I've made some other test and it looks like it don't remove even
simple malloc/free couple. Maybe there's something wrong in the way i
use the opt command.

No, there's not. LLVM doesn't provide the transform you want. As Chris
mentioned, if you open a Bugzilla report (http://llvm.org/bugs) and ask
for the feature we are more likely to get it done than if you don't.

How can i tell him to optimize at best, whitout
specifing each pass? I'm not using c/c++ frontend, just the assembler.

The opt tool recently acquired a new option (--std-compile-opts) which
performs the set of optimizations that gccas used to provide. This set
of options is intended for use with code compiled by llvm-gcc. You might
also find it useful. However, the set of passes to run to produce the
best code depends on a lot of factors. The source language is one of
those factors. The program compiled is another. In general there is no
way to come up with the "ultimate" set of passes that produce the best
results for all inputs. However, --std-compile-opts is a close
approximation for C-like languages compiled by llvm-gcc.

Reid.

Reid Spencer wrote:

I've made some other test and it looks like it don't remove even
simple malloc/free couple. Maybe there's something wrong in the way i
use the opt command.

No, there's not. LLVM doesn't provide the transform you want. As Chris
mentioned, if you open a Bugzilla report (http://llvm.org/bugs) and ask
for the feature we are more likely to get it done than if you don't.

That's surprising to me. I thought there was a pass that converts
malloc's that trivially dominate all free's and whose pointer doesn't
escape would be lowered to alloca's -- but I looked and couldn't find one.

Why isn't there one? Because it wouldn't be profitable? Or because the
alias analysis is too complex? Or just because nobody's gotten to it yet?

I would've thought that this pattern would occur rather often. I know
I've written code that strdup's a value, reads/writes it, then frees it.
After inlining the strdup, changing the malloc/free to alloca should be
easy.

That said, you'd have to be careful to ensure that the malloc-free pair
isn't inside of a loop before converting it to alloca. Regardless, I'd
be interested in implementing this in a pass if noone else does.

Nick Lewycky

That's surprising to me. I thought there was a pass that converts
malloc's that trivially dominate all free's and whose pointer doesn't
escape would be lowered to alloca's -- but I looked and couldn't find one.

nope, there isn't one.

Why isn't there one? Because it wouldn't be profitable? Or because the
alias analysis is too complex? Or just because nobody's gotten to it yet?

the last one: "nobody's gotten to it yet" :slight_smile:

I would've thought that this pattern would occur rather often. I know
I've written code that strdup's a value, reads/writes it, then frees it.
After inlining the strdup, changing the malloc/free to alloca should be
easy.

Perhaps. I haven't see any C code where this occurs frequently, but the pypy people would like to have this.

That said, you'd have to be careful to ensure that the malloc-free pair
isn't inside of a loop before converting it to alloca. Regardless, I'd
be interested in implementing this in a pass if noone else does.

Right, also be careful not to turn 'large' allocations into large stack objects, etc.

-Chris

Hi,

That’s surprising to me. I thought there was a pass that converts
malloc’s that trivially dominate all free’s and whose pointer doesn’t
escape would be lowered to alloca’s – but I looked and couldn’t find one.

I implemented a pass like this a while back. It’s fairly sophisticated and relies on DSA to generate a call graph and do some other things. At one point I gave the pass to the PyPy folks; they reported that it worked except for a case that DSA couldn’t handle – I think it was variable length arrays. DSA may be able to handle that case now. I’m happy to give you my code if you want to use it and/or work it up for submission into LLVM. If you don’t want the dependence on DSA, you could probably easily modify it to make it simpler and more conservative.

Rob

Robert L. Bocchino Jr.

Ph.D. Student

University of Illinois, Urbana-Champaign

Robert L. Bocchino Jr. wrote:

Hi,

That's surprising to me. I thought there was a pass that converts
malloc's that trivially dominate all free's and whose pointer doesn't
escape would be lowered to alloca's -- but I looked and couldn't find
one.

I implemented a pass like this a while back. It's fairly sophisticated
and relies on DSA to generate a call graph and do some other things. At
one point I gave the pass to the PyPy folks; they reported that it
worked except for a case that DSA couldn't handle -- I think it was
variable length arrays. DSA may be able to handle that case now. I'm
happy to give you my code if you want to use it and/or work it up for
submission into LLVM. If you don't want the dependence on DSA, you
could probably easily modify it to make it simpler and more conservative.

Sounds good! I'd like to see the code, even though I can't promise that
I'll do anything with it in the near term.

I'd rather not depend on DSA in particular. If it could depend on
AliasAnalysis then it would grow stronger or weaker depending on what
aliasing analysis was loaded at the time. Are there any special features
of DSA that you're using?

Nick Lewycky

Sounds good! I’d like to see the code, even though I can’t promise that
I’ll do anything with it in the near term.

OK. I’ve sent you the code in a separate email.

I’d rather not depend on DSA in particular. If it could depend on
AliasAnalysis then it would grow stronger or weaker depending on what
aliasing analysis was loaded at the time. Are there any special features
of DSA that you’re using?

I glanced at the code again, and it looks like I use DSA to (1) construct the call graph and (2) identify things that would be unsafe to put on the stack, such as arrays, cyclic data structures, and allocations with escaping references. Right now these parts are pretty heavily dependent on DSA – e.g., they make explicit use of the various DSA graphs. I’m sure you could modify the code to use AliasAnalysis instead of DSA, but I’m not sure what would be involved in that.

Rob

Robert L. Bocchino Jr.

Ph.D. Student

University of Illinois, Urbana-Champaign

Unfortunately, I don't think the AliasAnalysis interface gives a way to check whether allocations escape a function. In DSA, you can do this because there is an explicit points-to graph: you can find all objects escaping "upwards" from a function by traversing the graph, starting at globals, formal arguments, and return values.

--Vikram

Vikram S. Adve wrote:

I glanced at the code again, and it looks like I use DSA to (1)
construct the call graph and (2) identify things that would be
unsafe to put on the stack, such as arrays, cyclic data structures,
and allocations with escaping references. Right now these parts
are pretty heavily dependent on DSA -- e.g., they make explicit use
of the various DSA graphs. I'm sure you could modify the code to
use AliasAnalysis instead of DSA, but I'm not sure what would be
involved in that.

Unfortunately, I don't think the AliasAnalysis interface gives a way
to check whether allocations escape a function. In DSA, you can do
this because there is an explicit points-to graph: you can find all
objects escaping "upwards" from a function by traversing the graph,
starting at globals, formal arguments, and return values.

I think you could use "getModRefInfo(CallSite, Value *, unsigned)".

"getModRefInfo (for call sites) - Return whether information about
whether a particular call site modifies or reads the memory specified by
the pointer."

If I read that right, it can only return no modify when the called
function and its callees never modify the pointer; BasicAA will check
whether the target is a pure function.

Whether the implementation is efficient or accurate is another matter.

Nick Lewycky

Vikram S. Adve wrote:

I glanced at the code again, and it looks like I use DSA to (1)
construct the call graph and (2) identify things that would be
unsafe to put on the stack, such as arrays, cyclic data structures,
and allocations with escaping references. Right now these parts
are pretty heavily dependent on DSA -- e.g., they make explicit use
of the various DSA graphs. I'm sure you could modify the code to
use AliasAnalysis instead of DSA, but I'm not sure what would be
involved in that.

Unfortunately, I don't think the AliasAnalysis interface gives a way
to check whether allocations escape a function. In DSA, you can do
this because there is an explicit points-to graph: you can find all
objects escaping "upwards" from a function by traversing the graph,
starting at globals, formal arguments, and return values.

I think you could use "getModRefInfo(CallSite, Value *, unsigned)".

I may be missing your point, but that doesn't tell you if an object allocated *within* a function escapes the function.

"getModRefInfo (for call sites) - Return whether information about
whether a particular call site modifies or reads the memory specified by
the pointer."

If I read that right, it can only return no modify when the called
function and its callees never modify the pointer; BasicAA will check
whether the target is a pure function.

Whether the implementation is efficient or accurate is another matter.

Nick Lewycky

--Vikram

Vikram S. Adve wrote:

I think you could use "getModRefInfo(CallSite, Value *, unsigned)".

I may be missing your point, but that doesn't tell you if an object
allocated *within* a function escapes the function.

No, I missed your point. I thought you could look at the allocations
within a function then use that method on all calls involving that
alloca as an argument, but that doesn't do what I wanted.

I have some hope for getModRefBehavior which returns a container of
PointerAccessInfo, but I haven't yet decided if that actually does what
I mean either.

Worst case, the optz'n just falls back to degraded behaviour without DSA.

Nick Lewycky