mem2reg for non-entry blocks?

Sorry if this has been discussed before, but I would appreciate any pointers.

I am trying to understand why mem2reg only looks at allocas in the entry block, and not at every alloca in a function. One case that could make this unsafe is when allocas are used to build local data structures such as a linked list, but the existing conditions in IsAllocaPromotable (i.e. the alloca pointer cannot escape, be stored, be cast, etc.) should guard against that regardless of the position of the alloca, right? Is there a reason for this restriction?

Thanks
Vinod

Hi Vinod,

An alloca outside of the entry block might be inside a loop, in which case the
semantics are that it would allocate more stack space on every loop iteration.
I think some of the optimizers that run later try to move allocas into the
entry block if possible, but in general it is simpler to have the front-end
just put them there in the first place.

Mem2reg is already changing that semantic, though. If I use an "alloca
i32" in the entry block, then I am saying I want 4 bytes of stack space,
but mem2reg may replace that with registers.

Hi Justin,

    an alloca outside of the entry block might be inside a loop, in which case the
    semantics are that it would allocate more stack space on every loop iteration.
    I think some of the optimizers that run later try to move allocas into the entry
    block if possible, but in general it is simpler to have the front-end just put
    them there in the first place.

    Mem2reg is already changing that semantic, though. If I use an "alloca i32" in
    the entry block, then I am saying I want 4 bytes of stack space, but mem2reg may
    replace that with registers.

The problem isn't with mem2reg changing the amount of stack space used; it's
that those semantics get in the way of the mem2reg transform. For example,
since you get new stack space each time round the loop, a value written to the
alloca can't be retrieved by reading it back from the alloca on the next
iteration, because it isn't the same memory. This is quite different from how
things work if the alloca is in the entry block.

Ciao, Duncan.

I think it might be useful to separate what is possible versus what is done for practicality.

I believe that there is no technical reason for mem2reg to restrict itself to allocas within the entry block. It should be possible to promote allocas not within the entry block into SSA registers provided that they meet certain restrictions. Off the top of my head, those restrictions are ensuring that the alloca is not in a loop and ensuring that the returned pointer doesn't escape the function. I suspect that allocas that don't dominate all of the basic blocks in the function might require special handling.

My guess is that mem2reg limits itself to allocas within the entry block because that is where nearly all allocas that can be converted reside, and there's little benefit (at least for C/C++) in trying to promote allocas that aren't in the entry block. Mem2reg was originally built so that front-ends (namely the original llvm-gcc) wouldn't have to do SSA construction; since llvm-gcc put its allocas in the entry block, there probably wasn't a need to promote other allocas.

If you want to promote allocas outside the entry block, you can probably implement an algorithm to do it. I think mem2reg doesn't do it because it hasn't been worth the trouble.

-- John T.

My understanding of the current implementation, and my (limited) knowledge of LLVM IR, lead me to believe that if the current mem2reg were modified so that it is not restricted to allocas in the entry block, it would still work correctly and possibly catch more cases. I think this is because the function IsAllocaPromotable(…) only accepts direct loads and stores through the pointer returned by the alloca; if the pointer is used in any other way, the loads and stores of that alloca are not promoted. So after scalar promotion of such loads and stores, the alloca would be dead.

Can someone confirm that, or point out a case where it wouldn't be correct?

Thanks
Vinod