Removing dead code

Dear guys,

    I am working on a register allocator for LLVM, and I realized that,
after I perform register allocation, there are many move instructions that
are dead code and can safely be removed. It is easy for the RA algorithm
to remove these instructions. It seems to me that the only instructions
with dead definitions that I should not remove are the calls. Is that true?
I would like to know whether code like the one below is safe, that is, besides
call instructions, are there other instructions that must stay in the code
even if their definitions are dead?

MachineInstr * mi = iter;
opCode = // get the opcode of mi
if(!mi.isCall(opCode)) {
    mbb.remove(iter);
}

Thank you,

Fernando

You can't do that unless you can prove the instructions don't have side effects, which you can't. Higher-level passes will remove dead code. Are you seeing a case where dead code is making it down to the codegen level?

-Chris
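
To make the side-effect concern concrete, here is a minimal sketch on a toy instruction model. None of these names (`Kind`, `Instr`, `removableIfNotCall`, `removableIfSideEffectFree`) come from LLVM; they are hypothetical and only illustrate why "not a call" is a weaker test than "has no side effects". Which instruction kinds count as pure is an assumption of the toy model, not a statement about any real target.

```cpp
#include <cassert>

// Toy stand-in for a machine instruction; these names are NOT LLVM's.
enum class Kind { Copy, Add, Div, Store, Call };

struct Instr {
    Kind kind;
    bool defIsDead;  // true when nothing later reads the defined register
};

// The test from the original question: any dead definition that is
// not a call is considered removable.
bool removableIfNotCall(const Instr &i) {
    return i.defIsDead && i.kind != Kind::Call;
}

// A more conservative test: remove only kinds that, in this toy model,
// do nothing except write their result register.
bool removableIfSideEffectFree(const Instr &i) {
    if (!i.defIsDead)
        return false;
    switch (i.kind) {
    case Kind::Copy:
    case Kind::Add:
        return true;   // assumed pure in this model
    case Kind::Div:    // may trap, e.g. on a zero divisor
    case Kind::Store:  // writes memory (think of a store that also updates a register)
    case Kind::Call:   // arbitrary effects
        return false;
    }
    return false;
}

// True for instructions the call-only test would delete even though
// the conservative test says they must stay.
bool wronglyRemoved(const Instr &i) {
    return removableIfNotCall(i) && !removableIfSideEffectFree(i);
}
```

In this model a store or a divide with a dead definition passes the call-only test but fails the conservative one, which is the situation the reply warns about.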

I think so. LLVM is producing code like this one here, before RA:

That is because the physical register has to be copied to a virtual
register. Without the copy, physical registers could be live across
basic blocks. There is a function you can call for each target to tell
you if an instruction is a copy.

Andrew

> where %reg1032 is dead.

Right. One of the jobs of the register allocator is to coalesce register copies. Once coalesced, they can be removed.

> I'm removing these instructions. In Linear scan, they are removed too. I'm removing all the dead definitions from instructions that are not function calls, and the resulting programs seem to work fine. The ratio

I suggest you do more testing. For copies, this is clearly fine, but for general instructions, it isn't.

> of these instructions is about 1:20, that is, for every 20 instructions (on PPC) the RA could remove 1 dead definition.

Again, how many of these are copies? We expect that the register allocator will make copies dead and remove them. Are you removing any non-copy instructions?

-Chris

I think this is not a coalescing issue. The coalesced copies are removed
by the LLVM code generator and do not appear in the final assembly
code. I think that these dead definitions are due to function calls that
return void. In the programs that I tested, the only dead definitions were
in copy instructions (but I am not yet able to run the RA on big
programs). I will do as you suggested and replace the test for
non-call instructions with a test for copy instructions. Actually, there are
other instructions besides CALLs that can also produce side effects, such
as DIVs.

Fernando
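
The copy-only test settled on above could look roughly like the following sketch. It uses a plain `std::vector` and a hypothetical `MInstr` record rather than LLVM's own classes; the `isCopy` flag stands in for whatever the per-target "is this a copy?" query reports, and `removeDeadCopies` is an illustrative name, not an LLVM function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction record (not LLVM's MachineInstr).
struct MInstr {
    bool isCopy;     // what a per-target copy query would report
    bool defIsDead;  // the defined register is never read afterwards
    int id;          // only used to tell instructions apart in this sketch
};

// Erase only copies whose definition is dead; every other instruction
// stays, even with a dead definition, because it may have side effects.
// Returns the number of instructions removed.
std::size_t removeDeadCopies(std::vector<MInstr> &block) {
    const std::size_t before = block.size();
    block.erase(std::remove_if(block.begin(), block.end(),
                               [](const MInstr &i) {
                                   return i.isCopy && i.defIsDead;
                               }),
                block.end());
    return before - block.size();
}
```

A non-copy instruction with a dead definition (a DIV, say) is left in place by this filter, so nothing with a possible side effect is deleted.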