Working with X86 registers in MachineInstr

Hi all,

I am attempting to implement the “reaching definitions” data-flow algorithm on (X86) MachineBasicBlocks for an analysis pass. To do this, I need to compute gen/kill sets for machine basic blocks. To start with, I am only considering the general-purpose registers, RAX-R15 and their sub-registers. Thus, I need to examine each MachineInstr to determine which register(s) it defines and/or uses.

I see in the Doxygen that for a MachineOperand, I can call isReg() and getReg() to figure out which X86 register the operand corresponds to. These return an unsigned int “register number”; but I’m not sure how to identify which register actually corresponds to that number.

Also, I will need to identify definitions and uses of registers in instructions. I see that MachineOperand has methods such as isUse(), isDef(), and isKill(), which sound like they might be relevant to what I’m doing; but neither the Doxygen nor the source is particularly helpful as to what they actually do. From the MachineInstr documentation, I gathered that instructions which define a value are always written so that the value being defined is the first operand; but since X86 has instructions that use more than one register as output (multiplication, for instance), I would need to manually account for the semantics of each instruction. Since X86 has a great many instructions to account for, many of them obscure, I would very much prefer not to go this route if LLVM already provides it! :)

To summarize, my questions are as follows:

  1. How can I determine the actual X86 register that a MachineOperand corresponds to?

  2. What is the best/most straightforward way to determine whether a MachineInstr defines and/or uses a particular register?

Thanks,

Ethan Johnson

Ethan J. Johnson

Computer Science PhD student, Systems group, University of Rochester

ejohns48@cs.rochester.edu

ethanjohnson@acm.org

PGP pubkey available from public directory or on request

Hi Ethan,

There is an implementation of such an algorithm in AArch64CollectLOH.cpp. I suggest you have a look there.
If the algorithm is of general use, we may want to move it into some generic place.

Let me know if you need more information.

Cheers,
-Quentin

I believe that the enum values in the X86:: namespace correspond to the integers that you’re seeing. For example, X86::RAX is %rax, X86::RBX is %rbx, etc.

I’m not sure if LLVM provides what you need, but I think it probably does: I believe the Tablegen files contain information on which registers are killed by each instruction so that the register allocator can do its work. You should look at the LLVM register allocator code and Quentin’s code and see what they do and what APIs LLVM provides for getting this information.

In the unlikely event that LLVM does not provide the information you need, your code will need to understand the semantics of the X86 instructions and determine for itself which registers are read and written. However, if you go that route, all is not lost; your code can initially make conservative assumptions about instructions that it does not recognize.

Regards,

John Criswell

The numbers returned by getReg() are either virtual or physical registers. You can tell which by using TargetRegisterInfo::isPhysicalRegister.
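As a rough illustration of the virtual/physical distinction, the sketch below classifies a raw register number. `RegKind`, `classifyReg`, and the encoding used here (0 for "no register", top bit set for virtual registers, everything else physical) are assumptions made for this example and not LLVM's API. In real code, use the TargetRegisterInfo query above and compare physical register numbers against the X86:: enum values.

```cpp
#include <cstdint>

// Hypothetical stand-in for a register-number encoding; not LLVM's classes.
enum class RegKind { NoRegister, Physical, Virtual };

// Assumption for this sketch: virtual registers carry the top bit.
constexpr std::uint32_t VirtualRegFlag = 1u << 31;

// Classify a raw register number under the assumed encoding.
constexpr RegKind classifyReg(std::uint32_t Reg) {
  return Reg == 0 ? RegKind::NoRegister
         : (Reg & VirtualRegFlag) ? RegKind::Virtual
                                  : RegKind::Physical;
}
```

After register allocation, an analysis restricted to RAX-R15 would expect to see only the physical case for register operands.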

To find out which registers an instruction defines or uses, walk through its operands and check:

  • If the operand is a register.
  • If it is a use or def.

Some operands are register masks rather than registers. A register mask gives you the set of registers that are preserved through that instruction, i.e., the complement gives you a bunch of new “definitions”.
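The operand walk described above can be sketched on a toy model. `Operand`, `DefsAndUses`, `collect`, and the 16-register file are hypothetical stand-ins invented for this example; they only mimic the roles of MachineOperand's isReg(), isDef(), isUse(), and register-mask operands, and are not LLVM's classes. As far as I know, a MachineInstr's operand list also includes its implicit operands, so the same kind of walk should cover instructions with more than one output register, but check that against the source.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for a machine operand; not LLVM's class.
struct Operand {
  enum Kind { Register, RegMask, Other } kind;
  unsigned reg;            // meaningful when kind == Register
  bool isDef;              // true for a definition, false for a use
  std::uint32_t preserved; // when kind == RegMask: bit i set => register i preserved
};

struct DefsAndUses {
  std::set<unsigned> defs, uses;
};

constexpr unsigned NumRegs = 16; // toy register file r0..r15 (assumption)

// Walk the operands of one instruction and collect defined and used registers.
DefsAndUses collect(const std::vector<Operand> &ops) {
  DefsAndUses r;
  for (const Operand &op : ops) {
    switch (op.kind) {
    case Operand::Register:
      assert(op.reg < NumRegs && "register outside the toy register file");
      (op.isDef ? r.defs : r.uses).insert(op.reg);
      break;
    case Operand::RegMask:
      // Every register that is NOT preserved counts as a new definition.
      for (unsigned reg = 0; reg < NumRegs; ++reg)
        if (!(op.preserved & (1u << reg)))
          r.defs.insert(reg);
      break;
    case Operand::Other:
      break; // immediates, memory references, etc. carry no register here
    }
  }
  return r;
}
```

Treating the clobbered registers of a mask as definitions is the conservative choice for reaching definitions: any earlier definition of those registers no longer reaches past the instruction.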

Anyway, look into the AArch64CollectLOH pass; you’ll see all of that in action. The register allocator code, on the other hand, is of little use to you: its liveness information covers only virtual registers, and the infrastructure does not work with allocated code.
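For reference, the gen/kill computation from the original question can be sketched once the per-instruction definitions are known. Everything here (`Instr`, `Def`, `GenKill`, `computeGenKill`, `transfer`, and the `firstIndex` parameter) is a hypothetical toy model rather than LLVM code or the AArch64CollectLOH implementation. It also assumes each register number is independent, so it ignores the aliasing between a register and its sub-registers (RAX, EAX, AX, AL), which a real pass would have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Toy model: an instruction is reduced to the registers it defines.
using Instr = std::vector<unsigned>;
// A definition is identified by (register, index of the defining instruction).
using Def = std::pair<unsigned, std::size_t>;

struct GenKill {
  std::map<unsigned, std::size_t> gen; // last definition of each register in the block
  std::set<unsigned> kill;             // registers whose incoming definitions die
};

// firstIndex is the function-wide index of the block's first instruction,
// so that definitions from different blocks get distinct identifiers.
GenKill computeGenKill(const std::vector<Instr> &block, std::size_t firstIndex) {
  GenKill gk;
  for (std::size_t i = 0; i < block.size(); ++i) {
    for (unsigned reg : block[i]) {
      assert(reg != 0 && "0 is reserved for 'no register' in this sketch");
      gk.gen[reg] = firstIndex + i; // a later definition replaces an earlier one
      gk.kill.insert(reg);          // definitions of reg from elsewhere do not survive
    }
  }
  return gk;
}

// Transfer function: OUT = GEN union (IN minus KILL).
std::set<Def> transfer(const GenKill &gk, const std::set<Def> &in) {
  std::set<Def> out;
  for (const Def &d : in)
    if (!gk.kill.count(d.first))
      out.insert(d);
  for (const auto &g : gk.gen)
    out.insert(Def(g.first, g.second));
  return out;
}
```

Iterating `transfer` over the control-flow graph until the OUT sets stop changing gives the usual fixed-point solution.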