problem trying to write an LLVM register-allocation pass

I'm trying to write a MachineFunctionPass to do register allocation. I have code that worked with an old version of LLVM. It does not work with llvm-3.1 (or with various other versions that I've tried).

The first problem is that including this line:

  AU.addRequiredID(TwoAddressInstructionPassID);

in method getAnalysisUsage causes a runtime error:

Unable to schedule 'Eliminate PHI nodes for register allocation' required by 'Unnamed pass: implement Pass::getPassName()'
Unable to schedule pass
UNREACHABLE executed at ...

I'm invoking the pass like this (given input file foo.c):

clang -emit-llvm -O0 -c foo.c -o foo.bc
opt -mem2reg foo.bc > foo.ssa
mv foo.ssa foo.bc
llc -load Debug/lib/P4.so -regalloc=gc foo.bc

I've attached my entire file (it's very short). Any help would be much appreciated!

Susan Horwitz

Gcra.cpp (1.99 KB)

Hi Susan,

The meaning of “addRequired(X)” is that your pass needs X to be run, and for X to be preserved by all passes that run after X and before your pass. The PHIElimination and TwoAddressInstruction passes do not preserve each other, hence there’s no way for the pass manager to schedule them for you if you addRequired(…) them.

The trick is that CodeGen will schedule both of these passes to be run before any register allocation pass (see Passes.cpp), so you needn’t require them explicitly - you can just assume they have been run. If you just remove those lines from your getAnalysisUsage method your pass should now run as you expect.
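
To make the scheduling constraint concrete, here is a minimal, self-contained sketch. It is a toy model, not LLVM's pass manager: `ToyPass`, `allStillValid` and `canScheduleBoth` are hypothetical names, and the empty `preserves` sets simply encode the statement above that the two passes do not preserve each other.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM's PassManager.
struct ToyPass {
    std::string name;
    std::set<std::string> preserves;  // passes whose results this pass keeps valid
};

// Returns true if every pass in `order` is still valid once the whole
// sequence has run. A pass X stays valid only if every pass scheduled
// after it lists X in its `preserves` set.
bool allStillValid(const std::vector<ToyPass>& order) {
    for (std::size_t i = 0; i < order.size(); ++i)
        for (std::size_t j = i + 1; j < order.size(); ++j)
            if (order[j].preserves.count(order[i].name) == 0)
                return false;
    return true;
}

// Tries both possible orders of two required passes.
bool canScheduleBoth(const ToyPass& a, const ToyPass& b) {
    return allStillValid({a, b}) || allStillValid({b, a});
}
```

In this model, two passes with empty `preserves` sets can never both be satisfied as requirements, whichever order is tried, which mirrors the "Unable to schedule" failure.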

Cheers,
Lang.

Thanks Lang!

Here's another question: I'm trying to process this input:

int main() {
   return 0;
}

but I'm getting an error
  Assertion `!Fn.getRegInfo().getNumVirtRegs() && "Regalloc must assign all vregs"' failed.

At the start of runOnMachineFunction I call Fn.getRegInfo().getNumVirtRegs();
and find that there is 1 virtual register. However, MRI->reg_empty(vreg)
tells me that it is not used or defined. So my register-allocation code never sees it, and thus can't allocate a preg for it. I tried using MRI->replaceRegWith(vreg, preg);
(where preg is available to vreg's register class) but that didn't work. When I look, the number of vregs in the function is still 1.

Can you help with this?

Thanks again!

Susan

Hi Susan,

I’m having trouble reproducing that error on my end, but I think the problem is probably that you’re not using the VirtRegRewriter infrastructure. What your allocator needs to do is populate the virtual register mapping (VirtRegMap pass) with your allocation, rather than rewriting the registers directly through MachineRegisterInfo.

Have your allocator require and preserve the VirtRegMap pass, then in your runOnMachineFunction pass grab a reference to the pass with:

VirtRegMap &vrm = getAnalysis<VirtRegMap>();

You can then describe your register allocations with:

vrm.assignVirt2Phys(<virtreg>, <physreg>)

The VirtRegRewriter pass (in VirtRegMap.cpp) will run after your allocator and apply the mapping that you described in the VirtRegMap.
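
The division of labour described here (the allocator only records a mapping, and a later step applies it) can be sketched with a small standalone model. Everything below is hypothetical: `ToyVirtRegMap` and `rewriteOperands` are illustrative stand-ins, not LLVM's VirtRegMap or VirtRegRewriter, and registers are modelled as plain integers.

```cpp
#include <map>
#include <vector>

// Toy model only: hypothetical stand-ins, not LLVM's VirtRegMap or rewriter.
using Reg = unsigned;

struct ToyVirtRegMap {
    std::map<Reg, Reg> virt2phys;
    // Record the allocator's decision; nothing is rewritten at this point.
    void assignVirt2Phys(Reg vreg, Reg preg) { virt2phys[vreg] = preg; }
};

// Plays the role of the later rewriting step: replaces each virtual register
// operand with the physical register recorded for it. Operands with no entry
// are left unchanged so that a missing assignment is easy to spot.
std::vector<Reg> rewriteOperands(const std::vector<Reg>& operands,
                                 const ToyVirtRegMap& vrm) {
    std::vector<Reg> out;
    out.reserve(operands.size());
    for (Reg r : operands) {
        auto it = vrm.virt2phys.find(r);
        out.push_back(it == vrm.virt2phys.end() ? r : it->second);
    }
    return out;
}
```

The point of the split is that the allocator never touches instructions directly, so operand-level details can be handled in one place by the rewriting step.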

I hope this helps. Let me know if it doesn’t fix your issue.

Cheers,
Lang.

Lang -

My previous problem (the failed assertion due to a vreg that was not used or defined) was because I was using a debugging version of LLVM with assertions enabled. It seems that there are "extra" assertions in that version.

When I run my register allocator in the "normal" version of LLVM, there is no problem with that simple example.

However, for other examples that have instructions that do use/define virtual registers, I'm getting bad assembly code.

I would really like to avoid changing to the "indirect" approach to register mapping that you suggest (if possible). My code using the "direct" approach worked fine with an old version of LLVM, and I would like to get it to work with llvm-3.1.

I wrote a very simple register allocator (attached) that just replaces each vreg in the code with the first preg in its class. I understand that this is not going to give me correct register usage, but it should at least give me assembly code that can be assembled. However, it does not. I get this error from the assembler:

  Error: Incorrect register `%rax' used with `l' suffix

caused by this instruction:

  cmpl $1, %rax

The input is tst.c (also attached).

Any help you can provide in understanding what is causing this problem would be much appreciated. I'm not even sure which instance of a vreg is causing the problem. I used to be able to print machine instructions by just using <<, but that no longer works. So even help with that aspect of the problem would be very helpful.

Thanks!

Susan

Gcra.cpp (5.65 KB)

tst.c (198 Bytes)

Hi again Lang,

I decided to try the approach you proposed to see whether it makes the assembly-code problem go away. Again, I tried a very simple register allocator (attached) that just calls vrm.assignVirt2Phys for every vreg in each function, mapping the vreg to the first preg in the register class. I tried two versions: one maps *every* vreg, and the other only maps those for which MRI->reg_empty(vreg) returns false. In both cases I get a core dump somewhere after my reg-allocation pass has run (when I use the "tst.c" file that I sent last time as input).

Note also that there is no VirtRegMap.h in the "include" directory of my installed llvm-3.1. I had to copy that file from the source directory. That seems suspicious.

Any thoughts?

Thanks!

Susan

Gcra.cpp (7.81 KB)

Hi Susan,

Sorry - I had missed that you’re using llvm-3.1, rather than the development branch. We encourage people to live on top-of-tree - it’s well tested, easier for active developers to offer help with, and keeping up with incremental changes is often easier than porting between stable versions.

It also sounds like you were building a Release version of LLVM. That will not have any asserts enabled (though it will have some other diagnostics). You will probably want to work with a Debug+Asserts version (./configure --disable-optimized --enable-assertions) while you’re developing your allocator and watch for any asserts that trigger.

In your case the Assertion that is triggering in PEI indicates that the MachineRegisterInfo object still contained some virtregs post register-allocation. You need to call MRI->clearVirtRegs() at the end of your allocator.

Hope this helps!

Cheers,
Lang.

I still get a coredump:

0 libLLVM-3.1.so 0x00007f0158a4e67f
1 libLLVM-3.1.so 0x00007f0158a500ca
2 libpthread.so.0 0x0000003a86c0f500
3 libLLVM-3.1.so 0x00007f01583c346c
4 libLLVM-3.1.so 0x00007f0158546349 llvm::FPPassManager::runOnFunction(llvm::Function&) + 521
5 libLLVM-3.1.so 0x00007f01585463e3 llvm::FPPassManager::runOnModule(llvm::Module&) + 51
6 libLLVM-3.1.so 0x00007f0158545fae llvm::MPPassManager::runOnModule(llvm::Module&) + 462
7 libLLVM-3.1.so 0x00007f01585460bd llvm::PassManagerImpl::run(llvm::Module&) + 125
8 llc 0x000000000040b012 main + 5218
9 libc.so.6 0x0000003a8601ecdd __libc_start_main + 253
10 llc 0x0000000000407d79
Stack dump:
0. Program arguments: llc -load Debug/lib/P4.so -regalloc=gc tst.bc
1. Running pass 'Function Pass Manager' on module 'tst.bc'.
2. Running pass 'Machine Loop Invariant Code Motion' on function '@main'
make: *** [tst.reg] Segmentation fault (core dumped)

Hi Susan,

Without debugging symbols I can’t make much out of that stack trace I’m afraid.

I’ve attached my modified version of Gcra.cpp. I built llvm 3.1 by dropping this file into lib/CodeGen, and adding references to createGcra to include/llvm/CodeGen/Passes.h and include/llvm/CodeGen/LinkAllCodegenComponents.h. (If you search for createRegAllocPBQP you’ll see where to add the declarations).

With that setup, running your allocator on the tst.c file you attached previously yielded a sane assembly file.

Cheers,
Lang.

Gcra.cpp (5.75 KB)

Lang -

Your version does NOT work for me (i.e., I still get an error from the assembler when I run your code on my tst.c) unless I force compilation and assembly for a 32-bit X86 machine:

  llc -march=x86 -regalloc=gc tst.bc
  gcc -m32 tst.s

My machine is a 64-bit machine. Maybe you are working with a different architecture and that's why it worked for you?

I would be happy if the above worked in general, but when I try other C code (with my "real" register allocator, not the naive one I sent you) I get assembly that includes

     %r8d

which seems to be invalid for a 32-bit machine. Sigh. It looks to me like there's a problem with the LLVM-3.1 API for register allocation and/or the code-generation phase. What do you think?

Susan

Hi Susan,

I tested the version of Gcra.cpp that I sent you on x86-64 systems running MacOS 10.8 and Ubuntu 12.04 (Linux 3.2.0).

Could you send me the bitcode file you’re compiling? Different bitcodes (due to different clang versions or applied optimizations) could account for the different results we’re seeing. For reference I’ve attached the *.ll file that I have tested with, which was compiled from your tst.c file with:

clang -O0 -emit-llvm -S -o tst.ll tst.c

My clang version was built from a recent checkout from subversion.

It’s unlikely that there is any fundamental problem with the register allocation APIs or the code generator that would prevent you from building a working allocator. The APIs certainly could have changed in a way that would break existing allocators though.

- Lang.

tst.ll (642 Bytes)

My tst.bc is attached. I had to use ssh to copy it from my office machine to my home laptop. In case that corrupts it, I also put a copy here:
     http://pages.cs.wisc.edu/~horwitz/LANG/tst.bc

I created the file like this:

clang -emit-llvm -O0 -c tst.c -o tst.bc
opt -mem2reg tst.bc > tst.mem2reg
mv tst.mem2reg tst.bc

Susan

tst.bc.txt (828 Bytes)

Hi Susan,

With your bitcode file I am now able to reproduce the issue you’re seeing. It looks like this is a problem with the naive rewriting from virtregs to physregs. It appears that the subreg field of physreg operands is ignored post-register allocation. In your testcase %vreg11:sub32 is being rewritten to RBX:sub32, but the :sub32 part is being quietly dropped when the assembly is written out. If this is expected behaviour, and is still happening in the development branch, then I’ll add some sort of verification to catch it.

The VirtRegMap::rewrite() method sidesteps this issue by rewriting physreg operands to remove the subreg field. The code for this is in VirtRegMap.cpp, around line 165. In short:

  PhysReg = MO.getReg();
  if (MO.getSubReg() != 0) {
    PhysReg = TRI->getSubReg(PhysReg, MO.getSubReg());
    MO.setSubReg(0);
  }
  MO.setReg(PhysReg);

Adding this code to Gcra fixes the assembly issue for me. I’ve attached my updated copy. Hope this helps.
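
As a standalone illustration of the same idea, the sketch below folds a sub-register index into the register itself and then clears the index. It is a toy model: `ToyOperand`, `SubRegTable` and `foldSubReg` are hypothetical, register names are plain strings, and the table is filled in by hand, whereas the snippet above asks TRI->getSubReg for the answer.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy model only: hypothetical types and a hand-written lookup table.
struct ToyOperand {
    std::string reg;     // register name, e.g. "RBX"
    std::string subIdx;  // sub-register index, e.g. "sub32"; empty means none
};

using SubRegTable = std::map<std::pair<std::string, std::string>, std::string>;

// Folds a sub-register index into the register itself and clears the index,
// so that nothing is lost if a later stage ignores the index field.
// Returns false (and leaves the operand alone) if the table has no entry.
bool foldSubReg(ToyOperand& mo, const SubRegTable& table) {
    if (mo.subIdx.empty())
        return true;  // nothing to fold
    auto it = table.find(std::make_pair(mo.reg, mo.subIdx));
    if (it == table.end())
        return false;
    mo.reg = it->second;
    mo.subIdx.clear();
    return true;
}
```

With a table entry mapping ("RBX", "sub32") to "EBX", the operand from the test case ends up naming EBX directly, so the 32-bit compare no longer receives a 64-bit register name.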

Cheers,
Lang.

Gcra.cpp (6 KB)

Lang -

This is very helpful! Now all my small tests work. However, large tests are still failing. For example, with a debugging version of LLVM-3.1 I am getting this failed assertion:
  8-bit H register can not be copied outside GR8_NOREX

It seems that there are restrictions on register use that I don't know about. Can you tell me what the above means and/or where I can look to understand what the problem is and how to deal with this issue?

Thanks much!!

Susan

Hi Lang,

I looked more into one of the problems I'm now having, and I've attached 3 files:

Gcra.cpp is like your version except that for two specific vregs it uses hard-coded pregs instead of the first in the corresponding class.

bug1.c is an input that causes the failed assertion for me. If I use the non-debug version of LLVM-3.1 I instead get assembler errors like this:
   Error: can't encode register '%ah' in an instruction requiring REX prefix.

bug1.bc is my bitcode version of bug1.c.

The problematic vregs are both in register class 0. One is replaced with preg 1 and the other with preg 74. Those are both in register class 0, and are not aliased. Any idea why using those pregs causes trouble?

Thanks!

Susan

Gcra.cpp (6.13 KB)

bug1.c (250 Bytes)

bug1.bc (736 Bytes)

Hi Susan,

Sorry for the delayed response. Thanks for the test cases - I’m looking into this now.

- Lang.

Hi Susan,

In x86-64 the REX prefix must be used to access an extended register (r8-r15 and their aliases), but cannot be used when accessing the high byte of the ABCD regs (AH, BH, CH, DH). In your test case you have hardcoded %vreg1 to R8B, and %vreg15 to AH, and the test case contains a copy between these registers. The copy simultaneously must have a REX prefix, and cannot have a REX prefix, hence the assertion.
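
The conflict can be written down as a small predicate. This is a toy sketch under stated assumptions: `needsRex`, `forbidsRex` and `copyIsEncodable` are hypothetical helpers, registers are plain strings, and only the 8-bit registers mentioned above are listed, so the sets are deliberately incomplete.

```cpp
#include <set>
#include <string>

// Toy model only: covers just the 8-bit forms of r8-r15 and the four
// high-byte registers; other registers are treated as unconstrained.
bool needsRex(const std::string& r) {
    static const std::set<std::string> extended = {
        "R8B", "R9B", "R10B", "R11B", "R12B", "R13B", "R14B", "R15B"};
    return extended.count(r) != 0;
}

bool forbidsRex(const std::string& r) {
    static const std::set<std::string> highByte = {"AH", "BH", "CH", "DH"};
    return highByte.count(r) != 0;
}

// A register-to-register copy is encodable only if the two operands do not
// place contradictory demands on the REX prefix.
bool copyIsEncodable(const std::string& dst, const std::string& src) {
    bool mustHaveRex = needsRex(dst) || needsRex(src);
    bool mustNotHaveRex = forbidsRex(dst) || forbidsRex(src);
    return !(mustHaveRex && mustNotHaveRex);
}
```

Under this model a copy between R8B and AH is rejected in either direction, while a copy between AL and AH, or between R8B and AL, is accepted.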

The problem is that not all registers in a class are allocable for all vregs. As you can see from the above constraint, which pregs are valid varies dynamically depending on the context in which the register is used. The trick is to query the “allocation order” for a class (and as an added headache filter out any reserved registers). I’ve attached a test-case where I do this somewhat manually. In short:

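  // Look up the register class of this vreg.
  int regClass = MRI->getRegClass(vreg)->getID();
  const TargetRegisterClass *trc = TRI->getRegClass(regClass);
  // Walk the class's allocation order, skipping reserved registers.
  ArrayRef<uint16_t> rawOrder = trc->getRawAllocationOrder(Fn);
  ArrayRef<uint16_t>::iterator rItr = rawOrder.begin();
  while (rItr != rawOrder.end() && reservedRegs.test(*rItr))
    ++rItr;
  assert(rItr != rawOrder.end() && "no unreserved register in this class");
  preg = *rItr;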

Alternatively, you could use the AllocationOrder class (lib/CodeGen/AllocationOrder.h). This has the benefit of considering register hints for improved coalescing too. It does, however, require you to use VirtRegMap.
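
For readers who want to try the filtering logic outside LLVM, here is a minimal standalone sketch. It is an assumption-laden toy: `firstAllocatable` is a hypothetical helper, and plain vectors stand in for the ArrayRef and the reserved-register bit vector used in the snippet above. Unlike a bare loop, it reports failure instead of running past the end when every register in the order is reserved.

```cpp
#include <cstdint>
#include <vector>

// Toy model only: `order` stands in for a class's allocation order and
// `reserved[r]` is true when register number r is reserved.
// Returns the first register in `order` that is not reserved, or 0 if every
// register in the order is reserved (0 is used here to mean "no register").
std::uint16_t firstAllocatable(const std::vector<std::uint16_t>& order,
                               const std::vector<bool>& reserved) {
    for (std::uint16_t reg : order) {
        // Registers beyond the end of `reserved` are treated as not reserved.
        bool isReserved = reg < reserved.size() && reserved[reg];
        if (!isReserved)
            return reg;
    }
    return 0;
}
```

A caller would treat a result of 0 as "nothing available in this class" and fall back to spilling or to reporting an error.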

Hope this helps!

Cheers,
Lang.

Gcra.cpp (6.73 KB)

Thanks Lang, we are making progress! I no longer get the failed assertion, but the code I'm using for vregs that don't get allocated a preg, and thus need to be spilled and reloaded, is causing assembler errors.

I suspect the problem is my code for allocating space in the stack, but I don't know how to fix it.

I've attached a new version of the simple register-allocation code, a test program that causes the bad assembly code to be produced, and the bc file. (I had to name everything with a .txt extension to copy the files to my laptop.)

As always, thank you for your help!

Susan

Gcra.cpp.txt (7.76 KB)

math.c.txt (363 Bytes)

math.bc.txt (983 Bytes)

Hi Susan,

It looks like the bitcode you have attached is corrupted. You should make sure to attach it as a binary file. Alternatively you can attach the LLVM assembly as text. You can generate an assembly file from bitcode with:

llvm-dis -o <assembly file> <bitcode file>

Regards,
Lang.

Sorry about that. I created the assembly file and attached it (as math.txt).

Susan

math.txt (1.78 KB)