I have a rather large bitcode file which, when run through "llc -march arm -O0", produces an asm file of about 500 MB. Trying to assemble this file with the iOS assembler on OS X gives me lots of "branch out of range" errors, because branch instructions overflow the +/-32 MB relative branch limit.
I've tried running llc with the hidden "-arm-long-calls" option, which solves the problem but forces every call to be an indirect branch. That feels a bit like overkill. Does anybody have a suggestion for what the right solution might be?
I think that writing smaller functions or optimizing the code to be smaller are about your only options. We have a constant island pass in the compiler that lets it figure out constant displacements, but the branch island support is only in the linker, and if you've got branches inside a single function of more than 32 MB you're just asking for trouble on the architecture.
I don't think any other solutions are currently supported.
One problem is that the linker can move functions around as it pleases, so there is no way of knowing which functions are going to be far away.
I don't think it's branches within a single function that are the problem; it's branches between functions that are far apart in the same object file. I guess our only option is to split the bitcode file before codegenning. Is there any existing code that would help with that approach?
But the linker will fix branches that become "long calls" after it's shuffled things around, right? So it would still be reasonable to try to get LLVM to at least codegen a single object file correctly, assuming that the codegen phase has some knowledge of roughly how big the branches will have to be when it is generating the asm. On second thought, it probably doesn't, unless it knows the size of all functions before writing out the asm (I'm not too familiar with the codegen phase).
If the problem is that you have zillions of small functions (as opposed to some MB-sized functions), then this is really a limitation of the mach-o object format. The iOS linker tool will happily synthesize branch islands when the target of a bl is too far away. That is, it will add an island of code between functions that is just a jump instruction; the bl branches to the island, which jumps to the final target.
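To make the island idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not how the real linker is implemented: it assumes ARM-mode bl/b with a signed 24-bit word offset measured from the instruction address plus 8, it considers only a single hop through one island, and the names `reaches` and `route_call` are made up for illustration.

```python
# Toy model of branch islands (illustrative only, not the linker's algorithm).
# Assumption: ARM-mode b/bl encode a signed 24-bit word offset relative to
# the instruction address + 8, giving a byte range of [-2**25, 2**25 - 4].
BL_MIN = -(1 << 25)
BL_MAX = (1 << 25) - 4


def reaches(src, dst):
    """True if a b/bl located at address src can encode a branch to dst."""
    disp = dst - (src + 8)
    return disp % 4 == 0 and BL_MIN <= disp <= BL_MAX


def route_call(src, dst, islands):
    """Return the hops a call takes from src to dst, or None if unreachable.

    islands is a list of hypothetical addresses where a one-instruction
    jump could be placed between functions. Only a single hop is modelled.
    """
    if reaches(src, dst):
        return [dst]                      # direct bl, no island needed
    for island in islands:
        if reaches(src, island) and reaches(island, dst):
            return [island, dst]          # bl to the island, island jumps on
    return None


if __name__ == "__main__":
    far_target = 0x3000000                # 48 MB away from a caller at 0
    print(route_call(0, far_target, []))
    print(route_call(0, far_target, [0x1800000]))
```

With a caller at address 0 and a target 48 MB away, a direct bl does not fit, but an island placed at 24 MB is within range of both ends.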
In ARM mach-o, if the target of a bl is in the same translation unit, the assembler uses a local relocation and adjusts the instruction to branch to the target address. If the address is too far, there is no way to encode it ;-( You might think switching to an external relocation would work, but because mach-o has no RELA relocations, the addend is encoded in the instruction. And, unfortunately, for pc-rel instructions like bl, the addend is encoded as the target of the bl instruction. So an addend of zero means the instruction looks like it branches to address zero, which means the instruction must itself be within branch range of zero. So you are limited in how big your .o file can be.
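The arithmetic behind that limit can be sketched in a few lines of Python. This assumes the ARM-mode bl encoding (a signed 24-bit immediate, shifted left by 2 and added to the instruction address plus 8); the helper names are hypothetical and only illustrate the reasoning above.

```python
# Sketch of the ARM-mode bl range arithmetic (helper names are hypothetical).
# Assumption: bl holds a signed 24-bit immediate; the branch target is
# (instruction address + 8) + (imm24 << 2).

def bl_displacement(insn_addr, target_addr):
    """Byte displacement the instruction would have to encode."""
    return target_addr - (insn_addr + 8)


def bl_in_range(insn_addr, target_addr):
    """True if the displacement fits the signed 24-bit word offset."""
    disp = bl_displacement(insn_addr, target_addr)
    if disp % 4 != 0:
        return False
    imm24 = disp >> 2
    return -(1 << 23) <= imm24 < (1 << 23)


def can_encode_zero_addend(insn_addr):
    """A zero addend makes the bl look like a branch to address 0, so the
    instruction itself has to sit within branch range of address 0."""
    return bl_in_range(insn_addr, 0)


if __name__ == "__main__":
    # Highest word-aligned address in the object file that still works.
    limit = (1 << 25) - 8
    print(hex(limit), can_encode_zero_addend(limit))
    print(hex(limit + 4), can_encode_zero_addend(limit + 4))
```

Under these assumptions, a bl with a zero addend has to sit within roughly the first 32 MB of the object file, which matches the size limit described above.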
You might get lucky and be able to just chop up the big assembly text file into pieces and assemble each one...