Need help implementing relocations

Hi all,

I have reached the relocation phase of my backend implementation and I’m having some trouble. The LLVM code I’m trying to compile is this:

%struct.Date = type { i32, i32, i32 }

@date = global %struct.Date { i32 2012, i32 10, i32 120000 }, align 4

; Function Attrs: nounwind
define i32 @foo() #0 {
  %1 = load i32, i32* getelementptr inbounds (%struct.Date, %struct.Date* @date, i32 0, i32 2), align 4
  ret i32 %1
}

which yields the following assembly lines:

MOVI $r0, date
LD $r4, $r0, 8 // load the content of [$r0 + 8] into return register $r4
JAL

When I look at the text section (llvm-objdump -s output), I see this

0000 00c2

This is the correct MOVI opcode, but instead of the 0s I should see the address of ‘date’.

The output of llvm-objdump -r -t is this:

RELOCATION RECORDS FOR [.rel.text]:
00000000 R_XXX_MOVI date (Note: correct relocation type)

RELOCATION RECORDS FOR [.rel.eh_frame]:
0000001c R_XXX_NONE .text

SYMBOL TABLE:
00000000 l df ABS 00000000 array.ll
00000000 l d .text 00000000 .text
00000000 g F .text 0000000e foo
00000000 g O .data 0000000c date

What am I missing? Why wasn’t the instruction filled in with the address of ‘date’?

Thanks.

Hi Josh,

> which yields the following assembly lines
>
> ...
> MOVI $r0, date
> LD $r4, $r0, 8 // load the content of [$r0 + 8] into return register $r4
> JAL
> ...
>
> When I look at the text section (llvm-objdump -s output), I see this
>
> 0000 00c2
>
> This is the correct MOVI opcode, but instead of the 0s I should see the address of 'date'.

I don't think you should see "date" there yet. That's the whole reason
for R_XXX_MOVI to exist: it tells the linker to insert the address of
date when converting this .o file into a final executable.

The only time you might see something in that field is if you managed
to fold the GEP into the MOVI. This would end up written in the .s file
as something like:

    MOVI $r0, date+8
    LD $r4, $r0
    JAL

and then because you've chosen to use ".rel" relocations[*] the offset
8 would be put into the instruction stream and used by the linker.
Instead of putting just "date" into the MOVI, the linker would first add
the 8 that was already there, so you would end up with date+8, and the
load instruction might then be simpler (depending on the target).

But that's all a more advanced use of global addressing and you'd have
to modify your CodeGen specifically to try and produce that kind of
thing. For a first pass, what you're seeing is exactly what I'd expect
if everything was working properly.
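
For illustration, here is a minimal C++ sketch of what a linker does with
a ".rel"-style relocation like this one. The encoding is an assumption
made up for the example (a 32-bit little-endian MOVI word with a 16-bit
immediate in the low bits), and applyRelMovi, read32le and write32le are
hypothetical helpers, not lld functions. The real R_XXX_MOVI field layout
may well differ.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: the immediate occupies the low 16 bits of a 32-bit
// little-endian instruction word. This is invented for the example.
constexpr uint32_t kImmMask = 0xFFFFu;

// Read a 32-bit little-endian value, independent of host byte order.
uint32_t read32le(const uint8_t *p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Write a 32-bit little-endian value, independent of host byte order.
void write32le(uint8_t *p, uint32_t v) {
  for (int i = 0; i < 4; ++i)
    p[i] = uint8_t(v >> (8 * i));
}

// Hypothetical handler for a ".rel"-style MOVI relocation: the addend is
// whatever the assembler left in the immediate field (0 for "date",
// 8 for "date+8"), and the linker adds the symbol address to it.
void applyRelMovi(uint8_t *loc, uint32_t symbolAddr) {
  uint32_t insn = read32le(loc);
  uint32_t addend = insn & kImmMask;
  uint32_t value = symbolAddr + addend;
  assert(value <= kImmMask && "value does not fit in the immediate field");
  write32le(loc, (insn & ~kImmMask) | value);
}
```

With the field left at 0 in the .o file, the result is just the address
of date; with 8 already in the field, the result is date+8.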

Cheers.

Tim.

[*] The alternative scheme, where relocations end up in a section
starting with ".rela" puts the offset in with the relocation itself.
IMO it's simpler and neater, but it makes the object file slightly
bigger. If I was designing a backend and had the freedom, it's what
I'd choose to do. If you want to change it you set the
"HasRelocationAddend" variable to true in your XYZELFObjectWriter.
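
To make the size trade-off concrete, here is a small C++ sketch of the two
32-bit ELF relocation record layouts. Rel32 and Rela32 are local mirrors
of the records described in the ELF specification, declared here only for
the example, and relaValue is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Local mirrors of the 32-bit ELF relocation records (names are made up
// for this example; the fields follow the ELF specification).
struct Rel32 {
  uint32_t r_offset;  // where in the section to apply the relocation
  uint32_t r_info;    // symbol index and relocation type
};

struct Rela32 {
  uint32_t r_offset;
  uint32_t r_info;
  int32_t r_addend;   // explicit addend, e.g. 8 for "date+8"
};

// The ".rela" form costs one extra 32-bit word per relocation.
static_assert(sizeof(Rel32) == 8, "REL record is two 32-bit words");
static_assert(sizeof(Rela32) == 12, "RELA record is three 32-bit words");

// With ".rela" the value comes from the symbol address plus the addend
// stored in the record; the instruction bytes are not consulted.
uint32_t relaValue(uint32_t symbolAddr, const Rela32 &r) {
  return symbolAddr + static_cast<uint32_t>(r.r_addend);
}
```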

Tim,
Thanks for the explanation.

“it tells the linker to insert the address of
date when converting this .o file into a final executable.”

Which utility do you use to convert .o to .elf and insert the address of ‘date’? llvm-objcopy?

The linker in the LLVM project is called "lld". It's pretty good for
ELF, but you'll have to implement the parts for your target
(specifically exactly how R_XXX_MOVI does its job in this case).
Alternatively, if your platform already has GCC support you can use ld
from GNU binutils, but you'll probably have to cross-compile it
yourself.

Cheers.

Tim.

In function relocateOne() in lld/ELF/Arch/XXX.cpp, I need to compute the difference between the PC and the ‘Val’ function argument so that I can increment the PC by that difference. Is there a relationship between the PC and ‘Loc’? Thanks.

Hi Josh,

> In function relocateOne() in lld/ELF/Arch/XXX.cpp, I need to compute the difference between the PC and the 'Val' function argument so that I can increment the PC by that difference. Is there a relationship between the PC and 'Loc'? Thanks.

There isn't any relationship between those because Loc is a pointer
into lld's address space (your job in relocateOne is to modify what's
there). Arranging for all sections to be in lld at their final
destination location would be impossible in general and needlessly
complicated even when possible.

This kind of calculation looks like it's handled by getRelocTargetVA
in InputSection.cpp; your target tells it what to do by overriding
getRelExpr in its XXX.cpp. There's a pretty good chance getRelExpr just
needs to return R_PC for this relocation, but you should cross-check
that this is actually the calculation you want.
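
As a rough illustration of the PC-relative calculation (S + A - P in the
usual ELF notation), here is a standalone C++ sketch. SectionImage and
pcRelValue are hypothetical names invented for the example and are not
lld's data structures. The point is that P comes from the section's final
address plus the offset, never from the buffer pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section while the linker is writing it.
struct SectionImage {
  uint64_t outputVA;           // address the section will have at run time
  std::vector<uint8_t> bytes;  // the linker's in-memory copy of the contents
};

// Compute S + A - P for a relocation at `offset` within `sec`.
// ASSUMPTION: the PC seen by the instruction equals the address of the
// relocated field; some targets use the next instruction's address.
int64_t pcRelValue(const SectionImage &sec, uint64_t offset,
                   uint64_t symVA, int64_t addend) {
  uint64_t place = sec.outputVA + offset;  // P: from layout, not from &sec.bytes[offset]
  return static_cast<int64_t>(symVA + addend - place);
}
```

The pointer you get as Loc corresponds to &sec.bytes[offset] in this
picture: it is where you write the result, and its numeric value says
nothing about the run-time address.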

Cheers.

Tim.