We're at the point in our port of OpenVMS to x86 using LLVM where we need to
choose a code model (-mcmodel). Given OpenVMS's history, our linker will
allocate static data (i.e., .data, .bss, .plt, GOT, etc.) in the bottom 32 bits
of the address space (i.e., 00000000.xxxxxxxx). However, we support code
anywhere in the 64-bit address space as PIC code (we do this on Itanium today
using our own code generator and linker). Given this requirement, I'm looking
at the support for -fPIC and -mcmodel=large. Either I'm missing something or
something is broken (and has been for quite a while).
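To make the layout constraint concrete, here is a minimal C sketch of the two range checks involved. The helper names (in_low_4gb, fits_rel32) and any addresses passed to them are hypothetical illustrations, not part of our linker or of LLVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr has the form 00000000.xxxxxxxx,
 * i.e. its upper 32 bits are all zero. */
static bool in_low_4gb(uint64_t addr)
{
    return (addr >> 32) == 0;
}

/* Hypothetical helper: true when target can be reached with a signed
 * 32-bit displacement from next_ip (the reach of a RIP-relative
 * reference). */
static bool fits_rel32(uint64_t next_ip, uint64_t target)
{
    /* Unsigned subtraction wraps; reinterpret the result as signed. */
    int64_t delta = (int64_t)(target - next_ip);
    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

When the code sits far above the 4 GB line and the data sits below it, fits_rel32 returns false, which is why the RIP-relative sequences of the small PIC model are not enough here.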
Using the code samples in the AMD64 ABI document, I wrote a little abi.c
program to look at the generated code. The code from gcc matches almost
exactly what is listed in the ABI document. LLVM's output, however, is very
different. I don't see -fPIC as having any impact with -mcmodel=large.
Thanks
John
For example,
static int src; // Lsrc: .long
static int dst; // Ldst: .long
extern int *dptr; // .extern dptr
void DataLoadAndStore() {
// Large Memory Model code sequences from AMD64 abi
// Figure 3.22: Position-Independent Global Data Load and Store
//
// Assume that %r15 has been loaded with GOT address by
// function prologue.
// movabs $Lsrc@GOTOFF,%rax ; R_X86_64_GOTOFF64
// movabs $Ldst@GOTOFF,%rdx ; R_X86_64_GOTOFF64
// movl (%rax,%r15),%ecx
// movl %ecx,(%rdx,%r15)
dst = src;
// movabs $dptr@GOT,%rax ; R_X86_64_GOT64
// movabs $Ldst@GOTOFF,%rdx ; R_X86_64_GOTOFF64
// movq (%rax,%r15),%rax
// leaq (%rdx,%r15),%rcx
// movq %rcx,(%rax)
dptr = &dst;
// movabs $Lsrc@GOTOFF,%rax ; R_X86_64_GOTOFF64
// movabs $dptr@GOT,%rdx ; R_X86_64_GOT64
// movl (%rax,%r15),%ecx
// movq (%rdx,%r15),%rdx
// movl %ecx,(%rdx)
*dptr = src;
}
generates (using 'clang -c -S -fPIC -mcmodel=large'):
DataLoadAndStore: # @DataLoadAndStore
.cfi_startproc
# BB#0:
pushq %rbp
.Ltmp0:
.cfi_def_cfa_offset 16
.Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp2:
.cfi_def_cfa_register %rbp
movabsq $src, %rax
movl (%rax), %ecx
movabsq $dst, %rdx
movl %ecx, (%rdx)
movabsq $dptr, %rsi
movq %rdx, (%rsi)
movl (%rax), %ecx
movl %ecx, (%rdx)
movl (%rax), %ecx
movq (%rsi), %rax
movl %ecx, (%rax)
popq %rbp
retq
Where are the GOT accesses?
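By GOT accesses I mean the two addressing forms in the ABI sequences above. A rough C model of them, assuming a simulated GOT held in an array; the function names and the array are hypothetical and only illustrate the arithmetic, not any real relocation processing:

```c
#include <stdint.h>

/* Local symbol (R_X86_64_GOTOFF64): the address is the GOT base plus a
 * signed 64-bit offset fixed at link time, as in (%rax,%r15). */
static uint64_t gotoff_address(uint64_t got_base, int64_t gotoff)
{
    return got_base + (uint64_t)gotoff;
}

/* Global symbol (R_X86_64_GOT64): the relocation gives the byte offset of
 * an 8-byte GOT slot, and the slot holds the symbol's address, so one
 * extra load is needed. `got` is a simulated table. */
static uint64_t got_load(const uint64_t *got, uint64_t slot_offset)
{
    return got[slot_offset / sizeof(uint64_t)];
}
```

The LLVM output above does neither: it materializes the absolute addresses of src, dst, and dptr directly with movabsq.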
Where is the computation of the GOT address? Since the GOT is more than 2 GB
away from the code, the ABI says to generate:
// ABI document suggests:
//
// pushq %r15
// leaq 1f(%rip),%r11
// 1:
// movabs $_GLOBAL_OFFSET_TABLE_,%r15
// leaq (%r11,%r15),%r15
//
// gcc generates:
//
// .L2:
// leaq .L2(%rip), %rax
// movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %r11
// addq %r11, %rax
//
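As far as I can tell, both sequences do the same arithmetic: take the run-time address of a label and add a 64-bit link-time displacement from that label to the GOT. A minimal C sketch of that arithmetic follows; the function names are hypothetical, and it assumes the code and the GOT are relocated by the same amount at load time.

```c
#include <stdint.h>

/* Link-time constant: the 64-bit displacement from the label to the GOT,
 * i.e. _GLOBAL_OFFSET_TABLE_ - label. It may be negative. */
static int64_t got_displacement(uint64_t got_link_addr,
                                uint64_t label_link_addr)
{
    return (int64_t)(got_link_addr - label_link_addr);
}

/* Run-time step: leaq label(%rip) yields the label's actual address, and
 * adding the displacement yields the GOT's actual address. */
static uint64_t got_runtime_address(uint64_t label_runtime, int64_t disp)
{
    return label_runtime + (uint64_t)disp;
}
```

Because the displacement is a full 64 bits, this works no matter how far the GOT is from the code, which is the point of the large model.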