Distinct code generation with -fPIE and mcmodel=large

I am a compiler engineer working on a project to get GRUB2 working with Clang + LLVM instead of GCC. I would like to discuss what LLVM should do when -mcmodel=large is combined with a position-independent code option (-fPIE/-fPIC).

Currently, I observe that the generated code is identical with and without -fPIE when -mcmodel=large is used. Here is an example:

test.c:

const char *test(void){
	return "xx";
}

Compiling with -fPIE:

$ clang -fPIE -o test.o -c test.c -mcmodel=large
$ readelf -r test.o

Relocation section '.rela.text' at offset 0x1c0 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000700000108 R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
000000000004  00070000010a R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
000000000008  00070000010c R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
00000000000c  00070000010d R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0

Relocation section '.rela.eh_frame' at offset 0x220 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000001c  000600000104 R_AARCH64_PREL64  0000000000000000 .text + 0

Compiling without -fPIE:

$ clang -o test.o -c test.c -mcmodel=large
$ readelf -r test.o

Relocation section '.rela.text' at offset 0x1c0 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000700000108 R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
000000000004  00070000010a R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
000000000008  00070000010c R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0
00000000000c  00070000010d R_AARCH64_MOVW_UA 0000000000000000 .rodata.str1.1 + 0

Relocation section '.rela.eh_frame' at offset 0x220 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000001c  000600000104 R_AARCH64_PREL64  0000000000000000 .text + 0

I believe it would be beneficial for the generated code to differ when both options are used together. Hence, I would like to suggest that LLVM emit PC-relative group relocations, such as R_AARCH64_MOVW_PREL_G0, when -fPIC/-fPIE and -mcmodel=large are both used during compilation.
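To make the suggestion concrete, here is a minimal C sketch of the arithmetic behind the MOVW_PREL_G0..G3 group relocations: the PC-relative value S + A - P is split into four 16-bit groups, one per MOVZ/MOVK instruction. The function names are hypothetical and only for illustration; the sketch ignores the overflow checks and the MOVN/MOVZ selection that the checked (non-_NC) variants perform, and it assumes the usual two's-complement conversion from uint64_t to int64_t.

```c
#include <stdint.h>

/* Hypothetical helper: 16-bit group n (0..3) of the PC-relative
 * value S + A - P, i.e. bits [16n+15 : 16n]. */
uint16_t prel_group(uint64_t s, int64_t a, uint64_t p, unsigned n) {
    uint64_t x = s + (uint64_t)a - p;   /* wraps modulo 2^64 */
    return (uint16_t)(x >> (16 * n));
}

/* Hypothetical helper: rebuild the signed offset from the four
 * groups, as a MOVZ followed by three MOVK instructions would. */
int64_t prel_reassemble(const uint16_t g[4]) {
    uint64_t x = 0;
    for (unsigned n = 0; n < 4; n++)
        x |= (uint64_t)g[n] << (16 * n);
    return (int64_t)x;
}
```

For a symbol at 0x1000 referenced from 0x2000, the groups are 0xF000, 0xFFFF, 0xFFFF, 0xFFFF, which reassemble to an offset of -4096. Code using such relocations would presumably still have to add this offset to an address obtained PC-relatively (for example with adr).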

Thank you for your attention, and I look forward to your response.

In the absence of -fno-pic, -fpie, or -fpic, the default is usually -fPIE on Linux, so the second command above is likely still compiled as PIE.

bool Linux::isPIEDefault(const llvm::opt::ArgList &Args) const {
  return CLANG_DEFAULT_PIE_ON_LINUX || getTriple().isAndroid() ||
         getTriple().isMusl() || getSanitizerArgs(Args).requiresPIE();
}

Can you describe why you are combining PIC with -mcmodel=large in Clang?
Note that PIC with -mcmodel=large is largely unimplemented for AArch64 in compilers. GCC even reports an error.

% aarch64-linux-gnu-gcc -fpie -mcmodel=large -c a.cc
cc1plus: sorry, unimplemented: code model ‘large’ with ‘-fpic’

Even for very large executables, the large code model is probably not in much demand on AArch64. From "Relocation overflow and code models" (MaskRay):

For data references from code, x86-64 uses R_X86_64_REX_GOTPCRELX/R_X86_64_PC32 relocations, which have a smaller range of [-2**31,2**31). In contrast, AArch64 employs R_AARCH64_ADR_PREL_PG_HI21 relocations, which have a doubled range of [-2**32,2**32). This larger range makes it unlikely for AArch64 to encounter relocation overflow issues before the binary becomes excessively oversized for x86-64.
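The two ranges can be compared with a small C sketch. The function names are hypothetical, and the check is simplified: it tests a raw byte displacement, whereas the real ADRP relocation works on the difference between 4 KiB page addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a displacement fit the 32-bit signed
 * PC-relative range [-2**31, 2**31) used by R_X86_64_PC32? */
bool fits_x86_64_pc32(int64_t disp) {
    return disp >= -(INT64_C(1) << 31) && disp < (INT64_C(1) << 31);
}

/* Hypothetical check: does a displacement fit the [-2**32, 2**32)
 * range of R_AARCH64_ADR_PREL_PG_HI21 (simplified, see above)? */
bool fits_aarch64_adrp(int64_t disp) {
    return disp >= -(INT64_C(1) << 32) && disp < (INT64_C(1) << 32);
}
```

Under these assumptions, a reference 3 GiB away from the instruction overflows on x86-64 but is still in range on AArch64.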

This is probably best directed at the AArch64 ABI rather than LLVM specifically. We do have the code models documented in https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#7code-models

It is true that not a lot of work has gone into the large code model with position-independent code. The large code model in AArch64 is not well named; it is more of a large data model than a model for large amounts of code.

We also think that we can do better than the large code model for a lot of cases that don’t fit into the default small code model. The document defines a medium code model that should be a better compromise, but as yet there hasn’t been enough demand to implement it in GCC or Clang.

Thank you for your comment. I appreciate your input and will consider it as I continue my work on this project.