How to prevent LLVM from emitting R_X86_64_32 ELF relocations?

An ELF module of type ET_REL (relocatable object), generated by LLVM, always has some R_X86_64_32 relocations in its debug information sections.
This happens with the relocation models Default, Static, and PIC_, and with CodeModel set to Large.

What is the way to prevent R_X86_64_32 relocations from ever appearing in the ELF?

Yuri

Hi Yuri,

   Why do you want to prevent R_X86_64_32 generation? For 32-bit DWARF, I think generation of R_X86_64_32 is reasonable.

   You can check http://dwarfstd.org/doc/DWARF4.pdf.

Because R_X86_64_32 entries are 4-byte addresses, and they can't be relocated to targets in the 64-bit address space above the 32-bit limit.

DWARF actually allows for both a 32-bit and a 64-bit format. The document you mentioned explains this in section 7.5.1.1 on page 143: if the first DWORD of the .debug_info section is 0xffffffff, then it is the 64-bit format.
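For reference, here is a minimal sketch (my own code, not LLVM's) of how a reader distinguishes the two formats from that initial length field; it assumes a little-endian target such as x86_64 and a pointer to the start of a unit in .debug_info:

#include <cstddef>
#include <cstdint>
#include <cstring>

// Sketch: classify a DWARF unit as 32-bit or 64-bit format from its
// initial length field. Assumes `p` points at the start of a unit in
// .debug_info and that the data is little-endian (as on x86_64).
struct InitialLength {
  bool is_dwarf64;      // true if the unit uses the 64-bit DWARF format
  uint64_t unit_length; // unit length, excluding the length field itself
  size_t field_size;    // 4 bytes for DWARF-32, 12 bytes for DWARF-64
};

InitialLength readInitialLength(const uint8_t *p) {
  uint32_t first;
  std::memcpy(&first, p, sizeof(first));
  if (first == 0xffffffffu) {            // escape value: 64-bit format
    uint64_t len;
    std::memcpy(&len, p + 4, sizeof(len));
    return {true, len, 12};
  }
  // 0xfffffff0..0xfffffffe are reserved; anything smaller is DWARF-32.
  return {false, first, 4};
}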

But LLVM for some reason always chooses the 32-bit DWARF format. This is why I asked the question: the resulting relocatable 64-bit objects can't be loaded at addresses above the 32-bit limit.

Yuri

There is no support for DWARF64 yet. As far as I know, there is no way you can get rid of that.

It looks like there are already some discussions on this; you can search Bugzilla:

http://llvm.org/bugs/show_bug.cgi?id=14969
http://llvm.org/bugs/show_bug.cgi?id=15173

I think that you may have confused two separate issues here: the size of the
DWARF tables and the size of addresses on the target system.

The 32-bit DWARF vs. 64-bit DWARF distinction is all about the size of the
tables in the DWARF sections (and hence also the size of the references
between entries in those tables).

So a 32-bit DWARF section can correctly describe code/data addresses that
are 64-bit, and any references to variables/functions will require 64-bit
relocation entries.

However, in this case the DWARF sections are effectively limited in size, and
references between one DWARF entry and another will use 32-bit relocation
entries. It is these 32-bit relocation entries that I think you are seeing in
your image.
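A quick way to check this is to look at which relocation sections the R_X86_64_32 entries actually live in. The following sketch (my own helper, assuming the relocatable object has been mapped into memory and using the standard Linux <elf.h> definitions) just counts them per .rela.debug_* section:

#include <elf.h>     // Linux ELF definitions: Elf64_Ehdr, Elf64_Rela, ...
#include <cstdint>
#include <cstdio>
#include <cstring>

// Sketch: count R_X86_64_32 entries in the .rela.debug_* sections of a
// relocatable object that has been mapped into memory at `base`.
// Error handling and bounds checks are omitted for brevity.
void reportDebugRelocs(const uint8_t *base) {
  auto *ehdr = reinterpret_cast<const Elf64_Ehdr *>(base);
  auto *shdrs = reinterpret_cast<const Elf64_Shdr *>(base + ehdr->e_shoff);
  const char *shstrtab =
      reinterpret_cast<const char *>(base + shdrs[ehdr->e_shstrndx].sh_offset);

  for (unsigned i = 0; i < ehdr->e_shnum; ++i) {
    const Elf64_Shdr &sh = shdrs[i];
    const char *name = shstrtab + sh.sh_name;
    if (sh.sh_type != SHT_RELA || std::strncmp(name, ".rela.debug", 11) != 0)
      continue;

    auto *rela = reinterpret_cast<const Elf64_Rela *>(base + sh.sh_offset);
    size_t count = sh.sh_size / sizeof(Elf64_Rela);
    unsigned narrow = 0;
    for (size_t r = 0; r < count; ++r)
      if (ELF64_R_TYPE(rela[r].r_info) == R_X86_64_32)
        ++narrow;
    std::printf("%s: %u R_X86_64_32 out of %zu relocations\n",
                name, narrow, count);
  }
}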

You only need to go to 64-bit DWARF when your debugging information becomes
too large to fit in 32-bit DWARF tables.

Keith

I am not sure if this is true.
Currently R_X86_64_32 ELF relocations are issued for DWARF-32 debug info sections. This is because the size of an address in DWARF-32 is only 32 bits, according to the above-mentioned specification. Such relocations can't be resolved (without overflow) when the base address is 64-bit and above the 4GB threshold.

So I always have to load such ELFs into the space within the first 4 GB; otherwise the relocation resolver will fail with an overflow.
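To make the failure mode concrete, this is a sketch (hypothetical resolver code, not from any particular loader) of what applying R_X86_64_32 amounts to and where the overflow comes from:

#include <cstdint>
#include <cstring>
#include <stdexcept>

// Sketch of the overflow: applying R_X86_64_32 stores the 64-bit value
// S + A into a 4-byte field, which only works if the result fits in
// 32 bits. `place` is the patched location; the names are mine, not
// taken from any particular loader.
void applyR_X86_64_32(uint8_t *place, uint64_t S, int64_t A) {
  uint64_t value = S + A;
  if (value > UINT32_MAX)
    throw std::runtime_error("R_X86_64_32 overflow: target is above 4 GB");
  uint32_t truncated = static_cast<uint32_t>(value);
  std::memcpy(place, &truncated, sizeof(truncated));
}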

Yuri

The size of an address on the target machine in the 32-bit DWARF format is NOT restricted to 32 bits. The size of an address on the target machine is specified in the DWARF table headers.

For example, in the DWARF-3 specification, section 7.5.1, the unit_length field is used to specify whether the DWARF format is 32-bit or 64-bit (as well as defining the length of the table). However, the address_size field is used to specify the size of an address on the target machine.

So whether the DWARF format is 32-bit or 64-bit is independent of the size of addresses on the target machine, as they are specified by different fields.
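To illustrate, here is a sketch of the compile unit header layout for the 32-bit DWARF format in DWARF versions 2-4 (the field names follow the specification; the little-endian parsing helper is mine):

#include <cstdint>
#include <cstring>

// Sketch: compile unit header layout for the 32-bit DWARF format,
// DWARF versions 2-4, little-endian. unit_length and
// debug_abbrev_offset are 4 bytes because the *format* is 32-bit,
// while address_size independently records the size of a target
// address (8 on x86_64).
struct CompileUnitHeader {
  uint32_t unit_length;         // 32-bit DWARF: a 4-byte length
  uint16_t version;             // DWARF version (2, 3 or 4 assumed here)
  uint32_t debug_abbrev_offset; // 4-byte offset into .debug_abbrev
  uint8_t address_size;         // size of a target address, e.g. 8
};

CompileUnitHeader parseCUHeader(const uint8_t *p) {
  CompileUnitHeader h;
  std::memcpy(&h.unit_length, p, 4);
  std::memcpy(&h.version, p + 4, 2);
  std::memcpy(&h.debug_abbrev_offset, p + 6, 4);
  std::memcpy(&h.address_size, p + 10, 1);
  return h; // with address_size == 8, DW_AT_low_pc etc. hold full 64-bit addresses
}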

It sounds like you are trying to load the DWARF sections into target memory... and if so, I would have to ask why, as they are not normally loaded into target memory.

Keith

References from one DWARF section to another DWARF section will use
R_X86_64_32, but references from a DWARF section to (say) .text or .data
should use 64-bit relocations. This is how DWARF makes the distinction
between the DWARF format (32-bit) and the target-address-size (64-bit).

The DWARF sections normally aren't loaded into process memory, the way
.text and .data are; so whatever "base address" you are using for the
other sections really should not apply to the DWARF sections.

You are right, debug sections aren't normally loaded into memory together with the sections needed for running.

However, I am mostly focusing on lightweight injection of relocatable ELF objects (ET_REL), and loading them into memory as a whole, bypassing the file, is often the easiest way of working with them. When debug info is needed, it has to be loaded into memory by the debugging tools anyway, and it has to be relocated, so for DWARF-32 debug info the debugging tools then have to make sure it is placed in the lowest 4 GB. So having DWARF-32 either forces abandoning the simple monolithic mapping of the ELF at an arbitrary address and treating the debug info separately, or placing the whole ELF in the lowest 4 GB.

So, in cases where it is preferable to load the whole ELF monolithically, the R_X86_64_32 entries stand in the way. I am not sure why LLVM has to use DWARF-32 by default on x86_64.

Yuri

Have you looked at the RuntimeDyld object? It does what you're talking about.

There are definitely some issues related to reading the debug information of objects loaded into memory this way, but they can be handled without suppressing that relocation type. Generally, you need the code that reads the debug information to behave as if all debug sections are at address zero.
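As an illustration of that last point, here is a sketch with hypothetical names (this is not the DWARFContextInMemory API) of resolving cross-references between privately held copies of the debug sections using base address zero, so the 4-byte fields never overflow regardless of where the rest of the object was mapped:

#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Sketch with hypothetical names: keep private copies of the debug
// sections and resolve references between them as if every debug
// section were loaded at address 0, so the 4-byte fields never
// overflow no matter where the executable sections were mapped.
struct InMemoryDebugInfo {
  std::map<std::string, std::vector<uint8_t>> sections; // ".debug_info", ...

  // Patch a 4-byte reference inside `fromSection` so that it refers to
  // `targetOffset` within another debug section; with every debug
  // section treated as based at 0, the value is just the offset.
  void applyDebugReloc(const std::string &fromSection, uint64_t patchOffset,
                       uint64_t targetOffset, int64_t addend) {
    uint64_t value = /*base=*/0 + targetOffset + addend;
    // 32-bit DWARF offsets are below 4 GB, so this cast is safe here.
    uint32_t v32 = static_cast<uint32_t>(value);
    std::memcpy(sections.at(fromSection).data() + patchOffset, &v32,
                sizeof(v32));
  }
};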

Take a look at the DWARFContextInMemory class (in lib/DebugInfo) and see if it does what you need. Even if it doesn't, you can use its handling of in-memory sections as a guide for what you need.

-Andy

I'll bite:
1. It's smaller (not everything is LEB128 in DWARF).
2. It's faster to process as a result.
3. It has almost no disadvantages in practice.

Certainly, LLVM doesn't have to use it (though, as others mentioned, LLVM
doesn't support 64-bit DWARF), but it makes a lot of sense, given the
advantages.

What would the advantages to using dwarf-64 by default be, exactly?

I.e., what problem does it solve besides the above? That seems to be handled
fine by all current debuggers, and even folks with multi-gigabyte debug
info haven't complained.