Sorting relocation entries

What would be the best way to sort relocation entries before they are
written out in ELFObjectWriter::WriteRelocationsFragment?

According to the Mips ABI documents I have, there are certain
restrictions on the order relocations appear in the table (e.g.
R_MIPS_HI16 and R_MIPS_GOT16 must be followed immediately by a
R_MIPS_LO16). When I enable post RA scheduling, some of the
restrictions are violated in the generated object code, which results
in incorrect relocation values generated by the linker.

I am considering imitating what gas does in function mips_frob_file
(line 15522 of tc-mips.c) to fix this problem:

http://repo.or.cz/w/binutils.git/blob/master:/gas/config/tc-mips.c

Are there any other targets that have similar restrictions or requirements?

Hi Akira,

If I follow correctly, the relocation entries can thus be in a different order than the instructions that they're for? That seems a bit odd, but I suppose there's nothing inherently wrong with that. It's just not something, AFAIK, that llvm has had to deal with before. This should definitely be a target-specific thing, not a general ELFObjectWriter thing, as other targets may have entirely different needs. Offhand, it seems reasonable to have a post-processing pass over the relocation list before it's written out to the file. The target can manipulate the list in whatever manner it needs to. A hook on MCELFObjectTargetWriter should do the trick.

-Jim

Hi Jim,

Yes, the relocation entries have to be reordered so that the
got16/lo16 or hi16/lo16 pairs appear consecutively in the relocation
table. As a result, relocations can appear in a different order than
the instructions that they're for.

For example, in this code, the post-RA scheduler inserts an
instruction with relocation %got(body_ok) between %got(scope_top) and
%lo(scope_top).

$ cat z29.s
  lw $3, %got(scope_top)($gp)
  lw $2, %got(body_ok)($gp)
  lw $3, %lo(scope_top)($3)
  addiu $2, $2, %lo(body_ok)

This is the assembled program generated by gas:
$ mips-linux-gnu-objdump -dr z29.gas.o

     748: 8f830000 lw v1,0(gp)
                        748: R_MIPS_GOT16 .bss
     74c: 8f820000 lw v0,0(gp)
                        74c: R_MIPS_GOT16 .bss
     750: 8c630000 lw v1,0(v1)
                        750: R_MIPS_LO16 .bss
     754: 244245d4 addiu v0,v0,17876
                        754: R_MIPS_LO16 .bss

gas reorders these relocations with the function in the following link:

http://repo.or.cz/w/binutils.git/blob/master:/gas/config/tc-mips.c#l15222

$ mips--linux-gnu-readelf -r z29.gas.o

Relocation section '.rel.text' at offset 0x4584 contains 705 entries:
Offset Info Type Sym.Value Sym. Name
...
00000748 00000409 R_MIPS_GOT16 00000000 .bss // %got(scope_top)
00000750 00000406 R_MIPS_LO16 00000000 .bss // %lo(scope_top)
0000074c 00000409 R_MIPS_GOT16 00000000 .bss // %got(body_ok)
00000754 00000406 R_MIPS_LO16 00000000 .bss // %lo(body_ok)

The attached patch makes the following changes to make direct object
emitter write out relocations in the correct order:

1. Add a target hook MCELFObjectTargetWriter::ReorderRelocs. The
default behavior sorts the relocations by the r_offset.
2. Move struct ELFRelocationEntry from ELFObjectWriter to
MCELFObjectTargetWriter and add member fixup to it. The overridden
version of ReorderRelocs (MipsELFObjectWriter::ReorderRelocs) needs
access to ELFRelocationEntry::Type and MCFixup::Value to reorder the
relocations.

Do you think these changes are acceptable?

Here is the patch.

reloc.patch (4.67 KB)

Hi Akira,

This is looking good. Some specific comments on the details below.

Thanks!
  Jim

diff --git a/include/llvm/MC/MCELFObjectWriter.h b/include/llvm/MC/MCELFObjectWriter.h
index 6e9f5d8..220ecd0 100644
--- a/include/llvm/MC/MCELFObjectWriter.h
+++ b/include/llvm/MC/MCELFObjectWriter.h
@@ -13,6 +13,7 @@
#include "llvm/MC/MCObjectWriter.h"
#include "llvm/Support/DataTypes.h"
#include "llvm/Support/ELF.h"
+#include <vector>

namespace llvm {
class MCELFObjectTargetWriter {
@@ -27,6 +28,33 @@ protected:
                           uint16_t EMachine_, bool HasRelocationAddend_);

public:
+ /// @name Relocation Data
+ /// @{
+
+ struct ELFRelocationEntry {
+ // Make these big enough for both 32-bit and 64-bit
+ uint64_t r_offset;
+ int Index;
+ unsigned Type;
+ const MCSymbol *Symbol;
+ uint64_t r_addend;
+ const MCFixup *fixup;
+
+ ELFRelocationEntry()
+ : r_offset(0), Index(0), Type(0), Symbol(0), r_addend(0), fixup(0) {}
+
+ ELFRelocationEntry(uint64_t RelocOffset, int Idx,
+ unsigned RelType, const MCSymbol *Sym,
+ uint64_t Addend, const MCFixup *Fixup)
+ : r_offset(RelocOffset), Index(Idx), Type(RelType),
+ Symbol(Sym), r_addend(Addend), fixup(Fixup) {}
+
+ // Support lexicographic sorting.
+ bool operator<(const ELFRelocationEntry &RE) const {
+ return RE.r_offset < r_offset;
+ }
+ };
+

I don't think this really belongs to the MCELFObjectTargetWriter class, per se. I suggest moving it outside of the class definition.

   static uint8_t getOSABI(Triple::OSType OSType) {
     switch (OSType) {
       case Triple::FreeBSD:
@@ -52,6 +80,8 @@ public:
   virtual void adjustFixupOffset(const MCFixup &Fixup,
                                  uint64_t &RelocOffset);

+ virtual void ReorderRelocs(const MCAssembler &Asm,

s/ReorderRelocs/reorderRelocs/. Function names start w/ a lower case letter. Personally, I prefer naming the prefix "sort" rather than "reorder", as it's a bit more descriptive, but not a big deal either way.

+ std::vector<ELFRelocationEntry>& Relocs);

The '&' binds to the identifier, not the type name, and should be formatted as such. I.e., space before the '&' and no space between it and "Relocs".

   /// @name Accessors
   /// @{
diff --git a/lib/MC/ELFObjectWriter.cpp b/lib/MC/ELFObjectWriter.cpp
index 36f94b4..093eb07 100644
--- a/lib/MC/ELFObjectWriter.cpp
+++ b/lib/MC/ELFObjectWriter.cpp
@@ -84,31 +84,7 @@ class ELFObjectWriter : public MCObjectWriter {
       }
     };

- /// @name Relocation Data
- /// @{
-
- struct ELFRelocationEntry {
- // Make these big enough for both 32-bit and 64-bit
- uint64_t r_offset;
- int Index;
- unsigned Type;
- const MCSymbol *Symbol;
- uint64_t r_addend;
-
- ELFRelocationEntry()
- : r_offset(0), Index(0), Type(0), Symbol(0), r_addend(0) {}
-
- ELFRelocationEntry(uint64_t RelocOffset, int Idx,
- unsigned RelType, const MCSymbol *Sym,
- uint64_t Addend)
- : r_offset(RelocOffset), Index(Idx), Type(RelType),
- Symbol(Sym), r_addend(Addend) {}
-
- // Support lexicographic sorting.
- bool operator<(const ELFRelocationEntry &RE) const {
- return RE.r_offset < r_offset;
- }
- };
+ typedef MCELFObjectTargetWriter::ELFRelocationEntry ELFRelocationEntry;

Scoping operators shouldn't be typedefed away. Spell it out explicitly when the type is referenced. It makes the code clearer, though a bit more verbose. That said, with the above tweak to move the relocation type out to the top level, there shouldn't need to be any explicit scope resolution.

     /// The target specific ELF writer instance.
     llvm::OwningPtr<MCELFObjectTargetWriter> TargetObjectWriter;
@@ -786,7 +762,7 @@ void ELFObjectWriter::RecordRelocation(const MCAssembler &Asm,
   else
     assert(isInt<32>(Addend));

- ELFRelocationEntry ERE(RelocOffset, Index, Type, RelocSymbol, Addend);
+ ELFRelocationEntry ERE(RelocOffset, Index, Type, RelocSymbol, Addend, &Fixup);
   Relocations[Fragment->getParent()].push_back(ERE);
}

@@ -1072,8 +1048,7 @@ void ELFObjectWriter::WriteRelocationsFragment(const MCAssembler &Asm,
                                                MCDataFragment *F,
                                                const MCSectionData *SD) {
   std::vector<ELFRelocationEntry> &Relocs = Relocations[SD];
- // sort by the r_offset just like gnu as does
- array_pod_sort(Relocs.begin(), Relocs.end());
+ TargetObjectWriter->ReorderRelocs(Asm, Relocs);

Please add a comment explaining a bit. Nothing elaborate, just something along the lines of, "Sort the relocation entries. Most targets just sort by r_offset, but some (e.g., MIPS) have additional constraints."

   for (unsigned i = 0, e = Relocs.size(); i != e; ++i) {
     ELFRelocationEntry entry = Relocs[e - i - 1];
diff --git a/lib/MC/MCELFObjectTargetWriter.cpp b/lib/MC/MCELFObjectTargetWriter.cpp
index 15bf476..4f3e3b2 100644
--- a/lib/MC/MCELFObjectTargetWriter.cpp
+++ b/lib/MC/MCELFObjectTargetWriter.cpp
@@ -7,6 +7,7 @@
//
//===----------------------------------------------------------------------===//

+#include "llvm/ADT/STLExtras.h"

Since we're moving the sort here from ELFObjectWriter.cpp, it may be possible to remove the STLExtras.h include from the latter. Please check and see.

#include "llvm/MC/MCELFObjectWriter.h"

using namespace llvm;
@@ -36,3 +37,10 @@ const MCSymbol *MCELFObjectTargetWriter::ExplicitRelSym(const MCAssembler &Asm,
void MCELFObjectTargetWriter::adjustFixupOffset(const MCFixup &Fixup,
                                                 uint64_t &RelocOffset) {
}
+
+void
+MCELFObjectTargetWriter::ReorderRelocs(const MCAssembler &Asm,
+ std::vector<ELFRelocationEntry>& Relocs) {

'&' binding thing again.

+ //

Not original with you, but since we're in here anyway, this should be a well-formed sentence: "Sort by the r_offset, just like gnu as does."

+ array_pod_sort(Relocs.begin(), Relocs.end());

Trailing whitespace.

Hi Jim,

Thanks for reviewing the patch.

I couldn't get rid of STLExtras.h, but other than that, I followed all
the suggestions in your email.
Please let me know if you notice anything else that needs fixing.

reloc2.patch (4.48 KB)

Hi Akira,

Just two very minor things that I missed the first time around.
1. The 'fixup" member of ELFRelocation entry should be "Fixup" instead.
2. Since we're always passing in a non-NULL fixup, that should probably be a reference, not a pointer.

Good for commit with those tweaks.

Thanks!

-Jim

Thanks, committed in r153345.
Please let me know if there is anything else that needs fixing.