Is it possible to manually specify .init_array order?

After this discussion I’m working to migrate a project from .ctors to .init_array. My problem is that I need to keep the initialization order identical. FWIW I control the code that walks .init_array during startup.

  • The fragment from D91187 will not work because I’m changing --target so .ctors is no longer emitted. That’s kind of the whole point of this. :)
  • I’ve tried manually sorting .init_array and .rela.init_array within the linker script but neither has any effect.
  • I think what I actually want is the REVERSE linker script directive suggested by @MaskRay in this binutils bug. Then I could flip the order of input objects (whose .init_array sections are each already flipped). But as far as I can tell, nothing like this is implemented in either GNU ld or LLD. If this were in place, I could produce an exactly inverted list and flip my startup logic to walk in the opposite direction.
  1. Is there an alternative technique to achieve this sorting that anyone could suggest?
  2. Are there any plans to implement REVERSE in LLD? Is there any opposition to it being implemented?

What follows is a series of examples that illustrate what I’ve noted above.

Examples

// a.h
struct A { A(); };

// a.cc
#include "a.h"
A::A() {}
A a;

// aa.cc
#include "a.h"
A aa;

// ldscript
(empty)

// build.sh
rm -f a.o aa.o ia.ro
clang++ -g -c a.cc
clang++ -g -c aa.cc
clang++ -g -nostdlib a.o aa.o -o ia.ro -fuse-ld=lld -r -T ldscript
llvm-objdump -s -j.rela.init_array ia.ro
echo ""
llvm-nm --numeric-sort ia.ro | grep "_GLOBAL__"

Baseline

The baseline will initialize a then aa:

$ ./build.sh

ia.ro:  file format elf64-x86-64
Contents of section .rela.init_array:
 0000 00000000 00000000 01000000 05000000  ................
 0010 20000000 00000000 08000000 00000000   ...............
 0020 01000000 05000000 50000000 00000000  ........P.......

0000000000000020 t _GLOBAL__sub_I_a.cc
0000000000000050 t _GLOBAL__sub_I_aa.cc
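
For anyone reading the hex dumps in this thread: each .rela.init_array entry is a 24-byte Elf64_Rela record (r_offset, r_info, r_addend), so the 48 bytes above are two relocations. Here is a minimal C++ sketch that decodes the baseline dump. The names Rela, decode_rela, rela_sym, rela_type and kBaseline are made up for this illustration, and it assumes a little-endian host:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Same field layout as Elf64_Rela: three 8-byte fields, 24 bytes total.
struct Rela {
    std::uint64_t r_offset;  // offset of the slot inside .init_array
    std::uint64_t r_info;    // symbol index (high 32 bits), type (low 32 bits)
    std::int64_t  r_addend;  // constant added to the symbol's value
};

// Copy a raw little-endian dump into Rela records.
// Assumption: the host is little-endian as well (true on x86-64).
std::vector<Rela> decode_rela(const unsigned char* bytes, std::size_t size) {
    assert(size % sizeof(Rela) == 0);
    std::vector<Rela> out(size / sizeof(Rela));
    if (size != 0) std::memcpy(out.data(), bytes, size);
    return out;
}

inline std::uint32_t rela_sym(const Rela& r) {
    return static_cast<std::uint32_t>(r.r_info >> 32);
}

inline std::uint32_t rela_type(const Rela& r) {
    return static_cast<std::uint32_t>(r.r_info & 0xffffffffu);
}

// The 48 bytes of the baseline .rela.init_array dump, transcribed by hand.
const unsigned char kBaseline[48] = {
    0x00, 0, 0, 0, 0, 0, 0, 0,  0x01, 0, 0, 0, 0x05, 0, 0, 0,
    0x20, 0, 0, 0, 0, 0, 0, 0,  0x08, 0, 0, 0, 0,    0, 0, 0,
    0x01, 0, 0, 0, 0x05, 0, 0, 0,  0x50, 0, 0, 0, 0, 0, 0, 0,
};
```

Decoding kBaseline gives two records: slot offset 0 with addend 0x20, and slot offset 8 with addend 0x50. Both have type 1 (R_X86_64_64) and symbol index 5, which is presumably the section symbol for the text section, since the addends match the addresses llvm-nm prints. So the first .init_array slot points at _GLOBAL__sub_I_a.cc and the second at _GLOBAL__sub_I_aa.cc.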

Sort .init_array

// ldscript
SECTIONS {
    .init_array : {
        aa.o:(.init_array)
        a.o:(.init_array)
    }
}

Sorting .init_array has no effect. This is the result regardless of the object order above:

$ ./build.sh

ia.ro:  file format elf64-x86-64
Contents of section .rela.init_array:
 0000 00000000 00000000 01000000 06000000  ................
 0010 20000000 00000000 08000000 00000000   ...............
 0020 01000000 06000000 50000000 00000000  ........P.......

0000000000000020 t _GLOBAL__sub_I_a.cc
0000000000000050 t _GLOBAL__sub_I_aa.cc

Sort .rela.init_array

// ldscript
SECTIONS {
    .rela.init_array : {
        aa.o:(.rela.init_array)
        a.o:(.rela.init_array)
    }
}

Sorting .rela.init_array has no effect. This is the result regardless of the object order above:

$ ./build.sh

ia.ro:  file format elf64-x86-64
Contents of section .rela.init_array:
 0000 00000000 00000000 01000000 05000000  ................
 0010 20000000 00000000 08000000 00000000   ...............
 0020 01000000 05000000 50000000 00000000  ........P.......

0000000000000020 t _GLOBAL__sub_I_a.cc
0000000000000050 t _GLOBAL__sub_I_aa.cc

Change link order of a.o and aa.o

// ldscript
(empty)

// build.sh
[...]
clang++ -g -nostdlib aa.o a.o -o ia.ro -fuse-ld=lld -r -T ldscript
[...]

This is a sanity check. It does change the initialization order, but it’s going to reorder the contents of every other section too (which I don’t want):

$ ./build.sh

ia.ro:  file format elf64-x86-64
Contents of section .rela.init_array:
 0000 00000000 00000000 01000000 05000000  ................
 0010 20000000 00000000 08000000 00000000   ...............
 0020 01000000 05000000 50000000 00000000  ........P.......

0000000000000020 t _GLOBAL__sub_I_aa.cc
0000000000000050 t _GLOBAL__sub_I_a.cc

Thanks!

FYI for anyone who finds this down the road: REVERSE is what I needed.

  • Imagine a.o’s .ctors section contains (1, 2, 3) and b.o’s .ctors section contains (4, 5, 6). When we link a.o b.o we end up with (1, 2, 3, 4, 5, 6)
  • That same code using .init_array will result in an a.o .init_array section of (3, 2, 1) and a b.o .init_array section of (6, 5, 4). When we link a.o b.o we end up with (3, 2, 1, 6, 5, 4)
  • But if we can use REVERSE in the linker script, then .init_array ends up as (6, 5, 4, 3, 2, 1). Perfect! Because I control the code that walks .init_array, I can now just walk in the opposite direction that I used for .ctors and execution order remains identical.
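
The three bullets above can be checked with a small model. This is only a sketch of the reasoning, not linker code: Section, link, link_reversed, walk_backward and demo are names invented here, each integer stands for one constructor, and link_reversed models REVERSE the way it is described in this post (input sections placed in the opposite order, contents of each section left alone).

```cpp
#include <cassert>
#include <vector>

using Section = std::vector<int>;  // one int per constructor entry

// Plain link: concatenate input sections in command-line order.
Section link(const std::vector<Section>& inputs) {
    Section out;
    for (const Section& s : inputs) out.insert(out.end(), s.begin(), s.end());
    return out;
}

// Model of REVERSE: same concatenation, but input sections are taken
// last-to-first. Entries inside each input section are not touched.
Section link_reversed(const std::vector<Section>& inputs) {
    const std::vector<Section> flipped(inputs.rbegin(), inputs.rend());
    return link(flipped);
}

// Order in which a startup routine walking back-to-front runs the entries.
Section walk_backward(const Section& s) {
    return Section(s.rbegin(), s.rend());
}

// Replays the a.o / b.o example from the bullets above.
void demo() {
    const std::vector<Section> ctors_inputs = {Section{1, 2, 3}, Section{4, 5, 6}};
    const std::vector<Section> init_inputs  = {Section{3, 2, 1}, Section{6, 5, 4}};

    const Section ctors    = link(ctors_inputs);          // 1 2 3 4 5 6
    const Section plain    = link(init_inputs);           // 3 2 1 6 5 4
    const Section reversed = link_reversed(init_inputs);  // 6 5 4 3 2 1

    const Section plain_expected = {3, 2, 1, 6, 5, 4};
    assert(plain == plain_expected);

    // Walking one list forward matches walking the other list backward.
    assert(walk_backward(reversed) == ctors);
    assert(reversed == walk_backward(ctors));
}
```

Calling demo() passes all three assertions: whichever direction the old .ctors walker used, walking the REVERSE'd .init_array in the opposite direction runs the constructors in the same order.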

REVERSE was committed at 447aa48b4a02fa9e22fa45b2fb7a85c12df2e6c3.

For .init_array (no suffix), I suggest ld.lld --shuffle-sections=.init_array=-1, as described in the article “.init, .ctors, and .init_array” (MaskRay). All the issues I have found in a large code base are initialization order fiasco bugs. In the article, I have explained how dynamic linking can essentially shuffle the order.

So for “anyone who finds this down the road”, you likely have bugs in your code base :)

REVERSE seems like a good complement to --shuffle-sections.*=-1, can be implemented trivially, and doesn’t look like it can cause a GNU ld compatibility problem (the behavior is clear even if GNU ld hasn’t implemented it yet), so I accepted the patch.