Create "appending" section that can be partially dead stripped

Hi,

Is there a way in LLVM IR to emit multiple data elements within a single compilation unit that
a) are guaranteed to appear sequentially in the final binary (in the order they appear in the LLVM IR), and
b) will be removed individually by the optimizers and/or linker if they are not referenced from anywhere?

Defining them as "appending" puts them all into a single section definition, even when compiling with -fdata-sections, so as soon as one of the symbols in that section is live, all of them will remain live.

Example:

What happens if you drop appending linkage? I think it will just work, since you are already using a custom section, which will ensure that all the data appears contiguously in memory.

Although, I do worry about what LLVM’s alias analysis will say about this… I don’t think LLVM allows GEPing from one global to another, and at the end of the day, you’ll GEP from an external global representing the section start through to the elements of the array to section end.

> What happens if you drop appending linkage? I think it will just work,
> since you are already using a custom section, which will ensure that all
> the data appears contiguously in memory.

Thanks for the suggestion, but it still puts everything in a single .section statement.

> Although, I do worry about what LLVM's alias analysis will say about
> this... I don't think LLVM allows GEPing from one global to another, and
> at the end of the day, you'll GEP from an external global representing
> the section start through to the elements of the array to section end.

That's true and an interesting point. However, wouldn't that mean that "appending" linkage and "section" globals in general are completely unusable from llvm IR and would only be safely usable from inline assembler or code not compiled by llvm? (unless you don't access such data as a contiguous block, but I guess that doesn't happen very often)

Jonas

Try giving the globals linkonce_odr linkage instead of external linkage
manually? This is essentially the effect of -fdata-sections, except it
happens later during codegen.

That one indeed works, thanks! (provided I process the .ll file with optimisations enabled).

Regarding the question of whether the optimisers and analyses will deal correctly with my iterating over all of the elements in such a section: at first sight, the test below looks fine when processed with full optimisations. That is of course no proof that there won't be problems in other cases.

However, I'm not sure how to add a symbol without data at the start and end of the section. Adding a [0 x i32]-typed symbol still inserts a byte (see @arrstart/@arrstop below). I can't use alias declarations for the first and last element, since then the first and last element would always be considered live.

Jonas

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

define i32 @main() {
Entry:
  %sumvar = alloca i32

; make the second and third element live by referring to them directly
  %eleptr1 = getelementptr [2 x i32]* @arr2, i64 0, i32 0
  %eleptr2 = getelementptr [2 x i32]* @arr3, i64 0, i32 0
  %ele1 = load i32* %eleptr1
  %ele2 = load i32* %eleptr2
  %sum1 = add i32 %ele1, %ele2
  store i32 %sum1, i32* %sumvar

; now loop over the entire array by using @arrstart and @arrstop
  %loopstop = ptrtoint [0 x i32]* @arrstop to i64
  %loopstart = ptrtoint [0 x i32]* @arrstart to i64
  %loopcount = sub i64 %loopstop, %loopstart
  %looparrinit = bitcast [0 x i32]* @arrstart to i32*
  br label %LoopStart

; sum all elements in the array
LoopStart:
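  %looparr = phi i32* [%looparrinit, %Entry], [%looparrnext, %LoopBody]
; stop once the cursor reaches @arrstop; %loopcount never changes, so
; comparing it against 0 could not terminate the loop
  %looparrint = ptrtoint i32* %looparr to i64
  %loopcond = icmp eq i64 %looparrint, %loopstop
  br i1 %loopcond, label %LoopEnd, label %LoopBody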
LoopBody:
  %val = load i32* %looparr
  %sum2 = load i32* %sumvar
  %sum3 = add i32 %val, %sum2
  store i32 %sum3, i32* %sumvar
  %looparrnext = getelementptr i32* %looparr, i64 1
  br label %LoopStart

; return the sum
LoopEnd:
  %retval = load i32* %sumvar
  ret i32 %retval
}

; this declaration inserts a 0 byte
@arrstart = global [0 x i32] [], section "mytest"

@arr1 = linkonce_odr global [2 x i32] [
   i32 1,
   i32 2
], section "mytest"

@arr2 = linkonce_odr global [2 x i32] [
   i32 3,
   i32 4
], section "mytest"

@arr3 = linkonce_odr global [2 x i32] [
   i32 5,
   i32 6
], section "mytest"

; this declaration inserts a 0 byte
@arrstop = global [0 x i32] [], section "mytest"

I think you want to declare those [0 x i32] globals instead of defining
them by adding extern and dropping the empty brackets. Typically they are
defined by the linker, and not every TU.

A good starting point for this kind of stuff is to look at what Clang does
on C code like this:

$ cat t.cpp
extern "C" int printf(const char *, ...);
extern int __start_my_section[];
extern int __stop_my_section[];
int __attribute__((section("my_section"))) a = 1;
int __attribute__((section("my_section"))) b = 2;
int main() {
  for (int *i = &__start_my_section[0], *e = &__stop_my_section[0]; i != e;
       ++i) {
    printf("%d ", *i);
  }
  printf("\n");
}

$ clang -cc1 t.cpp -emit-llvm -o -
... // omitted

Speaking of which, I believe both ld.bfd and ld.gold define
__start_my_section for you, which should make your life easier.
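
As a rough, self-contained sketch of the linker-defined symbol approach (assuming an ELF target linked with GNU ld, gold, or lld, and a section name that is a valid C identifier), something like the following should work. The names entry_a, entry_b, sum_my_section and count_my_section are made up for illustration; only the __start_/__stop_ prefix convention comes from the linker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the linker synthesizes these two symbols for "my_section".
// They are declared here, never defined in any translation unit.
extern "C" {
extern int __start_my_section[];
extern int __stop_my_section[];
}

// Hypothetical entries. `used` keeps the compiler from discarding them even
// though nothing refers to them by name.
__attribute__((used, section("my_section"))) int entry_a = 1;
__attribute__((used, section("my_section"))) int entry_b = 2;

// Number of int-sized entries the linker placed in the section. The
// subtraction is done on integers because the two symbols are distinct
// objects as far as the language is concerned.
std::size_t count_my_section() {
  std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(__start_my_section);
  std::uintptr_t end = reinterpret_cast<std::uintptr_t>(__stop_my_section);
  assert((end - begin) % sizeof(int) == 0);
  return (end - begin) / sizeof(int);
}

// Walk the section from start to stop and add up every entry.
int sum_my_section() {
  int sum = 0;
  for (int *i = __start_my_section, *e = __stop_my_section; i != e; ++i)
    sum += *i;
  return sum;
}
```

Calling sum_my_section() from main() visits every entry the linker collected, regardless of which translation unit contributed it. This does not by itself address the per-element dead stripping question, since `used` deliberately keeps every entry alive.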

[section start/stop symbols]

> I think you want to declare those [0 x i32] globals instead of defining
> them by adding extern and dropping the empty brackets. Typically they
> are defined by the linker, and not every TU.

In my case, the sections will be unique per compilation unit (translation unit?). I'm trying to add the start and stop symbols myself because at least on OS X, the linker doesn't appear to define any such symbols. Its man page mentions an option called "-sectobjectsymbols", but it's listed under the obsolete options (on OS X 10.9) and has as explanation "Adding a local label at a section start is no longer supported. This option is obsolete."

[C example]

> Speaking of which, I believe both ld.bfd and ld.gold define
> __start_my_section for you, which should make your life easier.

For platforms using the GNU binutils, indeed. My original example was for Linux/x86-64, but if possible I'd like a solution that does not depend on the platform or the linker used (other than in how sections are defined, since e.g. on OS X you also have to specify a segment name). Maybe the easiest option is simply to add dummy sentinels deliberately. Or maybe I can add the start and stop labels via module-level inline assembly...

Thanks,

Jonas