[LLVM LTO] internalize pass

Hi All,

We are exploring LTO and found that the internalize pass is clang's
replacement for whole-program optimisation (-fwhole-program in gcc).
Consider the case below:

define i32 @test() #0 {
entry:
  ret i32 0
}

define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  %call = call i32 @test()
  ret i32 %call
}

*** IR Dump After Internalize Global Symbols ***
; ModuleID = 'ld-temp.o'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define internal i32 @test() #0 {
entry:
  ret i32 0
}

; Function Attrs: nounwind uwtable
define internal i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  %call = call i32 @test()
  ret i32 %call
}

Both function definitions, test() and main(), are marked internal by
the internalize pass.

Our question: we tried to keep the pass from giving these functions
internal linkage, i.e.

$bin/llvm-lto -print-after-all test.o \
    -internalize-public-api-list=test,main -o output.o

$bin/llvm-lto -print-after-all test.o \
    -internalize-public-api-file=file -o output.o
$cat file
test
main

In both cases we had no luck: the functions are still marked internal.

Any suggestions, or is this a bug in the internalize pass?

Thank you
~Umesh

The option you would want is -exported-symbol=main, so that llvm-lto tells
internalize to preserve it. No need to list 'test' since presumably it is
fine to internalize that (and the use in 'main' will prevent it from being
dead code eliminated, which I assume is currently happening to both of
these symbols).
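
To make the reasoning above concrete, here is a toy Python model of "internalize, then drop unreachable internal symbols". It is only a sketch and not LLVM code: the helper names `internalize` and `surviving_symbols` and the dictionary representation are made up for illustration, and the real passes handle far more cases.

```python
# Toy model only: this is not LLVM's implementation or API.

def internalize(linkage, preserved):
    """Return a copy of `linkage` where every symbol not in `preserved` is internal."""
    return {name: (kind if name in preserved else "internal")
            for name, kind in linkage.items()}

def surviving_symbols(linkage, calls):
    """Keep non-internal symbols plus everything reachable from them via `calls`."""
    worklist = [name for name, kind in linkage.items() if kind != "internal"]
    alive = set()
    while worklist:
        name = worklist.pop()
        if name in alive:
            continue
        alive.add(name)
        worklist.extend(calls.get(name, ()))
    return alive

linkage = {"test": "external", "main": "external"}
calls = {"main": ["test"]}  # main calls test, as in the IR above

# Nothing exported: both symbols become internal and nothing keeps them alive.
nothing_preserved = internalize(linkage, preserved=set())

# Only main exported: test is still internalized, but main's call keeps it alive.
main_preserved = internalize(linkage, preserved={"main"})
```

In this model, calling `surviving_symbols(main_preserved, calls)` keeps both functions even though only `main` was listed, which is why `test` should not need to be exported.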

Note that on some platforms symbol names get lowered differently.

E.g., on Darwin, the correct usage is `-exported-symbol=_main`.

This is something I find annoying; is there a good reason for that?

I mean the "API" could be defined in a platform independent way so that the conversion happens automatically (or it is the responsibility of the caller to do it).
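
As a rough sketch of what such an automatic conversion could look like, the Python below maps between IR-level names and linker-level names. Everything here is hypothetical (the `GLOBAL_PREFIX` table and both helper functions are invented for illustration and are not part of libLTO); the only facts assumed are that Mach-O on Darwin prepends `_` to global symbols and that ELF on x86_64 Linux does not.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of libLTO.

# Global symbol prefix per object format. Mach-O (Darwin) prepends "_";
# ELF (e.g. x86_64 Linux) uses the name unchanged.
GLOBAL_PREFIX = {"elf": "", "macho": "_"}

def to_linker_name(ir_name, object_format):
    """Lower an IR-level name to the name the linker sees."""
    return GLOBAL_PREFIX[object_format] + ir_name

def to_ir_name(linker_name, object_format):
    """Strip the platform prefix, if any, to recover the IR-level name."""
    prefix = GLOBAL_PREFIX[object_format]
    if prefix and linker_name.startswith(prefix):
        return linker_name[len(prefix):]
    return linker_name
```

With a helper like `to_linker_name("main", "macho")`, a caller could pass `main` on every platform and leave the `_main` spelling to the tool.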

llvm-lto is meant to test the internals of libLTO. This tests the
code path for exported symbols being passed in by the linker. The
linker on Darwin will pass symbols in using `_`.

I don't personally feel too strongly about how we test that code path,
so if you have a better idea...

I was rather thinking about designing the libLTO API in a way that the linker would have to strip the _ when passing symbols to preserve.
I know we can't change the behavior of the existing API, but this is probably something I'd be careful about with a callback solution.