global question

Below is some C code and its LLVM translation (using a recent rev). The compiler goes ahead and loads from x in main, but it seems clear that a constant propagation pass could have inferred that x is 0. In fact that is what happens if x is static, but in this case I cannot see how external linkage of x would invalidate that optimization.

This is one of those optimizations that would not be an issue for SPEC or whatever, but probably makes a difference for smallish embedded codes that tend to have a lot of code in main().

Thanks,

John Regehr

int x;

int main (void)
{
   return x;
}

regehr@john-home:~$ llvm-gcc -O2 -S --emit-llvm test.c -o -

; ModuleID = 'test.c'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-linux-gnu"
@x = common global i32 0 ; <i32*> [#uses=1]

define i32 @main() nounwind readonly {
entry:
  %0 = load i32* @x, align 4 ; <i32> [#uses=1]
  ret i32 %0
}

The only problem is that you could have another translation unit, and it could be in C++, and it could have a global object with a constructor that runs, and that constructor might set the value of x to 1. So inferring that the value is 0 isn't possible in that case. Now, if you use -fwhole-program and/or tell the optimizer that the only export is main, it might have a chance at doing what you want.

Hi John,

Below is some C code and its LLVM translation (using a recent rev). The
compiler goes ahead and loads from x in main, but it seems clear that a
constant propagation pass could have inferred that x is 0. In fact that
is what happens if x is static, but in this case I cannot see how
external linkage of x would invalidate that optimization.

int x;

int main (void)
{
  return x;
}

Please correct me if I'm wrong, but how can the compiler know that x is not initialized in another file which defines x as extern? It can only be sure when x is declared static.

So, for me the LLVM-code is right.

(I even remember someone saying that it is platform specific whether x is initialized to 0 or not... or was it a question of C89 or C99?)

regards,
Patrick.

Please correct me if I'm wrong, but how can the compiler know that x is not initialized in another file which defines x as extern? It can only be
sure when x is declared static.

I don't think you can attach an initializer to an extern except at the point where the variable is defined. But since x is defined here, there would then be two definitions of x, a link-time error.

(I even remember someone saying that it is platform specific whether x is initialized to 0 or not... or was it a question of C89 or C99?)

Definitely initialized to zero.

John

The only problem is that you could have another translation unit, and
it could be in C++, and it could have a global object with a
constructor that runs, and that constructor might set the value of x
to 1.

Thanks. That is ugly.

John

Please correct me if I'm wrong, but how can the compiler know that x is
not initialized in another file which defines x as extern? It can only be
sure when x is declared static.

That's technically illegal in pure standard C, but yes, another file
with "int x = 10;" would override the definition in gcc.

(I even remember someone saying that it is platform specific whether x is
initialized to 0 or not... or was it a question of C89 or C99?)

It's definitely initialized to zero; it's what the standard calls a
tentative definition. See C99 6.9.2p2.

-Eli

If that is the entire program, the value will be 0, and that is not platform specific, save for the fact that in ancient history there were buggy systems.

FWIW

-------- main.c ----------

#include <stdio.h>

int x;

int main(int argc, char **argv) {
    fprintf(stderr, "x is: %d\n", x);
    return 0;
}

---------- init.c ---------

extern int x;

__attribute__((constructor))
static void __initialize() {
    x = 12;
}

[MacBook:~/Desktop]% gcc -c init.c
[MacBook:~/Desktop]% gcc -c main.c
[MacBook:~/Desktop]% gcc -o test init.o main.o
[MacBook:~/Desktop]% ./test

x is: 12

Strictly, that's true, but gcc supports it as an extension. Variables
like "int x;" are given common linkage, which allows them to be
overridden.

-Eli

Using the "opt -internalize" pass will mark everything internal (aka static) if there is a main defined in the .bc file. If this is useful for your project, it should be easy to do. Another (better) option is to use LTO with an export map that says you only export main.

-Chris

Or just have init.c say
int x = 12;

(Works with most C compilers.)