Bug in opt

I have a problem.

I'm writing a C compiler in my favorite programming language (don't ask :slight_smile:).

I have made a .s file, which can be correctly assembled
and run with lli. But when I optimize it I get no errors
from the optimizer, but the resultant file is incorrect.

Here's what happens:

llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced

lli test2_gen.s.bc
n=887459712 %% no errors

opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc

%% no errors
%% But now the generated file cannot be disassembled or run

lli test2_opt.s.bc
lli: error loading program 'test2_opt.s.bc': Bitcode stream should be a multiple of 4 bytes in length
llvm-dis test2_opt.s.bc
llvm-dis: Bitcode stream should be a multiple of 4 bytes in length

The generated .s file is as follows:

; ----- start
; Compiled by the amazing Ericsson C->LLVM compiler
; Hand crafted in Erlang
; ModuleID = 'test2.c'

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
target triple = "i386-pc-linux-gnu"

; Sock it to me baby

;; globals
declare i32 @printf(i8* , ...)

@main.str1 = private constant [6 x i8] c"n=%i\0A\00"

;; code
define i32 @main() nounwind {
  ;;return register

  %tmp_1 = alloca i32 ,align 4
  %i = alloca i32 ,align 4
  %max = alloca i32 ,align 4
  %n = alloca i32 ,align 4
  %tmp_2 = add i32 0,0
  store i32 %tmp_2 ,i32* %i
  %tmp_3 = add i32 0,100000000
  store i32 %tmp_3 ,i32* %max
  %tmp_4 = add i32 0,0
  store i32 %tmp_4 ,i32* %n
  br label %initfor_1

initfor_1:
  %tmp_5 = add i32 0,0
  store i32 %tmp_5 ,i32* %i
  br label %testfor_3

updatefor_2:
  %tmp_6 = load i32* %i
  %tmp_7 = add i32 0,1
  %tmp_8 = add i32 %tmp_6 ,%tmp_7
  store i32 %tmp_8 ,i32* %i
  br label %testfor_3

testfor_3:
  %tmp_9 = load i32* %i
  %tmp_10 = load i32* %max
  %tmp_11 = icmp slt i32 %tmp_9 ,%tmp_10
  br i1 %tmp_11 ,label %bodyfor_4,label %endfor_5

bodyfor_4:
  %tmp_12 = load i32* %n
  %tmp_13 = load i32* %i
  %tmp_14 = add i32 %tmp_12 ,%tmp_13
  store i32 %tmp_14 ,i32* %n
  br label %updatefor_2

endfor_5:
  %tmp_15 = getelementptr [6 x i8]* @main.str1, i32 0, i32 0
  %tmp_16 = load i32* %n
  %tmp_17 = call i32 (i8* , ...)* @printf(i8* %tmp_15 , i32 %tmp_16 )
  %tmp_18 = add i32 0,0
  ret i32 %tmp_18
}

The C code was as follows:

int printf(const char * format, ...);

int main()
{
  int i=0, max=100000000,n=0;
  for(i = 0; i < max; i = i + 1){
    n = n + i;
    }
  printf("n=%i\n", n);
  return(0);
}

/Joe

Hi Joe,

I have made a .s file, which can be correctly assembled
and run with lli. But when I optimize it I get no errors
from the optimizer, but the resultant file is incorrect.

Here's what happens:

llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced

there's actually no need to assemble this to bitcode: you can pass
test2_gen.s directly to opt. At least you can in recent versions of
LLVM.

opt -std-compile-opts -S test2_gen.s.bc> test2_opt.s.bc

By using -S you ask opt to produce human readable IR rather than
bitcode, so you should really output to test2_opt.s.

%% no errors
%% But now the generated file cannot be disassembled or run

lli test2_opt.s.bc
lli: error loading program 'test2_opt.s.bc': Bitcode stream should be a multiple of 4 bytes in length

This means that it doesn't contain bitcode. And indeed it doesn't, it
contains human readable IR due to your using -S above.

llvm-dis test2_opt.s.bc
llvm-dis: Bitcode stream should be a multiple of 4 bytes in length

Same problem.

That said, in the latest LLVM, lli accepts human readable IR as well as bitcode,
so I'm guessing that you are using an older version that does not have this
feature. Of course I may also have misdiagnosed the problem :slight_smile:
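
One quick way to check a diagnosis like this is to look at the first four bytes of the file: raw LLVM bitcode starts with the magic bytes 'B', 'C', 0xC0, 0xDE, whereas human readable IR is plain text. The following is only a sketch. The function name is_bitcode is made up for illustration, and it does not recognise the bitcode wrapper header that some platforms put in front of the stream.

```shell
# is_bitcode FILE (hypothetical helper, not an LLVM tool):
# exit status 0 if FILE starts with the raw bitcode magic 'B' 'C' 0xC0 0xDE.
is_bitcode() {
    # Dump the first four bytes as hex and strip the spaces and newline od adds.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "4243c0de" ]
}
```

Something like `is_bitcode test2_opt.s.bc || echo "not bitcode"` should then flag a file written by opt -S, since that file holds text despite its .bc suffix.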

Ciao, Duncan.

You have produced .ll, not .bc. This is due to the -S flag.

Hello

opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc

I believe -S will yield the text output.

Silly me I didn't think to look - you're right.

This is very cool - my C compiler spits out lousy code, but "opt'ing" it
produces more or less what an optimising C compiler would have spat out.

Which means that language interoperability becomes really easy - just
parse and de-sugar the input (from any language) and you're away.

Thanks for your help

/Joe

We normally reserve the .s suffix for native assembly files and use .ll for LLVM IR assembly. The .bc suffix implies binary bitcode.

llvm-as: .ll -> .bc
llvm-dis: .bc -> .ll
llc: .ll/.bc -> .s

I don't think the tools require these suffixes, but it helps avoid confusion.
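
As a rough illustration of those suffixes applied to the test2 example from this thread, the sketch below prints (rather than runs) one possible command sequence. The helper name show_pipeline and the file names are assumptions made for the example. -std-compile-opts is simply the flag used earlier in the thread and may not exist in other LLVM versions.

```shell
# show_pipeline BASE (hypothetical helper): print the commands that would take
# BASE.ll through assembling, optimizing, disassembling and native code
# generation, so each file's suffix matches its contents.
# Order: llvm-as (.ll -> .bc), opt (.bc -> .bc), llvm-dis (.bc -> .ll), llc (.bc -> .s)
show_pipeline() {
    base=$1
    printf '%s\n' \
        "llvm-as $base.ll -o $base.bc" \
        "opt -std-compile-opts $base.bc -o ${base}_opt.bc" \
        "llvm-dis ${base}_opt.bc -o ${base}_opt.ll" \
        "llc ${base}_opt.bc -o ${base}_opt.s"
}

show_pipeline test2
```

Without -S, opt writes bitcode, so the _opt.bc name is accurate and lli or llvm-dis can read the result. Once the printed commands look right for your LLVM version, they can be run by hand or piped to sh.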

/jakob