General strategy to optimize LLVM IR

Hi,

Our DSL emits sub-optimal LLVM IR that we optimize later on (LLVM IR ==> LLVM IR) before dynamically compiling it with the JIT. We would like to simply follow what clang/clang++ does when compiling with the -O1/-O2/-O3 options. Our strategy up to now was to look at the opt.cpp code and take parts of it in order to implement our optimization code.

It appears to be rather difficult to follow the evolution of the LLVM IR optimization strategies. With LLVM 3.3, our optimization code no longer produces code as fast as the code produced with clang -O3. Moreover, the new vectorization passes are still not working.

Is there a recommended way to apply -O1/-O2/-O3 style optimizations to LLVM IR code? Is there any code to look at besides the opt.cpp tool?

Thanks.

Stéphane Letz

I'm not /entirely/ sure what you're asking. It sounds like you're
asking "what passes should my compiler's -O1/2/3 flags correspond to",
and one answer to that is to look at Clang (I think Clang's pass list is
different from opt/llc's, maybe).

PassManagerBuilder decides what passes to run. Unfortunately, the clang driver uses a back door to set a bunch of flags that configure PassManagerBuilder. See EmitAssemblyHelper::CreatePasses. I find this extremely difficult to follow and don’t know of any way to derive an equivalent “opt” command line. Good luck.

-Andy

Hi Stéphane,

the list of passes (and the flags that can be used to tweak it) is in
   lib/Transforms/IPO/PassManagerBuilder.cpp
You can use the PassManagerBuilder to create your own pass list. However,
this is not enough to get good optimization. Some more things are needed:

   1) You must add DataLayout info to the module (using setDataLayout). For
the vectorizer to do anything, I think you are also obliged to add a target
triple (using setTargetTriple);
   2) In order to get vectorization, you also have to add target-specific
analysis passes using addAnalysisPasses (see TargetMachine).
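
The steps above could be sketched roughly as follows. This is only an
illustration modelled on what opt.cpp does in LLVM 3.3, not a definitive
recipe: the function name optimizeModule is hypothetical, the TargetMachine
is assumed to have been created elsewhere for the host, and the header
paths and PassManagerBuilder fields are the ones I believe LLVM 3.3 uses,
so check them against your tree.

```cpp
// Sketch only: assumes LLVM 3.3 headers and the legacy pass manager.
#include "llvm/ADT/Triple.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Module.h"
#include "llvm/PassManager.h"
#include "llvm/Target/TargetLibraryInfo.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"

// Hypothetical helper: runs an -O<OptLevel> style pipeline over M.
// TM is assumed to be a valid TargetMachine for the host.
void optimizeModule(llvm::Module *M, llvm::TargetMachine *TM,
                    unsigned OptLevel) {
  // 1) Give the module a data layout and a target triple.
  if (const llvm::DataLayout *DL = TM->getDataLayout())
    M->setDataLayout(DL->getStringRepresentation());
  M->setTargetTriple(TM->getTargetTriple());

  llvm::PassManager MPM;
  llvm::FunctionPassManager FPM(M);

  // Library and data layout information, as opt.cpp adds them.
  MPM.add(new llvm::TargetLibraryInfo(llvm::Triple(M->getTargetTriple())));
  MPM.add(new llvm::DataLayout(M->getDataLayout()));
  FPM.add(new llvm::DataLayout(M->getDataLayout()));

  // 2) Target-specific analysis passes, needed by the vectorizer.
  TM->addAnalysisPasses(MPM);
  TM->addAnalysisPasses(FPM);

  // Let PassManagerBuilder pick the pass list for the requested level.
  llvm::PassManagerBuilder Builder;
  Builder.OptLevel = OptLevel;
  if (OptLevel > 1)
    Builder.Inliner = llvm::createFunctionInliningPass();
  // Assumption: this field requests loop vectorization; the builder is
  // expected to honour it only at the highest optimization level.
  Builder.LoopVectorize = true;
  Builder.populateFunctionPassManager(FPM);
  Builder.populateModulePassManager(MPM);

  // Per-function passes first, then the module-level pipeline.
  FPM.doInitialization();
  for (llvm::Module::iterator F = M->begin(), E = M->end(); F != E; ++F)
    FPM.run(*F);
  FPM.doFinalization();
  MPM.run(*M);
}
```

Calling optimizeModule(module, targetMachine, 3) before handing the module
to the JIT would then stand in for the code copied from opt.cpp.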

Ciao, Duncan.

After taking code from the LLVM 3.3 opt.cpp tool, the LLVM IR optimizations now produce correctly optimized code (checked by comparing with what clang -O3 -emit-llvm and opt -O3 give).

The LLVM IR is then given to the JIT, but we now see a speed regression compared to what we had with LLVM 3.1. We measure this by comparing what clang -O3 produces from a C version of our generated code with what is compiled using LLVM IR ==> (optimization passes) ==> LLVM IR ==> JIT.

Our code basically does:

EngineBuilder builder(fResult->fModule);
builder.setOptLevel(CodeGenOpt::Aggressive);
builder.setEngineKind(EngineKind::JIT);
builder.setUseMCJIT(true);

(I tried adding builder.setMCPU(llvm::sys::getHostCPUName()); but it made no difference.)

Is there anything new to "activate" in LLVM 3.3 to get speed results similar to what we had with LLVM 3.1?

Thanks

Stéphane Letz