Dragonegg + IR + llc = Dragonegg directly

Hi all,

I tried using dragonegg to compile some numerical software of ours. I
tried out two different approaches expecting both would yield the same
results:
1. gfortran-4.6 -fplugin=dragonegg-3.0 -o test.o test.f (I omitted a
bunch of additional arguments for brevity)
2. gfortran-4.6 -fplugin=dragonegg-3.0 -fplugin-arg-dragonegg-emit-ir -S
-o test.ll test.f
   llc -O0 -o test.s test.ll
   as -o test.o test.s

When I compare our software compiled with plain gfortran-4.6 (without
LLVM) against the build from approach 1, both versions give the same
results. However, when I use approach 2, the computation results differ
from the original version.

I expected that using the dragonegg plugin to generate native code
directly would internally do more or less the same as the explicit step of
approach 2, but this does not seem to be the case. Any ideas why this
happens?

Regards,

Martin

Hi Martin,

> I tried using dragonegg to compile some numerical software of ours. I
> tried out two different approaches expecting both would yield the same
> results:
> 1. gfortran-4.6 -fplugin=dragonegg-3.0 -o test.o test.f (I omitted a
> bunch of additional arguments for brevity)
> 2. gfortran-4.6 -fplugin=dragonegg-3.0 -fplugin-arg-dragonegg-emit-ir -S
> -o test.ll test.f
>     llc -O0 -o test.s test.ll
>     as -o test.o test.s
>
> When comparing the results of our software compiled with gfortran-4.6
> without LLVM with approach 1, I get the same results for both versions.
> However, when I use approach 2, the computation results differ from the
> original version.
>
> I expected that using the dragonegg plugin to generate native code
> directly would internally do more or less the same as the explicit step of
> approach 2, but this does not seem to be the case. Any ideas why this
> happens?

Different arguments are being passed to the code generators. In the
CreateTargetMachine function in Backend.cpp, options like -fPIC,
-fomit-frame-pointer and so on are translated into their LLVM
equivalents and passed to the code generators. The same goes for feature
strings (whether you are targeting a machine supporting SSE and so on). A
few more generic codegen options are set in ConfigureLLVM.

Ciao, Duncan.

Hi Duncan,

thanks for the quick reply. I understand that the generated code is different between the two approaches.
But I would still expect IEEE rules to be respected in any case. I do not see any reason why -fPIC, -fomit-frame-pointer
and the like should have any impact on the results computed by the generated code.

Are there any options I can set on the command line of llc to force identical behaviour with respect to numerical stability?
I tried some of the llc options like --disable-excess-fp-precision and --disable-fp-elim, but without success.

Martin

Martin,

> Are there any options I can set on the command line of llc to force the
> identical behaviour with respect to numerical stability?
> I tried some of the llc options like --disable-excess-fp-precision and
> --disable-fp-elim, but without success.

Did the first command line contain -O0 as well, or was it something
different? What happens if you omit -O0 from llc in the second approach?

Hi Martin,

> thanks for the quick reply. I understand that the generated code is different
> between the two approaches.
> But I would still expect IEEE rules to be respected in any case. I do not see
> any reason why -fPIC -fomit-frame-pointer
> and the like should have any impact on the results computed by the generated code.

Probably the direct case uses the x86 floating-point stack (x87), while
the llc case uses xmm (SSE) registers. This is because dragonegg picks up
what gcc thinks the target is, and gcc is very conservative, while llc defaults
to targeting the host machine.

> Are there any options I can set on the command line of llc to force the
> identical behaviour with respect to numerical stability?

Try llc -mcpu=i386

You may also need to turn off some CPU attributes like SSE (I don't recall); see
   llc -mcpu=help test.ll
for a list.

Ciao, Duncan.

Hi Anton,

yes the first command line contained -O0 as well. I also tried omitting
-O0 from the llc command line, but this made no difference.

Martin

Martin

> yes the first command line contained -O0 as well. I also tried omitting
> -O0 from the llc command line, but this made no difference.

Hrm, interesting... Is it possible for you to share the .S / .bc?

> You may also need to turn off some cpu attributes like SSE, I don't recall, see
> llc -mcpu=help test.ll

Maybe it would be easier to do this at the dragonegg level: build once
with -mfpmath=sse and once with -mfpmath=387 (x87), and compare the results.

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Anton Korobeynikov
Sent: Tuesday, April 17, 2012 12:04 PM
To: Duncan Sands
Cc: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Dragonegg + IR + llc = Dragonegg directly

> You may also need to turn off some cpu attributes like SSE, I don't
> recall, see
> llc -mcpu=help test.ll
> Maybe it'd be easier at dragonegg level -mfpmath=sse / x87 - and compare
> the results

Indeed. Various GCC flavors (and probably gfortran) default to x87 math, which uses 80-bit intermediates internally for 64-bit calculations. LLVM tends to be smarter and tries to use SSE if the host processor supports it. SSE uses 64 bits throughout, so the results often do not match in the low-order bits, and those differences can grow much larger over a large enough number of operations.

-Gordon Keiser
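
The effect described above can be reproduced in miniature. The following is only a sketch by analogy: Python has no portable 80-bit type, so single precision stands in for the 64-bit "storage" format and Python's native double stands in for the wider 80-bit intermediates. The helper names (round_to_f32, sum_narrow, sum_wide) are made up for this illustration and are not part of LLVM, GCC or dragonegg.

```python
import struct

def round_to_f32(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def sum_narrow(values):
    """Round after every addition: intermediates stay at storage width
    (the analogue of SSE keeping 64 bits throughout)."""
    acc = 0.0
    for v in values:
        acc = round_to_f32(acc + round_to_f32(v))
    return acc

def sum_wide(values):
    """Accumulate in the wider format and round only once at the end
    (the analogue of x87 keeping 80-bit intermediates)."""
    acc = 0.0
    for v in values:
        acc += round_to_f32(v)
    return round_to_f32(acc)

values = [0.1] * 1000
narrow = sum_narrow(values)
wide = sum_wide(values)
print("rounded every step:", narrow)
print("rounded once      :", wide)
print("identical         :", narrow == wide)
```

Both functions add the same inputs in the same order; only the precision of the intermediates differs, which is enough to change the final stored value.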

Hi Duncan,

I tried it with llc -mcpu=i386 and now the results are the same. Thanks
for the tip.
I am in the process of writing an optimization pass involving constant
propagation, and I want to be sure that this pass does not modify the
numerical results.

Martin
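
One possible way to check that two builds produce the same numbers is to compare the results bit for bit rather than with a tolerance. This is a minimal sketch, assuming the results of each build have already been parsed into lists of doubles; the names bits and first_mismatch and the sample data are hypothetical and only serve the illustration.

```python
import struct

def bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a double as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def first_mismatch(reference, candidate):
    """Index of the first value whose bit pattern differs, or None if all match."""
    if len(reference) != len(candidate):
        raise ValueError("result vectors have different lengths")
    for i, (r, c) in enumerate(zip(reference, candidate)):
        if bits(r) != bits(c):
            return i
    return None

# Made-up sample data standing in for the output of two builds.
baseline = [0.1 + 0.2, 1.0 / 3.0]
after_pass = [0.3, 1.0 / 3.0]
print(first_mismatch(baseline, after_pass))
```

Comparing bit patterns instead of using == also distinguishes 0.0 from -0.0 and treats two NaNs with the same payload as equal, which is usually what is wanted when the question is whether the generated code computes exactly the same thing.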