Help adding the Bullet Physics SDK benchmark to the LLVM test suite?

How do other benchmarks deal with unstable algorithms or differences in floating point results?

I haven't been following this thread, but this sounds like a typical
unstable algorithm problem. Are you always operating that close to
the tolerance level of the algorithm, or are there some sets of inputs
that will behave reasonably?

What do you mean by “reasonably” or “affect codes so horribly”?

The accumulation of algorithms in a physics pipeline is unstable and unless the compiler/platform
guarantees 100% identical floating point results, the outcome will diverge.

Do you think LLVM can be forced to produce identical floating point results?
Even when using different optimization levels or even different CPUs?

Some CPUs use 80-bit FPU precision for intermediate results (on-chip, in registers),
while variables in memory only use 32-bit or 64-bit precision.
In combination with cancellation and other re-ordering, this can give slightly different results.
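To make the re-ordering point concrete, here is a tiny stand-alone C++ sketch (an illustration, not Bullet code): two mathematically equivalent ways of summing the same three values already give different answers in double precision, and the 80-bit register issue only adds to that.

```cpp
// Minimal sketch (not from the Bullet sources): the same three values summed
// in two mathematically equivalent orders give different doubles because of
// rounding and catastrophic cancellation.
#include <cstdio>

int main() {
    double a = 1.0e20;
    double b = -1.0e20;
    double c = 1.0;

    double left  = (a + b) + c;  // (1e20 + -1e20) + 1  ->  1.0
    double right = a + (b + c);  // 1e20 + (-1e20 + 1)  ->  0.0, since 1.0 is
                                 // below the rounding granularity at 1e20

    std::printf("left  = %g\n", left);   // prints 1
    std::printf("right = %g\n", right);  // prints 0
    return 0;
}
```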

If not, the code doesn’t seem very useful to me. How could anyone rely
on the results, ever?

The code has proven to be useful for games and special effects in film,
but this particular benchmark might indeed not suit LLVM testing.

I suggest working on a better benchmark that tests independent parts of the pipeline,
so we don't accumulate results (several frames) but we test a single algorithm at a time,
with known input and expected output. This avoids instability, and we can measure the error of the output.
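As a rough sketch of what such a per-algorithm test could look like (the solver function, input, and tolerance below are made-up placeholders, not actual Bullet API):

```cpp
// Rough sketch of a single-algorithm test with a known input and a measured
// error bound. solveContact() and its tolerance are hypothetical placeholders.
#include <cmath>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for one isolated pipeline stage under test.
static double solveContact(double penetrationDepth) {
    return 0.5 * penetrationDepth;  // placeholder computation
}

int main() {
    const double input    = 0.02;   // known input
    const double expected = 0.01;   // precomputed reference output
    const double tol      = 1e-12;  // acceptable error for this algorithm

    const double actual = solveContact(input);
    const double err    = std::fabs(actual - expected);

    std::printf("error = %g\n", err);
    if (err > tol) {
        std::printf("FAIL: error exceeds tolerance %g\n", tol);
        return EXIT_FAILURE;
    }
    std::printf("PASS\n");
    return EXIT_SUCCESS;
}
```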

Anton, are you interested in working together on such an improved benchmark?
Thanks,
Erwin

How do other benchmarks deal with unstable algorithms or differences in
floating point results?

>> I haven't been following this thread, but this sounds like a typical
>> unstable algorithm problem. Are you always operating that close to
>> the tolerance level of the algorithm, or are there some sets of inputs
>> that will behave reasonably?

What do you mean by "reasonably" or "affect codes so horribly"?

"Reasonably" means the numerics won't blow up due to small changes in
floating-point results caused by compiler transformations like reassociation.

"Affects code so horribly" means that the compiler is causing an unstable
algorithm to blow up, generating useless results. This shouldn't happen
unless the user allows it with an explicit compiler flag. AFAIK LLVM has
no such flag yet. It has some flags to control changes in precision, which
helps, but I don't think there's a flag that says "don't do anything risky,
ever."

For example, a gfortran-fronted LLVM should have a way to always respect
ordering indicated by parentheses. I don't know if gfortran even has that,
let alone LLVM proper.

The accumulation of algorithms in a physics pipeline is unstable and unless
the compiler/platform guarantees 100% identical floating point results, the
outcome will diverge.

Yep. 100% reproducibility is really important. LLVM should have a flag to
guarantee it.
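For what it's worth, checking "100% identical" doesn't need tolerances at all; a bit-level comparison like this sketch (illustration only, not existing test-suite code) is the right notion of equality for reproducibility tests:

```cpp
// Sketch of a bit-exact reproducibility check: two results are "100% identical"
// only if their underlying bit patterns match, which is stricter than any
// epsilon comparison.
#include <cstdint>
#include <cstdio>
#include <cstring>

static bool bitIdentical(double a, double b) {
    std::uint64_t ba, bb;
    std::memcpy(&ba, &a, sizeof ba);
    std::memcpy(&bb, &b, sizeof bb);
    return ba == bb;  // also distinguishes +0.0 from -0.0 and compares NaN payloads
}

int main() {
    double reference = 0.1 + 0.2;  // e.g. a result recorded from a previous run
    double current   = 0.1 + 0.2;  // the result from the current build
    std::printf("%s\n", bitIdentical(reference, current) ? "identical" : "diverged");
    return 0;
}
```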

Do you think LLVM can be forced to produce identical floating point
results? Even when using different optimization levels or even different
CPUs?

Not right now, but the support can certainly be added. It really *should*
be added. It will take a bit of work, however.

Some CPUs use 80-bit FPU precision for intermediate results (on-chip, in
registers), while variables in memory only use 32-bit or 64-bit precision.
In combination with cancellation and other re-ordering, this can
give slightly different results.

Yep, which is why good compilers have ways to control this. llc, for example,
has the -disable-excess-fp-precision and -enable-unsafe-fp-math options. I
don't know if there's a way to control usage of the x87 stack, however.

>> If not, the code doesn't seem very useful to me. How could anyone rely
>> on the results, ever?

The code has proven to be useful for games and special effects in film,
but this particular benchmark might indeed not suit LLVM testing.

We can make it suitable. If it works in real-world situations, it must
work when compiled with LLVM. Otherwise it's an LLVM bug (assuming the
code is not doing undefined things).

I suggest working on a better benchmark that tests independent parts of the
pipeline,

That's useful in itself.

so we don't accumulate results (several frames) but we test a single
algorithm at a time,

No, we should be testing this accumulated stuff as well. As LLVM gets used in
more arenas, this type of problem will crop up, guaranteed. In fact, the only
way we (Cray) get away with it is that we don't use very many LLVM passes and
we strictly target SSE only.

                              -Dave

I don't think there's a flag that says "don't do anything risky,
ever."

"Don't do anything risky with floating-point" is the default mode. If you're
aware of any unsafe floating-point optimizations being done by default, please
file a bug.

For example, a gfortran-fronted LLVM should have a way to always respect
ordering indicated by parentheses. I don't know if gfortran even has that,
let alone LLVM proper.

LLVM does not currently re-associate floating-point values, so this hasn't
been an issue.

Dan

Hello, Erwin

I suggest working on a better benchmark that tests independent parts of the
pipeline,
so we don't accumulate results (several frames) but we test a single
algorithm at a time,
with known input and expected output. This avoids instability, and we can
measure the error of the output.
Anton, are you interested in working together on such an improved benchmark?
Anton, are you interested in working together on such improved benchmark?

This is a pretty interesting approach. However, for now I'm more
concerned about code speed: I'm seeing that llvm-generated code is
slower than gcc-generated code on at least two platforms (20% on x86-64
and even more on ARM), so I suspect an optimization deficiency
somewhere...

Ok. It seems that something is causing a problem if Bullet is failing.

                               -Dave

But keep in mind that fast+incorrect is no good. Are we sure the gcc code is
correct?

                                 -Dave

We haven’t determined what ‘failing’ means or what the ‘correct’ behaviour is.

Imagine a ball at the top of a rounded hill. If the ball is not exactly at the top but a tiny amount to the left, it will roll left;
a tiny amount to the right, and it will roll right. The difference in initial position can be negligible, but the final result is miles away.
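The same effect shows up in a few lines of C++ (a toy illustration, not Bullet code): an unstable iteration turns a 1e-12 difference in the starting position into completely different trajectories after a few dozen steps.

```cpp
// Sketch of the "ball on a hill" effect in one dimension: an unstable iteration
// (the chaotic logistic map, used purely as an illustration) amplifies a 1e-12
// difference in the starting value until the two trajectories are unrelated.
#include <cstdio>

int main() {
    double a = 0.3;
    double b = 0.3 + 1e-12;  // negligible difference in the initial state

    for (int step = 1; step <= 60; ++step) {
        a = 4.0 * a * (1.0 - a);
        b = 4.0 * b * (1.0 - b);
        if (step % 10 == 0)
            std::printf("step %2d: a=%.6f b=%.6f diff=%g\n", step, a, b, b - a);
    }
    return 0;
}
```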

Is there an IRC channel or perhaps a Google Wave for a quick chat? Or do people prefer to keep all communication on the mailing list so everyone can participate?
Thanks,
Erwin

2010/1/5 David Greene <dag@cray.com>

Hello, Erwin

Is there an IRC channel or perhaps a Google Wave for a quick chat? Or do people
prefer to keep all communication on the mailing list so everyone can
participate?

There is an #llvm IRC channel on the OFTC network. The mailing-list audience is definitely larger, though. :)