troubles with llvm-gcc 4.0 and APFloat on X86_64

hi,

i'm trying to make some experiments with the ARM backend (llvm 2.1) and therefore built an arm-softfloat-linux-gnu toolchain on x86_64 linux.

however, the llvm-gcc frontend seems to cause troubles with single precision floating point values, i.e., they are not converted correctly to the particular target format (double precision works as expected).

it seems the problem is related to the following piece of code taken
from APFloat.cpp:1836 (called from ConvertREAL_CST)
  APFloat::APFloat(float f) {
    APInt api = APInt(32, 0);
    initFromAPInt(api.floatToBits(f));
  }

i guess the floatToBits call will return the wrong half word since ints are 4, but float and double are both 8 byte on x86_64 (but i'm not yet sure).

is anybody having the same kind of problems or is there an official patch for this issue? is llvm-gcc known to work as a cross compiler on x86_64 for 32 bit targets and/or arm in particular?

thanks in advance,

hi,

i'm trying to make some experiments with the ARM backend (llvm 2.1) and
therefore built an arm-softfloat-linux-gnu toolchain on x86_64 linux.

however, the llvm-gcc frontend seems to cause troubles with single
precision floating point values, i.e., they are not converted correctly
to the particular target format (double precision works as expected).

I haven't seen this problem. You say the frontend; to check,

llvm-gcc -O0 -S -emit-llvm file.c -o file.ll

produces an invalid constant in the .ll file? Can you give an example?

it seems the problem is related to the following piece of code taken
from APFloat.cpp:1836 (called from ConvertREAL_CST)
  APFloat::APFloat(float f) {
    APInt api = APInt(32, 0);
    initFromAPInt(api.floatToBits(f));
  }

i guess the floatToBits call will return the wrong half word since ints
are 4, but float and double are both 8 byte on x86_64 (but i'm not yet
sure).

I'm pretty sure float is 4 everywhere; that wouldn't be it.
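
One way to check this on a given host is to look at the size and the raw bit pattern of a float directly. The following is a minimal sketch, assuming the host uses IEEE-754 single precision; `floatBits` is a hypothetical helper written for illustration, not the APInt/APFloat code quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is IEEE-754 single precision (true on x86, x86_64, ARM).
// The build fails here if the host really had an 8-byte float.
static_assert(sizeof(float) == 4, "float is expected to be 4 bytes");

// Hypothetical helper: returns the raw 32-bit pattern of a float.
// memcpy is used instead of pointer or union punning to avoid aliasing issues.
inline std::uint32_t floatBits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

If the `static_assert` passes and `floatBits(0.6f)` yields 0x3F19999A on the host, the host-side conversion is probably not where the constant goes wrong.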

is anybody having the same kind of problems or is there an official
patch for this issue? is llvm-gcc known to work as a cross compiler on
x86_64 for 32 bit targets and/or arm in particular?

I have no 64-bit host available. Cross-compilation between x86-32 and powerPC
(different endianness) works correctly.

hi,

Dale Johannesen wrote:

i'm trying to make some experiments with the ARM backend (llvm 2.1) and
therefore built an arm-softfloat-linux-gnu toolchain on x86_64 linux.

however, the llvm-gcc frontend seems to cause troubles with single
precision floating point values, i.e., they are not converted correctly
to the particular target format (double precision works as expected).

I haven't seen this problem. You say the frontend; to check,

llvm-gcc -O0 -S -emit-llvm file.c -o file.ll

produces an invalid constant in the .ll file? Can you give an example?

here's what i get for the following example:

#include <stdio.h>
int main(int argc, char **argv)
{
   float f = 0.6;  /* single precision constant */
   printf("hello world: %f / %f\n", f, 0.6);  /* f is promoted to double */
   return 0;
}

hi,

i've got some more things to note. first, the issue is not related to x86_64 being the host machine - it also happens on i686/linux.

next, i think (one of) the problem(s) is the use of [HOST_]WORDS_BIG_ENDIAN instead of [HOST_]FLOAT_WORDS_BIG_ENDIAN in llvm-convert.cpp (see patch below).

this fixes single precision floating point but breaks double precision. for arm-softfloat-linux-gnu, FLOAT_WORDS_BIG_ENDIAN is true while WORDS_BIG_ENDIAN is false. as far as i've seen, there's only a single flag for endianness in the llvm target description string, so i don't really understand how this is supposed to work.

i wonder how other people are cross compiling for arm-linux-gnu?
any help would be highly appreciated!

cheers,

Dietmar,

this fixes single precision floating point but breaks double precision.
for arm-softfloat-linux-gnu, FLOAT_WORDS_BIG_ENDIAN is true while
WORDS_BIG_ENDIAN is false. as far as i've seen, there's only a single
flag for endianness in the llvm target description string, so i don't
really understand how this is supposed to work.

Hrm, I think I even noticed this during llvm-gcc 4.2 bring up, but it
seems I've completely forgotten to raise this question.

i wonder how other people are cross compiling for arm-linux-gnu?
any help would be highly appreciated!

This stuff should be clearly investigated and reworked. All gcc-related
stuff is in real.c / real.h

hi,

i've got some more things to note. first, the issue is not related to
x86_64 being the host machine - it also happens on i686/linux.

next, i think (one of) the problem(s) is the use of
[HOST_]WORDS_BIG_ENDIAN instead of [HOST_]FLOAT_WORDS_BIG_ENDIAN in
llvm-convert.cpp (see patch below).

this fixes single precision floating point but breaks double precision.
for arm-softfloat-linux-gnu, FLOAT_WORDS_BIG_ENDIAN is true while
WORDS_BIG_ENDIAN is false. as far as i've seen, there's only a single
flag for endianness in the llvm target description string, so i don't
really understand how this is supposed to work.

Agree. I think those two match on all the targets I've tried.

I think the right approach is to use REAL_VALUE_TO_TARGET_SINGLE for float
and REAL_VALUE_TO_TARGET_DOUBLE for double, then the two endiannesses
can be handled separately.
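
To illustrate why a single endianness flag is not enough for double, here is a rough sketch that lays out the 64-bit pattern of a double with byte order and 32-bit word order chosen independently. `layoutDouble` and its parameters are hypothetical names for illustration only; this is not the REAL_VALUE_TO_TARGET_* code from GCC, and it assumes an IEEE-754 host double.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of LLVM or GCC.
// bytesBigEndian: byte order inside each 32-bit word.
// wordsBigEndian: whether the most significant 32-bit word comes first
//                 (the role FLOAT_WORDS_BIG_ENDIAN plays in the thread).
inline std::array<std::uint8_t, 8> layoutDouble(double d, bool bytesBigEndian,
                                                bool wordsBigEndian) {
  static_assert(sizeof(double) == 8, "double is expected to be 8 bytes");
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);

  const std::uint32_t hi = static_cast<std::uint32_t>(bits >> 32);
  const std::uint32_t lo = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
  const std::uint32_t words[2] = {wordsBigEndian ? hi : lo,
                                  wordsBigEndian ? lo : hi};

  std::array<std::uint8_t, 8> out{};
  for (int w = 0; w < 2; ++w) {
    for (int b = 0; b < 4; ++b) {
      const int shift = bytesBigEndian ? 8 * (3 - b) : 8 * b;
      out[4 * w + b] = static_cast<std::uint8_t>(words[w] >> shift);
    }
  }
  return out;
}
```

For 1.0 (bit pattern 0x3FF0000000000000), little-endian bytes with big-endian word order give 00 00 f0 3f 00 00 00 00, whereas a plain little-endian layout gives 00 00 00 00 00 00 f0 3f. A float has only one word, so the word-order flag cannot affect it.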

Dale Johannesen wrote:

next, i think (one of) the problem(s) is the use of
[HOST_]WORDS_BIG_ENDIAN instead of [HOST_]FLOAT_WORDS_BIG_ENDIAN in
llvm-convert.cpp (see patch below).

this fixes single precision floating point but breaks double precision.
for arm-softfloat-linux-gnu, FLOAT_WORDS_BIG_ENDIAN is true while
WORDS_BIG_ENDIAN is false. as far as i've seen, there's only a single
flag for endianness in the llvm target description string, so i don't
really understand how this is supposed to work.

Agree. I think those two match on all the targets I've tried.

I think the right approach is to use REAL_VALUE_TO_TARGET_SINGLE for float
and REAL_VALUE_TO_TARGET_DOUBLE for double, then the two endiannesses
can be handled separately.

just to be sure: we don't want to generate target dependent constants in the frontend, do we? this would make it imho unnecessarily hard to deal with them in the backend.

what we're trying to do is to add another flag to the target machine that encodes the floating point endianness (just like "e" vs. "E") and use it when appropriate in the backend to get the target layout right (e.g., when emitting constants and during target lowering). we'll test and post a corresponding patch tomorrow.

anyway, i'm obviously not the first person trying to compile for arm-linux and i wonder how other people achieve this, e.g., how are the arm-softfloat targets in the nightly tester configured?

i've changed the subject of the thread since the old one was meanwhile slightly misleading.

cheers and thanks again for your help,

Dale Johannesen wrote:

next, i think (one of) the problem(s) is the use of
[HOST_]WORDS_BIG_ENDIAN instead of [HOST_]FLOAT_WORDS_BIG_ENDIAN in
llvm-convert.cpp (see patch below).

this fixes single precision floating point but breaks double
precision.
for arm-softfloat-linux-gnu, FLOAT_WORDS_BIG_ENDIAN is true while
WORDS_BIG_ENDIAN is false. as far as i've seen, there's only a single
flag for endianness in the llvm target description string, so i don't
really understand how this is supposed to work.

Agree. I think those two match on all the targets I've tried.

I think the right approach is to use REAL_VALUE_TO_TARGET_SINGLE for
float
and REAL_VALUE_TO_TARGET_DOUBLE for double, then the two endiannesses
can be handled separately.

just to be sure: we don't want to generate target dependent constants in
the frontend, do we? this would make it imho unnecessarily hard to deal
with them in the backend.

I'd rather not, but for long double it's unavoidable; there are several variants
for different targets. Also, there are places in the backend where the intermediate
format is picked up and treated as host double, so endianness has to match for that.
I think this is avoidable, but you may have to change more places than you're expecting;
it's going to be a while before constant folding of libm calls is done in software,
for example.

There are targets that have non-IEEE float and double as well, but LLVM doesn't
currently support any of them.

In principle I think keeping IEEE float and double in an endian-independent form
in the IR files is a good idea. BUT:
I'm told retaining the ability to use files in the existing format is a requirement (so
floats still need to occupy 8 bytes). Since ARM target doesn't currently work that one
is a reasonable exception IMO, but changing the format for x86, for example,
would not be greeted with joy.

Yep, this is a good way of putting it. Also, if you want to *add* a target data specifier to capture FP endianness (in the target memory, not in the IR files), that would be just fine.

-Chris

hi,

Chris Lattner wrote:

In principle I think keeping IEEE float and double in an endian-independent form in the IR files is a good idea. BUT: I'm told retaining the ability to use files in the existing format is a requirement (so floats still need to occupy 8 bytes). Since ARM target doesn't currently work that one is a reasonable exception IMO, but changing the format for x86, for example, would not be greeted with joy.

Yep, this is a good way of putting it. Also, if you want to *add* a target data specifier to capture FP endianness (in the target memory, not in the IR files), that would be just fine.

ok, here's a patch (see attachment). credits for preparing it go to florian brandner. there's a new flag n/N in addition to e vs. E, e.g., e-N-p:32:32-f64:64:64-i64..., that encodes FP endianness and is used to get constants right in the backend. in its absence it's initialized to be the same as integer endianness.
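
The defaulting rule for the new flag can be sketched as follows. This is an illustrative stand-alone parser, not the code from the attached patch and not LLVM's TargetData implementation; `Endianness` and `parseEndianness` are hypothetical names, and the mapping of n to little-endian and N to big-endian is an assumption made by analogy with e/E.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type: integer and floating point endianness.
struct Endianness {
  bool intBig = false;
  bool fpBig = false;
};

// Hypothetical parser for a '-'-separated target description string.
// Assumption: "e"/"E" select little/big integer endianness and
// "n"/"N" select little/big FP endianness; other fields are ignored.
inline Endianness parseEndianness(const std::string &desc) {
  Endianness result;
  bool fpSeen = false;
  std::istringstream in(desc);
  std::string tok;
  while (std::getline(in, tok, '-')) {
    if (tok == "e") {
      result.intBig = false;
    } else if (tok == "E") {
      result.intBig = true;
    } else if (tok == "n") {
      result.fpBig = false;
      fpSeen = true;
    } else if (tok == "N") {
      result.fpBig = true;
      fpSeen = true;
    }
  }
  // Without an explicit FP flag, follow the integer endianness.
  if (!fpSeen)
    result.fpBig = result.intBig;
  return result;
}
```

Under these assumptions, a string starting with e-N-p:32:32 would describe little-endian integers with big-endian FP, while strings without n/N keep the two in sync, so existing target descriptions behave as before.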

you'll also need another patch for llvm-gcc to dump the constants in host format and link the softfloat library into libgcc. all patches should apply cleanly to the 2.1 release.

the changes are moderate and appear to work quite well. however, i wouldn't advocate applying it to llvm as it is. it still needs some testing and i think it's better to get the endianness issue right once and for all by defining a host and target independent format for bitcode that's used throughout the framework.

thanks for your help! cheers,

llvm-2.1-softfloat-endianess.patch (9.89 KB)

llvm-gcc-2.1-softfloat-endianess.patch (2.01 KB)