clang 4.0.0: Invalid code for builtin floating point function with -mfloat-abi=hard -ffast-math (ARM)

Hello,

clang/llvm 4.0.0 generates invalid calls for builtin functions with -mfloat-abi=hard -ffast-math.

Small example fail.c:

    // clang -O2 -target armv7a-none-none-eabi -mfloat-abi=hard -ffast-math -S fail.c -o -
    extern float sinf (float x);
    float sin1 (float x) {return (sinf (x));}

generates code to pass the parameter in r0 and expect the result in r0.
The same code without -ffast-math compiles correctly. It also works with -fno-builtin-sinf.

(-O2 is not required to trigger the bug, but makes the resulting code easier to read)

It seems -ffast-math changes the internal declaration of builtin functions, as if declared
with -mfloat-abi=soft.

Works just fine with or without -ffast-math in previous releases.

cheers,

Peter

Hello,

clang/llvm 4.0.0 generates invalid calls for builtin functions with -mfloat-abi=hard -ffast-math.

Small example fail.c:

   // clang -O2 -target armv7a-none-none-eabi -mfloat-abi=hard -ffast-math -S fail.c -o -
   extern float sinf (float x);
   float sin1 (float x) {return (sinf (x));}

generates code to pass the parameter in r0 and expect the result in r0.
The same code without -ffast-math compiles correctly. It also works with -fno-builtin-sinf.

(-O2 is not required to trigger the bug, but makes the resulting code easier to read)

It seems -ffast-math changes the internal declaration of builtin functions,

It would be more accurate to say -ffast-math makes the compiler treat sin() as a builtin, and therefore recreate the declaration from scratch.

as if declared
with -mfloat-abi=soft.

That is probably unintentional. Granted, using -mfloat-abi like this is kind of weird, but I think clang's behavior is supposed to be gcc-compatible. See rG6ef45916c613 and 30543 – 4.0 Regression in llvm.powi.f64 lowering on ARM for the most recent work in this area.

CC'ing the author of that commit.

-Eli

why is using -mfloat-abi weird?

And I do not agree that whether functions are treated as builtin should depend on -ffast-math.
(it does not with gcc)

Obviously rL291909 is related. It explicitly states -mfloat-abi should be ignored for builtins.
(It does not actually ignore -mfloat-abi unless -ffast-math is specified)

I do not think this is how gcc actually behaves.
Gcc does not ignore -mfloat-abi. This really changes the calling conventions for builtin functions to VFP.

And with gcc -ffast-math does *not* change CC. I tested this with the latest binary I could find (gcc version 6.3.1 20170215):

https://developer.arm.com/-/media/Files/downloads/gnu-rm/6_1-2017q1/gcc-arm-none-eabi-6-2017-q1-update-win32.exe?product=GNU%20ARM%20Embedded%20Toolchain,32-
bit,Windows,6-2017-q1-update

Here is a more elaborate example of the problem:

fail.c:

   extern float sinf (float x);
   float sin1 (float x) {return (sinf (x) + 1.0);}

arm-none-eabi-gcc.exe -O2 -mfloat-abi=hard -ffast-math -S fail.c -o -

   sin1:
     push {r4, lr}
     bl sinf
     vldr.32 s15, .L3
     pop {r4, lr}
     vadd.f32 s0, s0, s15
     bx lr

VFP CC (and optimal code).

Compiled with clang/llvm 4.0.0:
clang.exe -target armv7a-none-none-eabi -O2 -mfloat-abi=hard -ffast-math -S fail.c -o -

   sin1:
     push {r11, lr}
     mov r11, sp
     vmov r0, s0
     bl sinf
     vmov.f32 d16, #1.000000e+00
     vmov d17, r0, r0
     vadd.f32 d0, d17, d16
     pop {r11, pc}

Using non-VFP CC - incompatible with the gcc code.

Finally, using eabihf or -fno-fast-math corrects the problem:

clang.exe -target armv7a-none-none-eabihf -O2 -mfloat-abi=hard -ffast-math -S fail.c -o -

   sin1:
     push {r11, lr}
     mov r11, sp
     vpush {d8}
     vmov.f32 d8, #1.000000e+00
     bl sinf
     vadd.f32 d0, d0, d8
     vpop {d8}
     pop {r11, pc}

This is using VFP calling conventions as expected.

(I don't like llvm's handling of r11 and d8, though - the optimizer needs to be optimized...:slight_smile:

IMHO the current handling is incompatible with gcc. (and with previous releases of clang)

Cheers,

Peter

That is probably unintentional. Granted, using -mfloat-abi like this is
kind of weird, but I think clang's behavior is supposed to be
gcc-compatible. See rG6ef45916c613 and
30543 – 4.0 Regression in llvm.powi.f64 lowering on ARM for the most recent work in
this area.

why is using -mfloat-abi weird?

Err, oh, I think I misunderstood how it's supposed to work the last time I looked. I guess it's supposed to be the same as using --target?

And I do not agree that whether functions are treated as builtin should depend on -ffast-math.
(it does not with gcc)

Err, oh, sorry, that isn't really what I meant. We treat it as a C library function either way; we just model it in the IR differently because we don't want to model modifications to errno in fast-math mode.

-Eli

Small example fail.c:

   // clang -O2 -target armv7a-none-none-eabi -mfloat-abi=hard -ffast-math
-S fail.c -o -
   extern float sinf (float x);
   float sin1 (float x) {return (sinf (x));}

I changed your example slightly to make sure we're passing the
arguments, otherwise 'sin1' just becomes 'b sinf', which is "correct"
on both hard and soft float.

extern float sinf (float x);
float sin1 (float x, float y) {return (sinf (y)+x);}

generates code to pass the parameter in r0 and expect the result in r0.
The same code without -ffast-math compiles correctly. It also works with
-fno-builtin-sinf.

Right, so this is the problem. You're declaring a function that is
declared in the C library, and the compiler will try and match it with
what it understands of the ABI (via builtins).

In this case, GCC and Clang "understand" different things about them,
and I'm not saying Clang is right. But if you want to use your own
sine functions, using -fno-builtin-sinf is the *right* way to go.

So, now that you have a work around that makes sense (and can progress
without us blocking you), let's talk about the problem at hand.

It would be more accurate to say -ffast-math makes the compiler treat sin()
as a builtin, and therefore recreate the declaration from scratch.

Precisely.

But the problem is not as simple as it seems. We had a conversation
about this during the US LLVM last year, and the conclusion is that
soft-float ABI functions should always have soft-float calling
conventions (GCC seems to agree). But sin/cos are not soft-fp
functions at all.

Looking at the patch, sin/cos wasn't wrongly bundled with the soft-fp
nodes, so it's possible that fast-math is combining nodes, thus
changing the behaviour of the selection dag wrt calling conventions.

Saleem,

Any light on why is sin/cos bundled with soft-fp handling?

cheers,
--renato

>> Small example fail.c:
>>
>> // clang -O2 -target armv7a-none-none-eabi -mfloat-abi=hard
-ffast-math
>> -S fail.c -o -
>> extern float sinf (float x);
>> float sin1 (float x) {return (sinf (x));}

I changed your example slightly to make sure we're passing the
arguments, otherwise 'sin1' just becomes 'b sinf', which is "correct"
on both hard and soft float.

extern float sinf (float x);
float sin1 (float x, float y) {return (sinf (y)+x);}

>> generates code to pass the parameter in r0 and expect the result in r0.
>> The same code without -ffast-math compiles correctly. It also works with
>> -fno-builtin-sinf.

Right, so this is the problem. You're declaring a function that is
declared in the C library, and the compiler will try and match it with
what it understands of the ABI (via builtins).

In this case, GCC and Clang "understand" different things about them,
and I'm not saying Clang is right. But if you want to use your own
sine functions, using -fno-builtin-sinf is the *right* way to go.

So, now that you have a work around that makes sense (and can progress
without us blocking you), let's talk about the problem at hand.

> It would be more accurate to say -ffast-math makes the compiler treat
sin()
> as a builtin, and therefore recreate the declaration from scratch.

Precisely.

But the problem is not as simple as it seems. We had a conversation
about this during the US LLVM last year, and the conclusion is that
soft-float ABI functions should always have soft-float calling
conventions (GCC seems to agree). But sin/cos are not soft-fp
functions at all.

Looking at the patch, sin/cos wasn't wrongly bundled with the soft-fp
nodes, so it's possible that fast-math is combining nodes, thus
changing the behaviour of the selection dag wrt calling conventions.

Saleem,

Any light on why is sin/cos bundled with soft-fp handling?

sin/cos are libm functions, and so a libcall to those need to honour the
floating point ABI requests. The calling convention to be followed there
should match `-mfloat-abi` (that is, -mfloat-abi=hard => AAPCS/VFP,
-mfloat-abi=soft => AAPCS).

Exactly, but they're not, and that's the problem. Do you have any idea
why -ffast-math would change their PCS for libc calls?

The behaviour seems to have been by your patch
(https://reviews.llvm.org/rL291909), so maybe there's some
transformation that uses the run-time functions, or some other badly
coupled logic...

cheers,
--renato

I think the problem is here:

llvm\lib\Target\ARM\ARMISelLowering.cpp, line 187:

   if (!Subtarget->isTargetDarwin() && !Subtarget->isTargetIOS() &&
       !Subtarget->isTargetWatchOS()) {
     const auto &E = Subtarget->getTargetTriple().getEnvironment();

     bool IsHFTarget = E == Triple::EABIHF || E == Triple::GNUEABIHF ||
                       E == Triple::MuslEABIHF;
     // Windows is a special case. Technically, we will replace all of the "GNU"
     // calls with calls to MSVCRT if appropriate and adjust the calling
     // convention then.
     IsHFTarget = IsHFTarget || Subtarget->isTargetWindows();

     for (int LCID = 0; LCID < RTLIB::UNKNOWN_LIBCALL; ++LCID)
       setLibcallCallingConv(static_cast<RTLIB::Libcall>(LCID),
                             IsHFTarget ? CallingConv::ARM_AAPCS_VFP
                                        : CallingConv::ARM_AAPCS);
   }

IsHFTarget is only true if the target is eabihf. -msoft-abi is completely ignored.

cheers,

Peter

Did anything ever happen here? This still appears to be broken on trunk.

-Eli

Hi Peter,

Sorry, I completely missed this reply. Peter (Smith) is looking into
it, so copying him, here.

I remember too many hacks to Clang's driver to understand all the
nuances between the flags to copy GCC bug-for-bug. We've done it
mostly because the solution space is gigantic and too many (external)
people rely on those bugs being there or the hackery they have won't
work.

I have always been a strong supporter to clean up GCC bug
implementations and be candid (but helpful) to the users to help them
migrate to a decent scheme (and ultimately fix GCC).

Hopefully, we'll have now enough people looking into it that we can
get at least some of that cleaned up.

cheers,
--renato

PS: Also adding Joerg, because he has some weird cases on his own.

I have been looking into one particular case so far which is in clang
https://bugs.llvm.org/show_bug.cgi?id=28164
The _Complex multiply and divide helpers are always called with AAPCS,
regardless of -mfloat-abi=hard which is not correct.

I'm not at the moment looking at the general problem but I can spare
the time to do so soon.

Apologies in advance for jumping in here without knowing much of the context.

My understanding (not much) of gcc is that the default of -mfloat-abi
is controllable at gcc build time. Some distributions such as debian
use gnueabihf to distinguish -mfloat-abi=hard and gnueabi as
-mfloat-abi=soft, but not all do i.e. they do not follow the
convention of adding hf to the triple so you can have eabi meaning
-mfloat-abi=hard. If I'm correct here it seems that it is impossible
to match any arbitrary target that you might get from gcc as there
isn't a universally followed convention. To me this implies that
-mfloat-abi should take precedence over what it is in the target.

Peter

The libgcc behavior here is IMO arbitrarily insane and broken.
Essentially, there are functions in libgcc/compiler_rt that have an ABI
dictated by ARM EABI and there are functions outside that scope. Why
those should have a hard-wired ABI (AAPCS) is beyond me.

Joerg

Some OSs take the hf/sf division to the arch name (armv7lh) instead of
the ABI, so just using the environment part of the triple is not
enough.

This is really crazy and impossible to get it right because you never
know what you will get and OSs can change the run time behaviour
without warning.

There were suggestions to make Clang more like GCC and hard-code stuff
at build time, but this seems forcing one toolchain to behave like the
other for no additional benefit.

--renato