Floating point instructions patch

Hello,

I have been gone for a while, finishing work on my Master's thesis... Now that I'm back I updated LLVM to the most recent version and found that my FP_ABS SelectionDAGNode type and code generation was now conflicting with the new FABS node type. I brought the rest of my local modifications in line with the FABS implementation, so here is my patch that includes sqrt, sin and cos also. The only thing missing is expansion to lib calls for these node types for targets that don't support them... I'm sure someone who understands the LLVM internals a bit better than me can add that in no time :wink:

I also noticed that the DoesntAccessMemoryTable in BasicAliasAnalysis.cpp includes "sinh", "cosh" and friends -- as far as I know, these can actually set errno so they should not be in the table...

Here is the patch, hope to see it applied soon!

m.

fpinst.patch (10.8 KB)

I have been gone for a while, finishing work on my Master's thesis...

Hi Morten, congrats! :slight_smile:

Now that I'm back I updated LLVM to the most recent version and found that my FP_ABS SelectionDAGNode type and code generation was now conflicting with the new FABS node type. I brought the rest of my local modifications in line with the FABS implementation, so here is my patch that includes sqrt, sin and cos also. The only thing missing is expansion to lib calls for these node types for targets that don't support them... I'm sure someone who understands the LLVM internals a bit better than me can add that in no time :wink: Here is the patch, hope to see it applied soon!

Ok, overall it looks good. I've applied several pieces of it here:

Add and legalize new nodes:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025890.html
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025891.html
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025892.html

Add fabs/fabsf support to x86 isel simple:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025893.html

Add fsqrt support to x86 pattern isel:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025896.html

New X86 instrs:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025894.html

The patches I didn't apply are these:

1. Match (Y < 0) ? -Y : Y -> FABS in the SelectionDAGISel.cpp file. We
    already catch this at the DAG level. If we aren't, please let me know.
2. Codegen fsin/fcos to fsin/fcos for X86. We cannot do this, except
    under the control of something like -enable-unsafe-fp-math. These
    instructions are architected to have a limited range.
3. Codegen fsqrt/fsqrtf C functions to the FSQRT dag node. These
    functions can set errno, so this is not a safe transformation. The
    proper way to do this is to introduce an llvm.sqrt.f32/llvm.sqrt.f64
    pair of intrinsics to represent these operations without errno. The
    optimizer can then turn fsqrt calls into these intrinsics when errno is
    provably never used or if a compiler option is specified to ignore
    errno. Also, your work could just generate llvm.sqrt.* calls
    directly.
4. Codegen of fsin/fcos libm calls to FSIN/FCOS dag nodes. We really do
    need to be able to legalize these for targets that don't support them
    before we can do this. The easiest way to do this is to check the
    TargetLowering information for the operation to see if they are
    supported. If not, leave them as function calls in the
    SelectionDAGISel.cpp file. The alternative way is to have legalize
    lower them to libcalls. This is also needed for fsqrt, though sin/cos
    don't set errno.

I hope that committing the patches I did will significantly help your merging. To get the rest in, here are what needs to be done:

For #1, verify that it is still needed.

For #2, you can choose to add a flag to llvm/Target/TargetOptions.h named -enable-unsafe-fp-math. This would default to off, but you can obviously set it to on for your environment. This could also be used to do a lot of other simplifications and optimizations in the code generator.

For #3, just add the llvm intrinsics.

For #4, implement it or find someone to do it :slight_smile:

Please lemme know if you have any questions, thanks for the patches!

-Chris

I also noticed that the DoesntAccessMemoryTable in BasicAliasAnalysis.cpp includes "sinh", "cosh" and friends -- as far as I know, these can actually set errno so they should not be in the table...

You're right, I should not believe man pages. Fixed, thanks!

-Chris

Chris Lattner wrote:

The patches I didn't apply are these:

1. Match (Y < 0) ? -Y : Y -> FABS in the SelectionDAGISel.cpp file. We
   already catch this at the DAG level. If we aren't, please let me know.

OK, no problem - I was just told last time I tried to get my patch in that this was needed because the C++ frontend generated this code, I'm generating calls to fabsf() so for me this is unimportant ...

2. Codegen fsin/fcos to fsin/fcos for X86. We cannot do this, except
   under the control of something like -enable-unsafe-fp-math. These
   instructions are architected to have a limited range.

OK, I will add this flag.

3. Codegen fsqrt/fsqrtf C functions to the FSQRT dag node. These
   functions can set errno, so this is not a safe transformation. The
   proper way to do this is to introduce an llvm.sqrt.f32/llvm.sqrt.f64
   pair of intrinsics to represent these operations without errno. The
   optimizer can then turn fsqrt calls into these intrinsics when errno is
   provably never used or if a compiler option is specified to ignore
   errno. Also, your work could just generate llvm.sqrt.* calls
   directly.

Actually, in my first patch these instructions (including fabs) were all intrinsics.
My code generator used to generate llvm.sqrt calls, so it's no problem to go back to that.

4. Codegen of fsin/fcos libm calls to FSIN/FCOS dag nodes. We really do
   need to be able to legalize these for targets that don't support them
   before we can do this. The easiest way to do this is to check the
   TargetLowering information for the operation to see if they are
   supported. If not, leave them as function calls in the
   SelectionDAGISel.cpp file. The alternative way is to have legalize
   lower them to libcalls. This is also needed for fsqrt, though sin/cos
   don't set errno.

The first solution is what I went for the first time around, but I was told that it should be lowered to libcalls. I was unable to implement this as I don't see how I get the information about what type (float/double) the call originally used. When the DAG is legalized these are all promoted to f64, right? If you just tell me how, I will fix this one also...

I hope that committing the patches I did will significantly help your merging.

Yes, thank you very much!

m.

Chris Lattner wrote:

The patches I didn't apply are these:
2. Codegen fsin/fcos to fsin/fcos for X86. We cannot do this, except
   under the control of something like -enable-unsafe-fp-math. These
   instructions are architected to have a limited range.

OK, I will add this flag.

Great, thanks!

3. Codegen fsqrt/fsqrtf C functions to the FSQRT dag node. These
   functions can set errno, so this is not a safe transformation. The
   proper way to do this is to introduce an llvm.sqrt.f32/llvm.sqrt.f64
   pair of intrinsics to represent these operations without errno. The
   optimizer can then turn fsqrt calls into these intrinsics when errno is
   provably never used or if a compiler option is specified to ignore
   errno. Also, your work could just generate llvm.sqrt.* calls
   directly.

Actually, in my first patch these instructions (including fabs) were all intrinsics. My code generator used to generate llvm.sqrt calls, so it's no problem to go back to that.

Sounds good. The difference between sin/cos and sqrt is that sqrt sets errno. Because of this, we have to introduce a new intrinsic (llvm.sqrt) which is undefined on range error.

4. Codegen of fsin/fcos libm calls to FSIN/FCOS dag nodes. We really do
   need to be able to legalize these for targets that don't support them
   before we can do this. The easiest way to do this is to check the
   TargetLowering information for the operation to see if they are
   supported. If not, leave them as function calls in the
   SelectionDAGISel.cpp file. The alternative way is to have legalize
   lower them to libcalls. This is also needed for fsqrt, though sin/cos
   don't set errno.

The first solution is what I went for the first time around, but I was told that it should be lowered to libcalls. I was unable to implement this as I don't see how I get the information about what type (float/double) the call originally used. When the DAG is legalized these are all promoted to f64, right? If you just tell me how, I will fix this one also...

That is definitely one way to do it, but I think there are some minor pieces missing before this can be implemented (I'll get to it eventually ;-). In the mean time, it will be safe if you just check in the SelectionDAGISel.cpp code if the target supports these operations. If so, introducing the FSIN/FCOS nodes should be ok, otherwise just generate normal calls.

The problem with this approach happens if anything else could generate FSIN/FCOS nodes. Perhaps in the future this will be a problem, but it shouldn't be for now. If your patch includes a big FIXME, I will be happy to commit it and fix the issue the correct way when I get a chance.

Thanks a lot Morten!

-Chris

New patch here -- it's not been tested yet because we're having some problems with the application (I can't create new VM programs at the moment), but it compiles OK :wink: Please look over it and see if there are some more changes you'd like me to make before you can commit it...

m.

fpinst2.patch (14.4 KB)

Looks great. I committed your patches here:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025913.html
...
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050425/025927.html

I made the following changes:
1. The legalize stuff wasn't quite right.
2. I told all non-x86 targets that FSIN/FCOS/FSQRT were not legal for
    them yet.
3. I added a regression test here:
     Regression/CodeGen/Generic/intrinsics.ll

Overall the patch looked great, thanks a lot!

-Chris