Enable changing UnsafeFPMath on a per-function basis

To continue the discussion I started last year (see the link below) on embedding command-line options in bitcode, I came up with a plan to improve the way the backend changes UnsafeFPMath on a per-function basis. The code in trunk currently resets TargetOptions::UnsafeFPMath at the beginning of SelectionDAGISel::runOnMachineFunction to enable compiling one function with “unsafe-fp-math=true” and another with “unsafe-fp-math=false”, but this won’t work if we want to parallelize the backend in the future, which is something Eric mentioned in his talk last year.

Here is the link to the original discussion I started last year:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078785.html

These are the changes I plan to make:

  1. In llc.cpp (and any other tools that use the option), change the function attribute in the IR based on the value of command line option “enable-unsafe-pf-math” (EnableUnsafeFPMath, defined in CommandFlags.h).

  2. Replace usages of TargetOptions::UnsafeFPMath with calls to a function which gets the value of attribute “unsafe-fp-math” in the IR.

  3. Stringify function attribute “unsafe-fp-math” and append it to the string that is used as the lookup key in TargetMachine::getSubtargetImpl(const Function&). Also, pass the function attribute to the subtarget constructors that need it to construct itself (e.g., ARM) or construct one of the backend objects (e.g., X86, which needs it in the constructor of X86TargetLowering).

  4. In MachineFunction’s constructor, call TargetMachine::getSubtargetImpl(const Function &) to initialize STI (pointer to the per-function subtarget object). Currently, the version of TargetMachine::getSubtargetImpl that doesn’t take a const Function& parameter is called, which returns the module-level subtarget object.

  5. Fix ARMAsmPrinter::emitAttributes to compute the value of TargetOptions::UnsafeFPMath based on the function attributes of all the functions in the module being compiled (see the link below).

http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079904.html

  1. Move the code in CGCall.cpp that sets the function attributes to BackendUtil.cpp. clang should set the function attributes regardless of whether it is compiling from source code or from an IR file (e.g., clang foo1.ll -o foo1.s -ffast-math), but currently this happens only if it’s compiling from source.

Any comments and suggestions are welcome. If we can agree on the main issues, I’ll follow up with patches that implement the changes I proposed.

Whatever happened to tracking the safe-or-fast-ness of FP math on instructions? Is tracking this property at a function granularity correct? I seem to recall nobody wanted to thread this through the SDAG.

If we agree that we want to track this at the function level, then your proposal seems reasonable.

From: "Reid Kleckner" <rnk@google.com>
To: "Akira Hatanaka" <ahatanak@gmail.com>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Monday, January 12, 2015 2:54:48 PM
Subject: Re: [LLVMdev] Enable changing UnsafeFPMath on a per-function basis

Whatever happened to tracking the safe-or-fast-ness of FP math on
instructions? Is tracking this property at a function granularity
correct? I seem to recall nobody wanted to thread this through the
SDAG.

No, I think we did want to do that, just no one has yet done it. We now have NSW/NUW in SDAG, so it should not be too much different for the FP flags.

-Hal

Hi Akira,

To continue the discussion I started last year (see the link below) on embedding command-line options in bitcode, I came up with a plan to improve the way the backend changes UnsafeFPMath on a per-function basis. The code in trunk currently resets TargetOptions::UnsafeFPMath at the beginning of SelectionDAGISel::runOnMachineFunction to enable compiling one function with “unsafe-fp-math=true” and another with “unsafe-fp-math=false”, but this won’t work if we want to parallelize the backend in the future, which is something Eric mentioned in his talk last year.

Here is the link to the original discussion I started last year:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078785.html

Thanks for coming back around to this. I’d hoped to have most of this handled, but as you can tell it’s a nice herd of yaks to get everything playing nicely.

These are the changes I plan to make:

  1. In llc.cpp (and any other tools that use the option), change the function attribute in the IR based on the value of command line option “enable-unsafe-pf-math” (EnableUnsafeFPMath, defined in CommandFlags.h).

  2. Replace usages of TargetOptions::UnsafeFPMath with calls to a

function which gets the value of attribute “unsafe-fp-math” in the IR.

These two combined seems a little weird. Do you mean that you plan on, effectively, changing the IR if someone passes a command line flag to an existing set of IR?

  1. Stringify function attribute “unsafe-fp-math” and append it to the string that is used as the lookup key in TargetMachine::getSubtargetImpl(const Function&). Also, pass the function attribute to the subtarget constructors that need it to construct itself (e.g., ARM) or construct one of the backend objects (e.g., X86, which needs it in the constructor of X86TargetLowering).

Another thought is to put it as part of the feature string rather than adding it as a separate attribute. That way you won’t have to add a lookup in each target.

  1. In MachineFunction’s constructor, call TargetMachine::getSubtargetImpl(const Function &) to initialize STI (pointer to the per-function subtarget object). Currently, the version of TargetMachine::getSubtargetImpl that doesn’t take a const Function& parameter is called, which returns the module-level subtarget object.

This can be done first if you like at the moment, though you may run into other problems and testing may be somewhat minimal.

  1. Fix ARMAsmPrinter::emitAttributes to compute the value of TargetOptions::UnsafeFPMath based on the function attributes of all the functions in the module being compiled (see the link below).

http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079904.html

Yes.

  1. Move the code in CGCall.cpp that sets the function attributes to BackendUtil.cpp. clang should set the function attributes regardless of whether it is compiling from source code or from an IR file (e.g., clang foo1.ll -o foo1.s -ffast-math), but currently this happens only if it’s compiling from source.

This is an interesting solution, but involves changing IR files and I don’t know that we want to do this (as I mentioned above). But I do admit I like it more than having to check both TargetOptions and the attribute to do it.

Another thing you haven’t brought up is the inliner (or other cross function optimizations)- what do you plan on doing there? Think two files, one compiled with and one compiled without and linked together.

-eric

Basically my plan is to change the backend to use the value of
"unsafe-fp-math" that is recorded in the IR as a function attribute instead
of reading the value from TargetOptions::UnsafeFPMath (which is global). It
doesn't change at which level (function or instruction level) fast FP math
is tracked or make it harder to change selection sag to track it at a finer
granularity, if someone decides to do that later.

Hi Eric,

Hi Akira,

To continue the discussion I started last year (see the link below) on
embedding command-line options in bitcode, I came up with a plan to improve
the way the backend changes UnsafeFPMath on a per-function basis. The code
in trunk currently resets TargetOptions::UnsafeFPMath at the beginning of
SelectionDAGISel::runOnMachineFunction to enable compiling one function
with “unsafe-fp-math=true” and another with “unsafe-fp-math=false”, but
this won’t work if we want to parallelize the backend in the future, which
is something Eric mentioned in his talk last year.

Here is the link to the original discussion I started last year:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078785.html

Thanks for coming back around to this. I'd hoped to have most of this
handled, but as you can tell it's a nice herd of yaks to get everything
playing nicely.

These are the changes I plan to make:

1. In llc.cpp (and any other tools that use the option), change the
function attribute in the IR based on the value of command line option
“enable-unsafe-pf-math” (EnableUnsafeFPMath, defined in CommandFlags.h).

2. Replace usages of TargetOptions::UnsafeFPMath with calls to a

function which gets the value of attribute “unsafe-fp-math” in the IR.

These two combined seems a little weird. Do you mean that you plan on,
effectively, changing the IR if someone passes a command line flag to an
existing set of IR?

Yes, my current plan is to change the existing attributes in the IR when
command line options are passed. I wasn't sure if changing the IR was a
good idea, but it seemed to be the simplest way to handle command line
options passed to llc using the current cl::opt infrastructure. This also
depends on how the command line infrastructure will look like after Chris
checks in his patches.

3. Stringify function attribute “unsafe-fp-math” and append it to the

string that is used as the lookup key in
TargetMachine::getSubtargetImpl(const Function&). Also, pass the function
attribute to the subtarget constructors that need it to construct itself
(e.g., ARM) or construct one of the backend objects (e.g., X86, which needs
it in the constructor of X86TargetLowering).

Another thought is to put it as part of the feature string rather than
adding it as a separate attribute. That way you won't have to add a lookup
in each target.

Yes, that's another way to pass the option, which is probably less
error-prone than passing each function attribute. Does it mean that we
define SubtargetFeatures for unsafe-fp-math and other options too? The
downside of this approach is that it will needlessly create new subtarget
objects for targets that currently don't need to specialize subtargets
based on the value of "unsafe-fp-math", although I'm not sure how much
impact it will have on memory.

4. In MachineFunction’s constructor, call

TargetMachine::getSubtargetImpl(const Function &) to initialize STI
(pointer to the per-function subtarget object). Currently, the version of
TargetMachine::getSubtargetImpl that doesn’t take a const Function&
parameter is called, which returns the module-level subtarget object.

This can be done first if you like at the moment, though you may run into
other problems and testing may be somewhat minimal.

OK. I'll spend a little more time investigating and testing and send a
patch for it if everything goes well.

5. Fix ARMAsmPrinter::emitAttributes to compute the value of
TargetOptions::UnsafeFPMath based on the function attributes of all the
functions in the module being compiled (see the link below).

http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079904.html

Yes.

6. Move the code in CGCall.cpp that sets the function attributes to
BackendUtil.cpp. clang should set the function attributes regardless of
whether it is compiling from source code or from an IR file (e.g., clang
foo1.ll -o foo1.s -ffast-math), but currently this happens only if it’s
compiling from source.

This is an interesting solution, but involves changing IR files and I
don't know that we want to do this (as I mentioned above). But I do admit I
like it more than having to check both TargetOptions and the attribute to
do it.

Another thing you haven't brought up is the inliner (or other cross
function optimizations)- what do you plan on doing there? Think two files,
one compiled with and one compiled without and linked together.

I haven't thought much about how I should fix the inliner as I was thinking
this should be tackled as a separate problem. Since "unsafe-fp-math" is
already embedded in the IR as a function attribute, we don't have to wait
for my changes to be checked in and we can start fixing the problem today.
Is that right?

I’m very concerned about how to do this because parsing command lines happens before IR is loaded, and it is conceivable that a single set of options could be used for multiple modules, or alternatively different options could be used on different modules in the same process.

Your earlier patches were storing options temporarily in global data, which seems really fragile.

-Chris

FWIW, I think it is a mistake to pursue the per-function modeling of this
given how close we are to having proper per-instruction tracking of this. I
would much rather see work going toward that if we think that is the right
long-term thing for LLVM...

If this is super simple to do for per-function granularity, cool. But if it
is adding lots of complexity (which I'm worried about with Chris B's
comments below) I think it makes more sense to focus on doing per-DAG-node
fast math flags.

We already sort of model fast math flags at per-function level by resetting
TargetOptions before selection dag is run. My proposal just makes changes
that are needed to parallelize the backend (sorry about the misleading
title of this thread).

If modeling the flags at per-instruction level is the right thing to do, I
agree that I shouldn't proceed with my current plan. I think exposing fp
flags in selection dag needs a little more work than r210467 (which exposed
nsw and nuw) did. In particular, some of the targets set operation actions
based on the value of TargetOptions::UnsafeFPMath (see
X86TargetLowering::resetOperationActions), which means legalization has to
examine the flags in addition to the opcode and type to get the operation
action.

Long term, I definitely think this is the right way to go. It lets us
optimize code which has been inlined from an unsafe fp math function into a
normal function.

Is there a thread that already discuss that other than the two years old one I found: http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-October/054999.html

Thanks,

Mehdi

Instruction-level FMF has potential benefits beyond just inlined code. A
couple of recent examples posted here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080364.html

I think I should move the discussion to other command line options for now.
Based on the feedback I received, it seems that fixing selection-dag to
model fp math flags at instruction (selection-dag node) level is the
direction we should be heading. I believe that will be a fairly large
project and also it's mostly orthogonal to the problem I'm trying to solve.
I'll look into it later when I am done fixing the other options.

Before I move on to the next command line option, I'd like to discuss a few
more things that were brought up in this thread. Let me know what you think.

Hi Eric,

Hi Akira,

To continue the discussion I started last year (see the link below) on
embedding command-line options in bitcode, I came up with a plan to improve
the way the backend changes UnsafeFPMath on a per-function basis. The code
in trunk currently resets TargetOptions::UnsafeFPMath at the beginning of
SelectionDAGISel::runOnMachineFunction to enable compiling one function
with “unsafe-fp-math=true” and another with “unsafe-fp-math=false”, but
this won’t work if we want to parallelize the backend in the future, which
is something Eric mentioned in his talk last year.

Here is the link to the original discussion I started last year:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078785.html

Thanks for coming back around to this. I'd hoped to have most of this
handled, but as you can tell it's a nice herd of yaks to get everything
playing nicely.

These are the changes I plan to make:

1. In llc.cpp (and any other tools that use the option), change the
function attribute in the IR based on the value of command line option
“enable-unsafe-pf-math” (EnableUnsafeFPMath, defined in CommandFlags.h).

2. Replace usages of TargetOptions::UnsafeFPMath with calls to a

function which gets the value of attribute “unsafe-fp-math” in the IR.

These two combined seems a little weird. Do you mean that you plan on,
effectively, changing the IR if someone passes a command line flag to an
existing set of IR?

Instead of changing the IR, I'm considering adding a flag to the fields of
TargetOptions that indicates whether the corresponding option has been
specified on the command line. Initially, changing the function attribute
value in the IR seemed to be a good idea (or at least acceptable) to me,
but changing all instructions with fp math flags in the IR is probably not
a good idea.

I'm considering defining a template class like this:

template<typename ValTy>
struct Option {
ValTy Val;
bool Occurred;
};

Option<unsigned> StackAlignmentOverride;

The field "Occurred" is used to decide whether Option.Val should override
the function attribute value in the IR. For example, passing
"-stack-alignment 64" to llc sets StackAlignmentOverride=(64, true). The
backend passes will look at StackAlignmentOverride and get its value if
Occurred==true, otherwise try to get the function attribute value in the IR.

Yes, my current plan is to change the existing attributes in the IR when

command line options are passed. I wasn't sure if changing the IR was a
good idea, but it seemed to be the simplest way to handle command line
options passed to llc using the current cl::opt infrastructure. This also
depends on how the command line infrastructure will look like after Chris
checks in his patches.

3. Stringify function attribute “unsafe-fp-math” and append it to the

string that is used as the lookup key in
TargetMachine::getSubtargetImpl(const Function&). Also, pass the function
attribute to the subtarget constructors that need it to construct itself
(e.g., ARM) or construct one of the backend objects (e.g., X86, which needs
it in the constructor of X86TargetLowering).

Another thought is to put it as part of the feature string rather than
adding it as a separate attribute. That way you won't have to add a lookup
in each target.

Yes, that's another way to pass the option, which is probably less
error-prone than passing each function attribute. Does it mean that we
define SubtargetFeatures for unsafe-fp-math and other options too? The
downside of this approach is that it will needlessly create new subtarget
objects for targets that currently don't need to specialize subtargets
based on the value of "unsafe-fp-math", although I'm not sure how much
impact it will have on memory.

4. In MachineFunction’s constructor, call

TargetMachine::getSubtargetImpl(const Function &) to initialize STI
(pointer to the per-function subtarget object). Currently, the version of
TargetMachine::getSubtargetImpl that doesn’t take a const Function&
parameter is called, which returns the module-level subtarget object.

This can be done first if you like at the moment, though you may run into
other problems and testing may be somewhat minimal.

OK. I'll spend a little more time investigating and testing and send a
patch for it if everything goes well.

Are there any plans to define a separate class for the module-level
subtarget? The reason I ask is that it seems pretty easy to mistakenly use
the module-level subtarget when the function-level subtarget should be used
and those kind of bugs are very hard to find.