[RFC] FastMath flags support in MLIR (arith dialect)

jfurtek · January 26, 2022, 6:25pm

The LLVMIR dialect in MLIR currently has support for FastMath flags (nnan, ninf, nsz, …) that map directly to the LLVM equivalents for floating point instructions (e.g. fadd, fsub, …). I would argue that MLIR would benefit from supporting the FastMath concepts in other places, outside of the LLVMIR dialect. (More on the “where” in MLIR below.) Specifically:

there may be target-specific IRs that are not LLVM IR (see Tensor Codegen Thoughts, MLIR ODM 2020/01/23, slide 6)
FastMath semantics in many cases map directly to higher level constructs like vectors and tensors, and transformations may wish to leverage this behavior before lowering to the LLVMIR dialect.

There are already some cases where fast-math-related floating point behavior ambiguity exists (or did exist) in MLIR:

Canonicalization of ‘x + (+0.0)’ in tosa
conversion of complex::MulOp in ComplexToLLVM.cpp adopts the “naive” finite math lowering approach that would result from fast-math optimizations, whereas the same conversion (to the Arithmetic dialect) in ComplexToStandard.cpp generates the (dozens of) extra instructions to correctly handle Inf/Nan values
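
Both cases above come down to IEEE-754 corner cases that can be checked directly. The following is a minimal sketch in plain C++ rather than MLIR code; the helper names `addPositiveZero` and `naiveComplexMul` are made up for illustration, and it assumes the compiler itself is not invoked with fast-math options.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>

// x + (+0.0) is not an identity: for x == -0.0 the IEEE-754 result is +0.0,
// so folding the addition to x is only valid when the sign of zero may be
// ignored (the "nsz" fast-math flag).
inline double addPositiveZero(double x) { return x + 0.0; }

// "Naive" complex multiply: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
// No Inf/NaN recovery is attempted, which is what a finite-math lowering
// would produce.
inline std::pair<double, double> naiveComplexMul(double a, double b,
                                                 double c, double d) {
  return {a * c - b * d, a * d + b * c};
}
```

With x = -0.0 the addition returns +0.0, and the naive formula applied to (Inf + 0i) * (Inf + 0i) yields a NaN imaginary part, which is the kind of result the longer ComplexToStandard lowering exists to avoid.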

The documentation for the floating point instructions in the Arithmetic dialect (example here) suggests FastMath attributes as a “distant future” TODO.

Considerations:

It seems sensible to build on what LLVM and clang have done here.

Initially, the MLIR FastMathFlags would be identical to the LLVM fast-math flags. This would allow for straightforward lowering to LLVM IR.
LLVM allows fast-math flags for floating point instructions (e.g. fadd, fmul, fdiv, …) as well as the phi, select, and call instructions (which, in MLIR, are not in the arith dialect). However, it seems feasible to restrict the scope of the FastMathFlags attribute to the arith dialect:
- LLVM optimizations to call instructions with fast-math flags seem to be limited to optimizations that leverage specific knowledge of the meaning of LLVM intrinsics or known library calls (e.g. sqrt()) that may be inlined/LTO. It seems that the existing MLIR framework can accomplish similar optimizations on known functions (if desired), without requiring fast-math attribute support for the std.call operation.
- There is no corresponding phi node in MLIR
- LLVM does perform some optimizations on select instructions with floating point arguments when fast-math flags are present. MLIR has a select operation in the std dialect. In spite of this, I would think it makes more sense to confine FastMathFlags to the arith dialect, as opposed to cluttering the to-be-replaced-at-some-point std dialect.
Should MLIR have a.) a fast-math attribute with a default value of “no-fast-math-optimizations”, or b.) an optional fast-math attribute?
- clang seems to have gone through some evolution in terms of interpreting an unset fast-math bit as “unspecified” vs. “intentionally unset to forbid optimizations.” It seems feasible in MLIR that pipelines could, for example, set fast-math for all operations that don’t have the (optional) attribute present, and keep other (numerically sensitive) operations (that have a specific fast-math attribute present) untouched. (An alternative with more granularity would be a set of optional boolean attributes.) I think that the optional BitEnumAttr attribute approach would provide some flexibility.

Proposed changes:

Creation of a FastMathFlags attribute type (more specifically, a BitEnumAttr) in the arith dialect
Addition of an optional FastMathFlags attribute to floating point operations in the arith dialect
Addition of a FastMath interface to the floating point operations in the arith dialect, patterned after the existing interface in the LLVMIR. This interface would be used (primarily) to apply modifications to the FastMathFlags for operations that support it.
Development of passes that use the FastMathFlagsInterface to add/modify fast-math flags for supporting operations
Progressive addition of fast-math-aware transforms/folding implementations for floating point arith operations
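
As a rough illustration of the first item, a bit-enum value type could look like the sketch below. This is plain C++ and not the ODS-generated code; the flag names follow LLVM's fast-math flags, while the bit positions and the helper `hasAll` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the proposed attribute's value type.
// Flag names mirror LLVM's fast-math flags; bit positions are arbitrary here.
enum class FastMathFlags : std::uint32_t {
  none     = 0,
  reassoc  = 1u << 0,
  nnan     = 1u << 1,
  ninf     = 1u << 2,
  nsz      = 1u << 3,
  arcp     = 1u << 4,
  contract = 1u << 5,
  afn      = 1u << 6,
  // "fast" is a group case that sets every individual bit.
  fast     = reassoc | nnan | ninf | nsz | arcp | contract | afn,
};

constexpr FastMathFlags operator|(FastMathFlags a, FastMathFlags b) {
  return static_cast<FastMathFlags>(static_cast<std::uint32_t>(a) |
                                    static_cast<std::uint32_t>(b));
}

constexpr FastMathFlags operator&(FastMathFlags a, FastMathFlags b) {
  return static_cast<FastMathFlags>(static_cast<std::uint32_t>(a) &
                                    static_cast<std::uint32_t>(b));
}

// True when every bit in `bits` is also set in `flags`.
constexpr bool hasAll(FastMathFlags flags, FastMathFlags bits) {
  return (flags & bits) == bits;
}
```

A fold that needs both nnan and nsz would then test `hasAll(flags, FastMathFlags::nnan | FastMathFlags::nsz)` before rewriting.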

jfurtek · January 31, 2022, 3:43pm

Pinging here for one more try at soliciting feedback - I’d rather avoid writing the code if the lack of replies indicates that there is lack of interest in reviewing proposed changes, or if the approach is wrong.

I have an out-of-tree dialect that needs to perform simplifications on floating point operations. Having the functionality within MLIR itself seems to make the most sense, and it seems like (from comments in the documentation) the intent was to have this capability at some point.

Thanks in advance!

mehdi_amini · January 31, 2022, 5:57pm

I think it is a very important thing, sorry for the lack of reply: it fell off my radar.

I’m confused by this: I don’t understand what you mean with your reference to clang, and the link to the LLVM mailing list is about the interactions of the per-instruction flag and the target-level / function-level flag I believe.

Another consideration could be to make it a builtin attribute, which would allow it to be reused at multiple levels without cross-dialect dependency considerations.

jfurtek · January 31, 2022, 7:32pm

sorry for the lack of reply: it fell off my radar.

No worries…

I’m confused by this

IIUC, with LLVM/clang, originally there were no per-instruction flags, and (according to that thread and some others) there were situations where interpretation of the function-level and target-specific attributes was unclear, even after the per-instruction attributes started to appear. Quoting that thread (bold emphasis mine):

The core of my complaint about the current state of things is that the backend is treating the fast-math flags as if they are “on or undecided” rather than “on or off.” IMO, the absence of a fast-math flag in the IR means that the behavior controlled by that flag is not permitted. That’s the way the IR is treated in IR-level optimization passes, but the backend codegen (at least in places) behaves as though the absence of a fast-math flag means “not permitted unless enabled by TargetOptions.” That’s bad as a starting point, but it’s particularly bad when you start linking together IR from multiple modules that may have been created from different compilation units with different options.

(Sorry - I could have been clearer about which part of the post I was focused on, and maybe I’m imagining the similarities…)

I’m assuming that MLIR would adopt the instruction-level control approach that LLVM now has.

I’m trying to anticipate how MLIR should treat floating point operations - in particular in cases where the Op is created without a specific fastmath attribute. That would include everything in MLIR right now, but even if we look forward to full MLIR support I think there may be some things to consider.

I was thinking of two options, but maybe there are others:

A. Float operations have a fastmath attribute with a default case of none (i.e. no traditional fastmath optimizations are allowed). The fastmath behavior is set at MLIR generation time, and modifying the attribute would require a pass to either “clobber” all fastmath attributes, or distinguish (somehow) between “safe to change fastmath” and “not safe to change fastmath” operations.
B. Float operations have an optional fastmath attribute. The absence of a fastmath attribute could, in theory, be treated differently by transformations than the presence of an attribute with fastmath = none. As an example, a frontend might explicitly disable fastmath flags for, say, activation functions, and leave other fastmath attributes unspecified. A pass might then enable fastmath for operations that did not have the (optional) attribute provided. Maybe results are compared for accuracy by some outer compiler loop, and the transformations are reverted if fastmath yields unacceptable results. I guess one could think of MLIR’s support for optional attributes as providing an extra “unspecified” fastmath enum case.

From a compiler perspective, explicit instruction-level attributes seem to (mostly) address the problem of ambiguity in what optimizations are allowed. But maybe the optional aspect of MLIR attributes (which I don’t think can be done for fastmath in LLVM right now) might provide some flexibility for front ends or pass frameworks.

Another consideration could be to make it a builtin attribute, which would allow it to be reused at multiple levels without cross-dialect dependency considerations.

Having fastmath as a builtin attribute (so that it could be used from multiple dialects) was my initial thought. (I had a feeling that the recent efforts to compartmentalize things from std into, say, arith would favor placing the attribute with the floating point operations.) I don’t have a strong preference.

mehdi_amini · January 31, 2022, 7:53pm

Right, the transition to per-instruction has been painful in LLVM, but this is mostly because this used to be handled globally (and per-function).
We don’t have this initial state in MLIR though, so starting with per-instruction flags seems cleaner (our technical debt here will be to add handling for these flags in every canonicalization, etc.)

This is something I’m not sure I understand, I don’t quite grasp the difference between “fast-math is allowed” and “fast-math is unspecified”, which you interpret as “it may be turned on”.

jfurtek · January 31, 2022, 9:40pm

I don’t quite grasp the difference between “fast-math is allowed” and “fast-math is unspecified”, which you interpret as “it may be turned on”.

This is precisely what clang/LLVM were doing, right? fastmath flags had the value none, and the implementation was using the attribute from “somewhere else” (which in this case was TargetOptions). Having another possibly conflicting source of flags wasn’t good for implementing the back end, but there may be some other uses for the idea.

To be clear, I am not suggesting that there should be any ambiguity from a backend/codegen perspective. If an Operation has no fastmath attribute, this is interpreted the same as if it had fastmath = none. Any transformations that would fold/canonicalize/replace would have well-defined behavior.

I think that allowing an unspecified fastmath attribute may have value in manipulating an IR, before the transformations that would use fastmath.

Consider an MLIR module with 10,000 float instructions, and 1,000 that are known to be numerically sensitive. On these, fastmath would be set to none, prohibiting traditional fastmath optimizations. What about the remaining 9,000 floating point instructions? The impact of applying fastmath optimizations to these remaining instructions to overall accuracy is unknown at the time of MLIR generation. If the fastmath attribute were to be unspecified when generating the initial MLIR, then a pass could do something like:

if (fastmath.not_specified()) fastmath.set_fast();

After lowering and compilation, the output is compared to reference for accuracy. In this way, whether or not the attribute was specified provides an additional op filter, similar to finding all ops of a given type. In this case, the optional fastmath attribute provides a way to defer setting the fastmath flags.
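
To make the idea concrete, here is a minimal sketch in plain C++ (not MLIR API). `FloatOp`, `Fmf`, `effectiveFlags`, and `setFastWhereUnspecified` are hypothetical names, and the flags are collapsed to a two-valued enum for brevity. It shows the two roles described above: optimizations read an absent attribute as none, while an exploration pass only touches ops whose attribute was never set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Simplified flag type: only "none" and "fast" are modeled in this sketch.
enum class Fmf : std::uint8_t { none, fast };

struct FloatOp {
  // Absent means "the producer did not specify anything".
  std::optional<Fmf> fastmath;
};

// What folds and canonicalizations would see: absence is treated as none.
inline Fmf effectiveFlags(const FloatOp &op) {
  return op.fastmath.value_or(Fmf::none);
}

// Exploration pass: enable fast only where nothing was specified.
// Returns the number of ops that were changed.
inline std::size_t setFastWhereUnspecified(std::vector<FloatOp> &ops) {
  std::size_t changed = 0;
  for (FloatOp &op : ops) {
    if (!op.fastmath) {
      op.fastmath = Fmf::fast;
      ++changed;
    }
  }
  return changed;
}
```

Ops that the frontend explicitly marked as none keep that value, so the numerically sensitive subset is left alone without any extra bookkeeping.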

I’m not attached to making fastmath an optional attribute - I just thought that there might be a use for it. Maybe there is a way to do the same thing with discardable attributes or something.

kiranchandramohan · January 31, 2022, 9:57pm

Fast math flags are probably useful for the math dialect as well. When the math dialect operation is lowered the fast attribute can potentially be present on the llvm intrinsic that corresponds to the math function, or on the call instruction to the math routine, or in some cases can even be encoded into the mangled name of the math function.

Can this be viewed as a use-case for Fast Math flags/attributes outside the arith dialect?

In Clang there is now a kind of umbrella flag for modelling floating-point behaviour, ffp-model=<strict|precise|fast>. Would it make sense to model this in MLIR?
https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-model

jfurtek · January 31, 2022, 10:28pm

Fast math flags are probably useful for the math dialect as well.

You are right! I should have seen that before now - I was focused on a different use case. I think this means that FastMathFlags would have to be a builtin attribute so that it can be used in both arith and math dialects (and maybe others).

In Clang there is now a kind of umbrella flag for modelling floating-point behaviour,
ffp-model=<strict|precise|fast> . Would it make sense to model this in MLIR?

Yes - I think that we could adopt those umbrella flags as well. The BitEnumAttr recently changed to allow enum cases that are effectively aliases for a group of individual bits. Setting one of the “umbrella” values would set the corresponding individual bits (assuming we define all those in the clang table), but we would also retain the ability to use a subset of the individual bits.
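
A sketch of how such umbrella cases might sit next to the individual bits, in plain C++. The mapping is an assumption for illustration: precise is taken to be contract only (as discussed in this thread), fast is every bit, and strict is no bits. Clang's strict model also controls rounding and exception behavior, which fast-math flags do not express, so this is only a partial model. The namespace `fmf` and the function `flagsForModel` are made-up names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

namespace fmf {
constexpr std::uint32_t none     = 0;
constexpr std::uint32_t reassoc  = 1u << 0;
constexpr std::uint32_t nnan     = 1u << 1;
constexpr std::uint32_t ninf     = 1u << 2;
constexpr std::uint32_t nsz      = 1u << 3;
constexpr std::uint32_t arcp     = 1u << 4;
constexpr std::uint32_t contract = 1u << 5;
constexpr std::uint32_t afn      = 1u << 6;

// Group cases: aliases for a set of individual bits (assumed mapping).
constexpr std::uint32_t strict  = none;
constexpr std::uint32_t precise = contract;
constexpr std::uint32_t fast =
    reassoc | nnan | ninf | nsz | arcp | contract | afn;
} // namespace fmf

// Translate an -ffp-model style name into a flag set; unknown names
// produce an empty optional.
inline std::optional<std::uint32_t> flagsForModel(std::string_view model) {
  if (model == "strict")  return fmf::strict;
  if (model == "precise") return fmf::precise;
  if (model == "fast")    return fmf::fast;
  return std::nullopt;
}
```

Because the group cases are ordinary bit patterns, a client can still test or clear an individual bit after an umbrella value has been applied.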

mehdi_amini · January 31, 2022, 11:28pm

Not really: “unknown” and “false” are the same thing in LLVM (unless it changed?)
What LLVM has been doing is allowing a global flag to override the per-instruction flag (when it comes to the backend at least). Mostly because it took longer for the backend to support per-instruction flag compared to LLVM.

Then you can mark them as “fast-math is allowed”.
Your flow still seems possible: you can perform lowering and compilation, run, check for accuracy, and then remove fast math flags if it isn’t satisfying.

Basically I think for your distinction to make sense you would need to justify a distinction between “fast-math is allowed” and “fast-math shouldn’t be disallowed”.

Another issue you have is that the user may want to express that “fp-contraction” is allowed but nothing else. However that would fit your definition of “if (fastmath.not_specified()) fastmath.set_fast();”, so it seems that what you want is an orthogonal attribute.
Maybe even a tri-state for each FMF: “allowed”, “disallowed”, “flip_a_coin” (or “unset” as you prefer ).

These are a convenient way for frontend users to control the individual flags that get added. It seems that precise is entirely equivalent to the contract fast-math flag, isn’t it?

kiranchandramohan · February 1, 2022, 10:36am

Yes, precise is equivalent to contract=ON in the Clang modelling.
WRT math intrinsic functions, there can be different implementations provided by a vendor for each of the options (fast, precise, strict). Retaining the model information can help in choosing which implementation to pick.

Mogball · February 2, 2022, 1:42am

+1. This is a long, long-standing TODO.

Some of the operations in the math dialect could make use of the fast-math flags. math and arith are closely related dialects.

mehdi_amini · February 2, 2022, 1:42am

I guess I still haven’t understood the fundamental difference at this point, or rather why this difference is useful. This “I don’t know” behavior is weird to me, and I’m still grappling with it.

To be clear: I understand that part fine. I see that you want to model “I-have-not-yet-determined-the-behavior”, I am just not sure why and how it differs from “fast-math is allowed” in practice.

If I really try to imagine some use-cases for this (like the one you describe), I’m not convinced I would include them in the way FMF is modeled and instead handle them separately with a domain-specific flag.

This is mostly because any analysis or transformation will need to have a binary answer: “allowed”/“disallowed”. The “maybe” state in between has no interpretation here for any FMF client.

It seems then there is something I clearly don’t understand in your motivating use-case, or what you really want here.
You haven’t really elaborated on this with respect to my example: you may know that fp-contract is always safe, so you want to add it everywhere. However you don’t know about all the more “aggressive” options: how do you handle it now? Your use-case seems that you want to leave out the possibility of adding them later. But because you added fp-contract you can’t express this optionality anymore.

I couldn’t make sense of your distinction and optionality so far: that does not mean you’re not right though, so it is worth discussing it more

jfurtek · February 2, 2022, 6:00pm

I’m enjoying the discussion, and I am happy to continue it. I’m also happy to defer to your expertise here, and I’m not 100% sure that my thoughts are well-formed enough to insist on something…

I see the traditional clang model as one that specifies fastmath flags at compilation time, but I think that MLIR should strive to adopt something more “fluid”. In other words, with C/C++ and clang, if you want to change fastmath flags, you recompile the C/C++ source code. With MLIR, maybe there is a situation in which the fastmath flags are varied and iterated on, without leaving MLIR.

It sounds as if you are suggesting that in order to allow subsequent fast-math optimizations (or potential exploration of them), “fast-math is allowed” should be the default in MLIR. This doesn’t seem right to me, since that is the opposite of the default for everything else. (Maybe I’m misunderstanding the suggestion.) At the very least, it seems like having this be the “best practice” for front ends would generate confusion.

I was thinking, perhaps too narrowly for my use cases, that the workflow for FastMathFlags would work something like this:

If the FMF flags are fixed/known for all ops, then front ends should set them to the desired value. (Perhaps this is the most common case.) Lowering to LLVMIR is a straightforward mapping of MLIR attributes. Otherwise…
For operations with “fixed/known” FMF behavior, set the FMF flags to these known values. This might include setting numerically sensitive computations to none (or whatever combination is known to be acceptable for these operations), and others to fast. The presence of these FMF attributes will be honored and not modified (unless some sort of override is forced).
For non-fixed FMF operations, leave the FMF flags unset in the generated IR (by not providing an attribute). These non-fixed ops might include those that will inherit a “global” default value, as well as those that will be determined via experimentation. Passes can be used to selectively apply FMF flags to all FMF ops that have unset FMF attributes. (There could be other selection criteria as well, but unset attributes is a particularly convenient one.)

This essentially partitions all FP nodes into “fixed FMF” and “variable FMF” groups. The exploration process would gradually move nodes from the “variable” to the “fixed” group. Any node without an attribute would be treated as none for FMF purposes by optimization passes. This really only provides a “single bit” of flexibility - but I would argue that in some cases that is enough.

This assumes a very narrow definition of “FMF client”. Clearly, any optimization that rearranges FP operations, or folds constants needs concrete values for FMF flags, and the proposal addresses that - conservatively. (When the optional attribute is not present, it is assumed that no optimizations are safe, just as if there were a none FMF value present.)
But I would argue that a pass that, for example, modifies FMF flags (as part of some iterative exploration) is also a “client.” Going back to my original example, which modifies FMF flags only if they are “unset” - doesn’t that qualify as an “interpretation”? I can understand an argument that there may be other ways to do it, or that it isn’t how MLIR should approach it, but I don’t see how, using your words, there is “no interpretation here for any FMF client.” It is a very limited (on or off) interpretation, but one with very low storage and computation overhead.

My objection to the “tri-state” approach (which is something that I had considered) was based mostly on practical, software engineering considerations. I agree that this approach handles the general case of per-flag “deferral/unknown,” but it seems to me like it does so at a drastic cost in both size and runtime overhead. (If, for example, I wanted to use MLIR for a JIT compiler I might care very much about this.) Perhaps a custom attribute type or data structures would limit the impact, but I’m not sure it is worth the effort. In contrast, optional attributes exist already in MLIR, and the overhead seems minimal. So if it were true that the optional attribute handles 90% of the use cases, to me it seems like a reasonable approach. (And maybe it doesn’t handle 90%, or maybe there are no practical use cases, but that was the context of my “not what I want” statement.)

That is correct - once the FMF attribute is set to any valid value, the ability to interpret it as “unknown” is lost.

But in my experience, the “90%” case is choosing between fastmath=fast and fastmath=none. Other cases are certainly valid and should be supported. In this scenario, FastMathFlags aren’t “8 separate 1-bit variables” that are optimized independently, rather it is often treated as a single “atomic” value.

I can imagine cases where some sort of dynamic attribute would be better suited to solve the “exploration of independent fastmath flags” scenario. But it seems to me that adding multiple optional boolean attributes to every FP instruction (to handle the general case) will bloat the IR, and that is a bell that would be difficult to un-ring.

I’m primarily interested in having a FastMathFlags attribute attached to FP operations. On top of that, I think the optional aspect adds a potentially useful way to add some flexibility to modifying FastMathFlags, via the MLIR pass infrastructure, for the most common “changing FMF as a single value” use case, with (almost) zero time and space overhead. (It seems like any in-memory space overhead of FastMathFlags will be incurred by every FP operation.) More sophisticated usage scenarios could rely on dynamic attributes - and for workflows that aren’t yet defined, or that may be used infrequently, IMO dynamic attributes would be preferable to adding features to all MLIR FP operations.

mehdi_amini · February 3, 2022, 8:14am

It seems that you are anchored to the runtime cost of the feature: it seems negligible to me, or at least clearly not a bloat. We’re talking about doubling the number of bits of a very small bitfield that will be stored uniquely in the Context. Just the pointer used to point to this struct alone will cost more storage space than the FMF themselves.

It seems like any in-memory space overhead of FastMathFlags will be incurred by every FP operation.

Because of how we handle attributes: the cost of storage on a per-op basis is null. Each operation has a pointer to a structure allocated in the context, pooled amongst all the operations.
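
The pooling can be pictured with a toy model like the one below. It is a deliberately simplified sketch in plain C++, not MLIR's actual storage uniquer; `Context`, `FmfStorage`, and `getFmf` are hypothetical names. Equal flag values always resolve to the same storage object, so the flag payload is stored once per distinct value rather than once per operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// One storage object per distinct flag value.
struct FmfStorage {
  std::uint32_t flags;
};

class Context {
public:
  // Returns the unique storage for `flags`, creating it on first use.
  const FmfStorage *getFmf(std::uint32_t flags) {
    std::unique_ptr<FmfStorage> &slot = pool_[flags];
    if (!slot)
      slot = std::make_unique<FmfStorage>(FmfStorage{flags});
    return slot.get();
  }

  std::size_t numUniqued() const { return pool_.size(); }

private:
  std::unordered_map<std::uint32_t, std::unique_ptr<FmfStorage>> pool_;
};

// An operation only refers to the pooled storage.
struct Op {
  const FmfStorage *fastmath = nullptr;
};
```

In this model, ten thousand ops carrying the same flags share one `FmfStorage`, and widening the bitfield changes the size of that one object only.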

Regardless, let’s leave the implementation cost for a separate discussion, I’m more interested in the usefulness of the feature.

Sure, but by the same logic the >90% of cases also will never ever need an “optional” semantics.
In reality: it is hard to anticipate because when looking at LLVM we think “clang” and other similar flow while MLIR is applied to radically different use-cases.
Also I suspect a lot of people would want to allow contraction as a default behavior instead, which wouldn’t interact well with the kind of auto-tuner you are alluding to.
Another example I know of is a GPU compiler for which allowing reciprocal was also something always on by default (the HW actually had higher precision with reciprocal).

I see it as a “producer” instead, or as something entirely different from the compiler point of view. It seems like an “out-of-band” thing that does not look like any “regular” client (should I have used “consumer” instead maybe?).

Ultimately for your idea of an auto-tuner to work, you need to have the frontends create IR by making a choice (differentiate between “fast-math isn’t allowed” and “you may allow fast-math”, which is different than “fast-math is allowed” somehow): but if the frontend has to express intent to the auto-tuner, why not make it with another attribute entirely? The ops that are intended to be tuned could be tagged with an attribute enable.fmf.autotune for example. Doing so also allows to set for example the allow contraction flag while still expressing the intent to allow an override (this should also address your runtime performance concern).
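
A minimal sketch of that separation, in plain C++ with made-up names: the `autotune` member stands in for something like enable.fmf.autotune, and `applyCandidate` is a hypothetical tuner step. The flags always carry a concrete meaning for the compiler, while the tag alone tells the tuner which ops it may change.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kNnan     = 1u << 1;
constexpr std::uint32_t kContract = 1u << 5;

struct FloatOp {
  std::uint32_t fastmath = 0; // always a concrete "allowed" set
  bool autotune = false;      // stand-in for a separate tuning attribute
};

// Tuner step: add candidate flags on tagged ops only, keeping whatever
// the frontend already allowed (e.g. contract).
inline void applyCandidate(std::vector<FloatOp> &ops, std::uint32_t extra) {
  for (FloatOp &op : ops) {
    if (op.autotune)
      op.fastmath |= extra;
  }
}
```

An op can thus start with contract enabled and still be offered to the tuner, which the optional-attribute scheme cannot express.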

I’m saying that the very definition of “fast-math” is to allow the compiler to do fast-math optimization: it isn’t a guarantee about anything else. From this point of view I’m mostly saying that the distinction with the “unknown” state is not clear: you’re also just expressing that you “allow” the compiler/tooling to exploit fast-math.
From this point of view, if you intend to use such an auto-tuner, you could start by marking everything that is tunable as “fast” and then let the auto-tuner disable the “fast” mode on some operation. It seems to me that you get the exact same tuning process and the same search space: unless you also want to be able to limit the search space by expressing to the auto-tuner that it should never disallow fast-math when it has been allowed already.

jfurtek · February 3, 2022, 9:48pm

I was comparing an optional BitEnumAttr to 32 different optional BoolAttrs to implement your tri-state suggestion. (I could be wrong, but I don’t think the overhead is as small in that case.) Certainly, the same could be implemented with a separate bitmap and some extra logic to combine bits from the two. I’m aware of the “unique” storage aspect for attributes - I was under the impression that the overhead per op would be a “pointer per attribute”, with multiple ops pointing to the same storage space when the attribute value is the same. Maybe there is something more complex going on.
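
For reference, the "separate bitmap" variant of the tri-state idea might look like this sketch in plain C++. `TriStateFmf`, `FlagState`, `query`, and `resolve` are hypothetical names, and the bit positions are arbitrary. One mask records which flags were decided, the other records the decision, and unset flags take their value from a fallback when a concrete answer is needed.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNnan     = 1u << 1;
constexpr std::uint32_t kNsz      = 1u << 3;
constexpr std::uint32_t kContract = 1u << 5;
constexpr std::uint32_t kAll      = 0x7fu; // all seven flags

enum class FlagState { allowed, disallowed, unset };

struct TriStateFmf {
  std::uint32_t specified = 0; // bit set: a decision was made for this flag
  std::uint32_t value = 0;     // the decision; meaningful only where specified

  FlagState query(std::uint32_t bit) const {
    if (!(specified & bit))
      return FlagState::unset;
    return (value & bit) ? FlagState::allowed : FlagState::disallowed;
  }

  // Collapse to a plain flag set: decided bits win, undecided bits come
  // from `fallback`.
  std::uint32_t resolve(std::uint32_t fallback) const {
    return (value & specified) | (fallback & ~specified);
  }
};
```

Resolving with a fallback of zero gives the conservative reading used by optimizations, while a tuner can resolve with any candidate set without overriding explicit decisions.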

And they are paying (approximately?) zero price for it.

This is incorrect - you could leave those values as unset in the source MLIR, and then always combine contract with something else when setting the flags via a pass. Regardless, I agree that the optional approach doesn’t handle all (or even most) potential scenarios.

The example pass both consumes (reads) and produces (writes) an FMF value. I’m not sure I understand this part of the thread. Are you saying that an “undefined” FMF value is “out of band” because it doesn’t make sense to certain parts of the workflow (i.e. the compiler)? If so, then I agree - an optional FMF is an “out of band” convenience that could be used for IR manipulation. But there are other parts of MLIR that don’t make sense to a compiler back end - like unrealized_conversion_cast. (That seems pretty “out of band” to me, but I’m OK with it. )

IMO, this flag seems a little obscure to add to the IR - if for no reason other than readability in text form. (How many more will there be?)

I feel like the unknown state means just that: unknown. It doesn’t mean anything in and of itself, other than that the generation process (or previous pass) did not assign a value - full stop.

That can be used, by different “clients,” to do any number of things. A pass in an autotuner might use this to allow the compiler/tooling to exploit fast math, as you suggest above. Another pass might do something else to replace that op entirely. It seems like you are projecting a meaning onto it based on just one “client” (lowering to LLVMIR), even though that client doesn’t see the unset value. (All FMF attributes would be set, or if they aren’t set, interpreted as none.)

At a basic level, I see the optional aspect as a way to filter FP operations. Going back to my motivating example, if I have an MLIR file with a function containing 50 mulf operations, and I want to treat 10 of them differently (from a fastmath perspective), and I know that they are different at generation time. I can imagine a way to generate the .mlir file, and then provide a simple option to mlir-opt that will modify only those instructions (albeit supporting limited combinations), and then proceed through the lowering/compilation process. The runtime/storage/text_readability impact (over non-optional FastMathFlags) is (essentially) zero, and IMO it has a conceptual simplicity to it. It seems like you are advocating for the more general case, but even if the overhead is small, there is no guarantee that it will be used, or that it will be adequate for some currently unknown use case.

I’m not attached to the optional part - I was trying to anticipate what might be useful to avoid having to later introduce a breaking change. I understand your points - maybe it is a case of “agree to disagree,” along with “perfection is the enemy of progress.”

What do you think about this plan?

1.) add FastMathFlags as a required attribute (not optional) with a default value, and implement fold transformations as appropriate
2.) later add enable.fmf.autotune (or an additional is_fixed bit enum, or something else, after further reflection) as needed to handle modification of FMF attributes within MLIR, after step 1 lands

mehdi_amini · February 5, 2022, 6:42am

1.) add FastMathFlags as a required attribute ( not optional) with a default value, and implement fold transformations as appropriate

That seems fine, but we need to consider what flags it’ll include. If it is just what LLVM provides, then it is basically about promoting the existing llvm.fastmath attribute for general uses, otherwise we need to look at it more carefully.
Also by saying it’ll be a required attribute, instead of considering that the absence of the attribute implies that there is no FMF enabled, you mean that you’d prefer the op attribute dictionary to always have an entry for the FMF even when no flag is set? But in this case it’d point to one where all flags are unset? What makes you favor this model compared to the existing one where the absence of the attribute implies no flag set?

jfurtek · February 7, 2022, 11:01pm

As a first pass, I was intending to just use LLVM FastMath values - this would definitely give some underlying structure to things like x * 0 = 0 transformations in MLIR, and would be straightforward to lower to the LLVMIR dialect. (There are also some small changes that need to happen to the FMF attribute that is currently in LLVMIR before being promoted to general use. I can elaborate here, or we can save it for the code review…)
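
As an example of the structure the flags give to a fold like x * 0 = 0: without flags the rewrite is wrong, because x * 0 is NaN when x is Inf or NaN and is -0.0 when x is negative. The sketch below is plain C++ with a hypothetical helper `foldMulByConstant`, and it assumes the same condition that, as far as I understand it, LLVM uses for this simplification (both nnan and nsz present).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::uint32_t kNnan = 1u << 1;
constexpr std::uint32_t kNsz  = 1u << 3;

// Try to fold `x * rhs` for an unknown x. Returns the folded constant,
// or an empty optional when the fold is not allowed.
inline std::optional<double> foldMulByConstant(double rhs,
                                               std::uint32_t flags) {
  const bool mayIgnoreNaNs = (flags & kNnan) != 0;
  const bool mayIgnoreZeroSign = (flags & kNsz) != 0;
  // rhs == 0.0 is true for both +0.0 and -0.0.
  if (rhs == 0.0 && mayIgnoreNaNs && mayIgnoreZeroSign)
    return 0.0;
  return std::nullopt;
}
```

With no flags, or with nnan alone, the helper declines to fold; only the combination of nnan and nsz produces the constant zero.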

As for “FastMathFlags things that clang/LLVM might do differently if starting over”, I know that the behavior of isinf()/isnan() in the presence of fastmath flags has generated lots … of … discussion recently. It seems to me, at first glance, that MLIR can avoid some of the ambiguity by adding isinf()/isnan() operations with FastMathFlags attributes at some point. (I know that ComplexToStandard is using cmpf uno right now.) But I don’t see a different MLIR-specific fastmath flag solution to advocate here yet.

Yes.

Existing where? If I’m not mistaken, the existing FMF attribute in the LLVMIR dialect is not optional - it is a DefaultValuedAttr<>.

It seems to me that using “attribute absence” in place of an explicit fastmath=none enum value would be clumsy for code that enumerates different fastmath values. For example:

llvm::SmallVector<FastMathFlags, 3> fmf = { none, contract, fast };
for (auto f : fmf) {
    // Retrieve Operation to modify...
    op.setFastMathFlags(f);
    // lower, run tests, evaluate results, ...
}

Representing the none case in the loop above as, instead, the absence of an optional attribute would require (IIUC) removing the attribute.

It does seem like the “optional” attribute case would allow a more concise assembly format when the attribute isn’t present (compared to having fastmath="none"), but I don’t see any other advantage. (If I’m missing something let me know.)

mehdi_amini · February 8, 2022, 1:35am

I’m not sure what it’d look like, but we should leave this for another discussion specific to this topic

ODS is a bit misleading here, the attribute may not be present but the C++ API will construct one for you for convenience:
The verifier accounts for the attribute not being provided:

static ::mlir::LogicalResult __mlir_ods_local_attr_constraint_LLVMOps5(
    ::mlir::Operation *op, ::mlir::Attribute attr, ::llvm::StringRef attrName) {
  if (attr && !((attr.isa<::mlir::LLVM::FMFAttr>()))) {
    return op->emitOpError("attribute '") << attrName
        << "' failed to satisfy constraint: LLVM fastmath flags";
  }
  return ::mlir::success();
}

And the actual accessor is encapsulating the fact that the attribute may not be present:

::mlir::LLVM::FastmathFlags FAddOp::getFastmathFlags() {
  auto attr = getFastmathFlagsAttr();
  if (!attr)
    return ::mlir::LLVM::FMFAttr::get(::mlir::Builder((*this)->getContext()).getContext(), {}).getFlags();
  return attr.getFlags();
}

So unfortunately DefaultValuedAttr isn’t much different than Optional, it is just convenient “sugar” at the C++ API level but does not allow to assume that the attribute is really present.

This could also be handled in the setter itself. Your “none” would be replaced by FastMathFlags{} and the setter could detect it and delete the attribute entry.
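
A toy version of such a getter/setter pair, in plain C++ with a `std::map` standing in for the op's attribute dictionary. The `Op` struct and the "fastmath" key are assumptions for illustration, not the generated MLIR accessors.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct Op {
  // Toy attribute dictionary: attribute name -> flag bits.
  std::map<std::string, std::uint32_t> attrs;

  // Absence of the entry reads as "no flags set".
  std::uint32_t getFastMathFlags() const {
    auto it = attrs.find("fastmath");
    return it == attrs.end() ? 0u : it->second;
  }

  // Setting the empty flag set deletes the entry instead of storing it,
  // so callers never have to special-case "none".
  void setFastMathFlags(std::uint32_t flags) {
    if (flags == 0)
      attrs.erase("fastmath");
    else
      attrs["fastmath"] = flags;
  }
};
```

A loop that iterates over none, contract, and fast can then call the setter uniformly, and the dictionary stays free of an explicit "none" entry.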

Actually the textual format can always be tweaked to favor any default we’d like (like eliding the none case is trivial if it isn’t optional), but also we generally should optimize for the in-memory representation instead.

I don’t quite see a strong argument one way or another right now for this.

Something remaining to be defined is where to put this attribute if not in the builtin dialect; I don’t quite see an alternative right now.
@River707 ?

jfurtek · February 8, 2022, 2:11am

Thanks for the explanation - I wasn’t aware how that works…

I figured that is possible, but it seems strange (to me) to have an op that has a fastmath=none attribute present (in-memory), and then elide that from the output text format. The reason that seems strange is that upon the following read, the op presumably would then be created without the optional attribute (since there was no text for the attribute present). So the round trip of memory → MLIR file → memory changed the actual internal representation (if not the actual semantics in this case). I’m not sure that I understand the intended MLIR semantics of DefaultValued and Optional (compared to what those mean in C++ I guess), so maybe I need to think about this more.

Why builtin? (Or why would there be no alternative to builtin?) It would seem that an attribute in arith would be usable from other dialects (as others have recently helped me understand), and it seems likely that a dialect that uses FastMathFlags would already have a dependency on arith anyway.

mehdi_amini · February 8, 2022, 2:32am

Not necessarily: the parser for the operation can have whatever behavior. You could decide that an op with an integer attribute could elide it when the value is 42 (for example because it would be this value in 99% of the cases…).

Yeah no, that wouldn’t be OK, the printer and parser should be in sync to preserve the round-trip.
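
A small sketch of what "in sync" means here, in plain C++. The `AddF` struct and the textual form `addf fastmath<fast>` are made up for this example and are not actual MLIR syntax. The printer elides the default, and the parser restores the same default when the text is absent, so the value survives a round trip.

```cpp
#include <cassert>
#include <string>

// Toy op with a single two-valued fastmath setting.
struct AddF {
  bool fast = false; // false plays the role of fastmath = none
};

// Printer: the default (none) is elided from the text.
inline std::string print(const AddF &op) {
  return op.fast ? "addf fastmath<fast>" : "addf";
}

// Parser: missing text means the default, matching the printer.
inline AddF parse(const std::string &text) {
  return AddF{text.find("fastmath<fast>") != std::string::npos};
}
```

Whether the in-memory form stores an explicit none or nothing at all is then a separate choice from what the text shows, as long as parse(print(op)) yields the same flags.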

If we believe that any dialect that would need to use FastMathFlags can easily take a dependency on arith then that would be fine, is it though?
