Handling optional input/output arguments

When working with frontend dialects, we encounter optional attributes as well as optional input/output arguments. TableGen already handles optional attributes well. One example of an optional input is the bias for Conv2D in Keras; the same optional input appears in ONNX’s CONV operation. There are many other examples.

We have identified a few ways to handle optional input/output:

  1. Creating separate operation variants in the dialect itself, and steering the creation of one or the other when encountering a frontend operation with optional arguments. The drawback is lower code reuse among the different variants.

  2. Creating a separate tensor type that accepts “no value”, and possibly adding an attribute that indicates the presence of an optional argument. This lets us use a common operation for all combinations of optional arguments. The verifier does not know the correlation between the attribute and the argument, so consistency checking is left to the developer.

  3. Adding support for optional arguments directly in TableGen, following the example of its handling of the optional attributes.

We currently use approach 1; I believe TF MLIR uses approach 2. Would the community be supportive of approach 3, to have a more standardized way to handle optional input/output arguments?

Hi Alex,

(I assume you meant “operands” because MLIR TableGen uses arguments for both operands and attributes, and an op’s region arguments are different from its TableGen ‘arguments’.)

It looks like you didn’t need to create different variants here. Custom op parsing can take care of it without that. Were you referring to strictly TableGen for custom op parsing as well (support for which was recently added)?

In the cases you are interested in, I assume you always know the number of operands (at compile time) whenever that optional info exists? If that’s the case, you really don’t need any of the three solutions you list. An attribute can say whether the optional operands exist, but the trickier part, how many there are (if constant), would be determined by looking at other information. For example, a typical case is when you need to look at the type of a previously parsed operand (e.g., the rank of a tensor/memref) to determine how many optional operands of a specific kind to expect. Extending TableGen for these special cases is feasible in theory but probably not worthwhile, because there could be non-trivial ways in which you determine how many operands of each optional kind you need. With TableGen, one could just have all the optional operands as trailing operands with Variadic, in cases where you don’t need variadic sub-lists for the preceding mandatory operands.

I have run into optional operands before - the AffineDmaStart/WaitOp has stride operands that are optional. We just use a custom form - strides are the only optional operands there and they are the trailing ones. You always know at compile time how many of the trailing operands are stride info. And in this case, you don’t even need an attribute to say there is stride info - again because those are the only optional operands.
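The trailing-operand idea can be sketched with a toy model. This is not MLIR code: `ToyValue`, `ToyTrailingOp`, and its fields are hypothetical names made up for illustration, and the sketch assumes the number of mandatory operands and the size of the optional trailing group are both known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an SSA value; NOT the MLIR Value class.
using ToyValue = int;

// Hypothetical op whose flat operand list is
//   [mandatory operands..., optional trailing group].
struct ToyTrailingOp {
  std::vector<ToyValue> operands;
  std::size_t numMandatory;       // assumed known at compile time
  std::size_t trailingGroupSize;  // size of the optional group when present

  // No attribute is needed: the group is present iff the extra operands exist.
  bool hasTrailingGroup() const {
    assert(operands.size() == numMandatory ||
           operands.size() == numMandatory + trailingGroupSize);
    return operands.size() > numMandatory;
  }

  // Returns the trailing optional operands, or an empty list when absent.
  std::vector<ToyValue> trailingGroup() const {
    if (!hasTrailingGroup())
      return {};
    return std::vector<ToyValue>(
        operands.begin() + static_cast<std::ptrdiff_t>(numMandatory),
        operands.end());
  }
};
```

Because the optional group is the only optional part and sits at the end, the total operand count alone is enough to tell whether it was provided.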

And in the most general case, where you don’t even know the number of optional operands of each type and need an extra leading operand just to give you that count: this would be extremely rare, and too complex to be worth worrying about!

I don’t think the issue is with parsing, but rather with the verifier: ODS will emit a verifier for its operands.
We use “Variadic” in TensorFlow (see llvm-project/mlir/include/mlir/IR/OpBase.td in llvm/llvm-project on GitHub) to model optional operands (I assume this is what @AlexEichenberger was referring to?).

Attributes are “easy” as they are key-value pairs; we don’t have this flexibility with operands. How do you see this playing? How would an accessor emitted by Tablegen return the right Operand for a given name?

How do you see this playing? How would an accessor emitted by Tablegen return the right Operand for a given name?

I don’t have the deep MLIR knowledge that you guys have, so the solution I am thinking of may be naive. I was thinking of one entry per input/output operand (whether optional or not), so there is no ambiguity about names or numbers, with the accessors returning an Optional< > template around the optional ones.

The ONNX dialect is riddled with optional inputs and outputs. I am not saying this is good; it’s just what folks working with such dialect designs have to deal with. We have seen other front-end dialects with them too.

We also use None in the TFLite dialect to signify optional. So it is not a tensor type, but a none value that can be created with a constant.

A problem was how to write patterns with optionals, which is what None helps with (there is always an operand; its value may just sometimes be none). One can still verify constraints on the operand when it is not none.

Currently one can do bias().getType().isa<NoneType>() to check whether the optional bias was provided. Of course this could be improved. An option would be to mark them as variadic: instead of Optional<> one uses Variadic<>, and if the variadic size is 0, the operand is absent (that could even work today …). As long as there is one ordering it is fine. Using variadic this way means that writing DRR patterns that match on whether an operand is set requires some extra work, but as long as all the arguments are still in the same order (even if empty, they need to be specified [at least for now]), it should work.
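To make the two conventions concrete, here is a minimal toy sketch in plain C++. None of it is MLIR or TFLite API: `ToyValue`, `kNone`, `biasFromSentinel`, and `biasFromVariadic` are hypothetical names, and the integer sentinel only plays the role that a none-typed value plays in the real dialect.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in for an SSA value; NOT the MLIR Value class.
using ToyValue = int;

// Hypothetical sentinel standing in for a value of none type.
constexpr ToyValue kNone = -1;

// Convention A: the operand slot always exists, but may hold the sentinel.
inline std::optional<ToyValue> biasFromSentinel(ToyValue biasOperand) {
  if (biasOperand == kNone)
    return std::nullopt;
  return biasOperand;
}

// Convention B: the operand is a variadic group holding zero or one value.
inline std::optional<ToyValue> biasFromVariadic(
    const std::vector<ToyValue>& group) {
  assert(group.size() <= 1 && "an optional operand has at most one value");
  if (group.empty())
    return std::nullopt;
  return group.front();
}
```

Both helpers give callers the same view, an optional bias. The difference is in the operand list: convention A keeps a fixed operand count, which is what makes pattern writing simpler, while convention B changes the count and relies on a fixed ordering.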

I view Optional as a special case of Variadic: it allows zero or one values instead of zero or more values. So emulating Optional with Variadic is certainly possible and it’s one way we are doing that right now. In general, following how we handle Variadic in ODS, we should be able to support Optional similarly and generate Optional<Value> instead of operand_range. It becomes tricky when an op has multiple optional values. Unlike Variadic, where a common use case is that all the variadic operands have the same count, I’m not sure we have such common cases for Optional. So we will likely end up always needing to attach AttrSizedOperandSegments to such ops.
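A toy sketch of why per-segment sizes resolve the ambiguity with several optional operands. This only models the idea and is not the code ODS generates: `ToySegmentedOp`, `segmentSizes`, and `getOptional` are hypothetical names, and `std::optional` stands in for the Optional<Value> accessor mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

// Toy stand-in for an SSA value; NOT the MLIR Value class.
using ToyValue = int;

// Hypothetical op with a flat operand list plus one size per declared operand.
// For an optional operand, the size is 0 (absent) or 1 (present).
struct ToySegmentedOp {
  std::vector<ToyValue> operands;
  std::vector<std::size_t> segmentSizes;

  // Returns the value of optional operand number `segment`, if present.
  std::optional<ToyValue> getOptional(std::size_t segment) const {
    assert(segment < segmentSizes.size());
    assert(segmentSizes[segment] <= 1);
    if (segmentSizes[segment] == 0)
      return std::nullopt;
    // The start index is the sum of the sizes of all preceding segments.
    std::size_t start = std::accumulate(
        segmentSizes.begin(),
        segmentSizes.begin() + static_cast<std::ptrdiff_t>(segment),
        std::size_t{0});
    assert(start < operands.size());
    return operands[start];
  }
};
```

With operands `{10, 30}` alone, one cannot tell whether the second value belongs to the second or the third declared operand. Sizes `{1, 0, 1}` settle it: the second operand is absent and `30` belongs to the third.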