[RFC] Add std.source and std.sink for test case writing/reduction

The other day while I was reducing a test case, I found the need for some generic ops that I can use to stand in for arbitrary other ops. Say I’m debugging an op that I suspect is the cause of an error:

%7861 = "some.op"(%7855, %7832) : (!some.type, !some.other_type) -> !yet.another_type
... computation that uses %7861 

While reducing the test case, one natural step is to replace “some.op” with a dummy op that creates the value %7861 (in general there might be multiple results as well). We don’t seem to have a std op that does this, so I propose a std.source op:

%7861 = source : !yet.another_type
... computation that uses %7861 

The same applies to a “sink” op. Consider the same starting IR snippet as above, but now we believe there is something about the use of %7855 which is problematic. Then I can delete the op and replace it with just

sink %7855 : !some.type

Ops with these semantics are also useful for testing passes. For many scenarios one can use function arguments / returns as “source”/“sink”, but in the presence of control flow that approach falls down.
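To illustrate the control-flow case, here is a minimal sketch. It assumes the `source`/`sink` syntax proposed above (these ops exist only in the patch), uses the std-dialect spelling of branches, and treats "some.op" and the `!some.type`/`!yet.another_type` types as placeholders:

```mlir
// Sketch only: `source` and `sink` are the ops proposed in this RFC;
// "some.op" and the types are placeholders for the op under test.
func @reduced(%cond: i1) {
  cond_br %cond, ^bb1, ^bb2
^bb1:
  // The stand-in value is defined right where the original producer was,
  // rather than at function entry as a function argument would be.
  %0 = source : !some.type
  %1 = "some.op"(%0) : (!some.type) -> !yet.another_type
  // ^bb1 does not end in a return, so keeping %1 alive through a function
  // result would mean threading it through block arguments to ^bb2.
  sink %1 : !yet.another_type
  br ^bb2
^bb2:
  return
}
```

With `sink`, the use stays local to the block being studied and the rest of the control flow can be left untouched.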

In LLVM-land (or when reducing at the C++ level), we typically would spell them as external functions, but in MLIR that’s not always possible, since there is an open-ended notion of what a “function” could be. Depending on the context it’s not obvious (or just not possible) how to create a “call to an external function”.

Just as in the case of LLVM test case reduction with external functions, there is never any intent to actually execute the code at runtime. It’s just a structural device to tickle a compiler bug.

Another option here is to just use an op in an unregistered dialect. That works somewhat well, but it requires passing -allow-unregistered-dialect and is generally more typing and cognitive overhead. Also, I made the std.source op in the patch NoSideEffect (which I think is usually what is desired), which is not possible with an unregistered op. (Also, the test dialect is not linked in by most users of MLIR, so having these ops there doesn’t materially help.)
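For comparison, the unregistered-op version of the two snippets above might look like the following sketch. The `reduce.source` and `reduce.sink` names are made up for illustration; any op name in a dialect that is not registered would do, and the tool has to be told to accept unregistered dialects:

```mlir
// Hypothetical op names in an unregistered dialect, written in the generic
// op syntax. These parse only when unregistered dialects are allowed.

// Stand-in for the op that produced %7861.
%7861 = "reduce.source"() : () -> !yet.another_type

// Stand-in for a use of %7855.
"reduce.sink"(%7855) : (!some.type) -> ()
```

The generic form has to spell out the full function-style type signature, which is part of the extra typing mentioned above.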

What do folks think?

Patch: https://reviews.llvm.org/D79683

So you are proposing to add ops with no runtime semantics to standard dialect to avoid setting a command line flag?

Why? To enable free re-ordering? But wouldn’t you already have an error case with an existing ordering and so retaining the order removes one source of change?

When you put it that way, I guess it doesn’t make much sense :slight_smile:

My thought process was that when replacing many ops with std.source ops, it is useful to allow them to be deleted if unused.
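As a sketch of what that buys during reduction: the snippets below assume the proposed `source`/`sink` syntax, assume that `source` is NoSideEffect as in the patch, and assume that `sink` is not side-effect free (the thread does not confirm the latter). Before cleanup, `%a` has lost all of its uses:

```mlir
%a = source : !some.type
%b = source : !some.other_type
sink %b : !some.other_type
```

Since an unused op with no side effects is trivially dead, a cleanup such as canonicalization would be expected to leave only:

```mlir
%b = source : !some.other_type
sink %b : !some.other_type
```

An unregistered op has no known traits, so it must conservatively be kept even when its result is unused.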

I would much rather avoid having this in the std dialect; I’d be OK having this in the test dialect though.
Otherwise, just using an unregistered op should work fine. Even if it requires setting a flag, this is only used for test reduction, so the impact does not seem that bad.

I am usually doing exactly this, just a matter of habit.

Regarding the ops, I’m fine having these, but I don’t think they belong to the standard dialect. I understand the idea of hooking on the standard dialect that is almost always registered though, but it’s almost :slight_smile:

I use unregistered ops with the -allow-unregistered-dialect flag too when needed, and I appreciate that the error message when you forget to do that gives you the exact flag to add :slight_smile:

It seems like these might fit in as “ops useful for compiler engineers and test cases” but not in standard. IREE has grown a few such ops over the months as well (and has struggled with where to put them). Unregistered ops are fine as long as a default op is what you need, but as Sean says, as soon as you need to customize it in some way, you need to add it somewhere.

So it sounds like we have some consensus that having these ops is valuable, but just that std is not the right place.

As far as putting them in the test dialect, my understanding is that the test dialect is a dialect used to test the MLIR core IR infra (e.g. exercise all the features, like ops with regions, custom terminators, ODS features, etc.), not a dialect used for “compiler testing” in a general sense.

IREE’s do_not_optimize and similar ops seem like they would fit in a “for compiler testing” dialect as well. The key invariant for this dialect is that we would expect all *-opt tools to register it (much as they register a common set of command-line flags), even if production flows do not.

Does that make sense to folks? We could call it testing or compiler perhaps?

As an aside, this would be a useful process to document in the debugging guide Sean added a while back. (It gets down to pass-level granularity but not op-level.)

The test dialect is set up to be an internal part of the in-tree MLIR tests and it has a bunch of corresponding things in it that I don’t think should be exposed/relied on (not to mention being defined outside of the normal conventions). testing seems not bad and it would leave room for it to have other ops that we’ve found a need for (checks, etc). It is just quite a broad name.

Other options (none of which I love):

opttools, tooling, mlirtooling, testtools, testhelpers?

What, other than two ops without any traits (and so equivalent to unregistered ops for most purposes), would go in such a dialect?

See Stella’s link above: https://github.com/google/iree/blob/e17d29e281c42e37e00ac18036cd7f267f623c48/iree/compiler/Dialect/IREE/IR/IREEOps.td#L58

“do not optimize” and things that decompose into that like “unfoldable constant” probably.

Possibly related (if considering a name like “test”): IREE’s Check Dialect

@_sean_silva: it isn’t clear to me how “do not optimize” is different from an unregistered op?

The Check Dialect seems more related to some sort of runtime-specific implementation of assertion and other runtime tests?

Yes - I included it to potentially contrast with an attempt to claim the “testing” name. This check dialect contains ops that I would more traditionally expect to be in a thing named testing.

Back to the original point, we use it for distilled test cases that are checked in or persist in some way, and the IREE compiler has a pass that specifically removes them late in its pipeline. We found the pattern useful enough to make it a named op and support it in production tooling so that the same tools could process distilled test cases and user programs (without special flags).

For more local use (i.e. in a specific FileCheck test that needs such behavior), I generally use unregistered ops. Ditto for ad-hoc *-opt workflows.

It is specifically preserved in various lowering passes, is used when lowering things like dynamic_shape_constant and is used in the canonicalization of unfoldable_constant. In general we do not allow unknown ops in these pipelines, so I think it has to be a registered op. I think when I originally created it I suggested it might be generally useful but didn’t get traction, so I stuck it in IREE.

That’s fair! In a traditional compiler you can always call an external unknown function to model this, but in MLIR not every dialect allows external calls, so I can see how useful this can be to model!
(this applies to source/sink as well actually)

It seems like actually the trio of source, sink, and opaque_identity (i.e. iree.do_not_optimize) can be viewed in that light, with the added advantage over an unknown op of having a bit of syntax sugar which is useful since they are frequently written by hand during test case reduction.
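A sketch of how the three might combine in a hand-written reduced test case follows. The `source` and `sink` syntax is the one proposed at the top of the thread; the `opaque_identity` spelling is purely hypothetical (the thread only names the concept), and `constant`/`addi` are the std-dialect spellings:

```mlir
// Hypothetical syntax throughout; none of these three ops exists as written.
%x = source : i32                  // value of unknown origin
%c = constant 42 : i32
%k = opaque_identity %c : i32      // same value as %c, but opaque to folding
%sum = addi %x, %k : i32           // %k is not seen as a constant here
sink %sum : i32                    // keeps %sum alive without returning it
```

Each op plays the role an external function call would play in an LLVM test case: producing a value, hiding a value from the optimizer, or consuming a value.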

I’m still struggling with what an appropriate dialect name for these would be (I agree that putting them in standard is not ideal). Trying some names:

  • testhint
  • util (I hate things called this but…)
  • opthint

A full name like useful_structural_ops_for_test_case_expression is too much :slight_smile:

Personally, it sounds to me like this is very closely related to testing and fuzzing. Why not give them semantics as a randomized input generator which could be hooked into appropriate verification infrastructure (e.g. alive2) and represent appropriate preconditions and test vector constraints? With this in mind (which doesn’t necessarily agree with your concept!), I’d suggest something like “fuzz” as a dialect name.