[discussion] Fuzzing Pattern Application

I have seen many cases of “incorrect pattern usage” in the last months in downstream projects; sometimes also in MLIR itself. E.g.:

  • Faulty patterns crash when applied in a different order. (Only works when patterns are applied in a certain order.)
  • matchAndRewrite return value is inconsistent with what the pattern is doing. (E.g., modifying IR but returning failure.)
  • Pattern modified IR without going through the rewriter.

As a first step, I built a fuzzer that randomizes the worklist of the GreedyPatternRewriteDriver (⚙ D142447 [mlir][WIP] GreedyPatternRewriteDriver: Worklist fuzzer). I found various cases of crash/“incorrect result computed” in multiple downstream projects. MLIR seems to be fine, but there are many “false positives”, where test cases check too aggressively for exact IR. (Most (all?) of these could be resolved with CHECK-DAG or proper variable/block name capturing.)

I wanted to get people’s opinion/thoughts on this. Is this useful? Also, how could such fuzzers/expensive checks can be controlled/enabled. Some options that come to mind:

  • Add a flag to mlir-opt. Downside: It is difficult to propagate this flag to the right place (e.g., GreedyPatternRewriterDriver.cpp or PatternApplicator.cpp).
  • Compile time flag, e.g., -DMLIR_ENABLE_FUZZER=ON.
  • A Gist on Github that users can patch in if they want to run the Fuzzer/other expensive checks.
3 Likes

Nice, having a way to root out unhygienic patterns is very appealing to me (see some concerns I have in a previous comment).

However, I think adding significantly more CHECK-DAG could be problematic as we also want reproducibility in the compiler: the default order we apply things in should always return the same results.

I’d find it useful to use this without performing the regex part but only checking for failures, errors, crashes and infinite loops in a first approximation.

There is also the problem of pattern interference / orthogonality that could indeed benefit from more CHECK-DAG but that I don’t see as clear bugs. It would still be useful to have offline signal on those and maybe something low-tech as a local sed s/CHECK:/CHECK-DAG:/g would be enough to capture cases where materially different IR is generated.

Am I reading correctly from your post that you ran fuzzing and did not find crashes / timeouts ?
That would mean the upstream MLIR patterns are pretty clean already, despite the API footguns, which would be quite encouraging for the overall health of the project.

1 Like

I was just fuzzing the worklist order. Upstream MLIR patterns are pretty resilient to that. I was not checking for invalid matchAndRewrite return values or direct IR manipulation (bypassing the rewriter) yet; I expect a few broken patterns in upstream MLIR (but not too many as we’ve already fixed a few with the recent changes to the GreedyPatternRewriter).

1 Like

That would seem like a good way to land this to me.

I don’t think we have a mlir-config.h generated by cmake yet, but I felt like I needed it a couple of times recently!

A config would be good indeed (in general, not specific to this).

I’d also recommend a different enable flag name. There are soooo many different fuzzing forms and we already have a fuzzer tool upstream while this would be at a different level. With a config file we could be pretty explicit and then even have classes of these.

That would be a nice fix to make the test focus more on the structure [I had that old change upstream to randomize SSA IDs printed to also identify other such hardcoded cases and we discussed adding a verify config for stuff like that, but that was more just to help clean up core at that point that I made it, although it is a useful thing to identify incidental checks]

Haven’t followed how we use config files, could you please point to an example?

Strong +1 on naming deconflation, I would personally run the fuzzer on all my local presubmits so I’d love if it were easy to enable, -DMLIR_ENABLE_GREEDY_REWRITER_WORKLIST_FUZZER=ON would work for me.

On the subject of cmake flag and variable naming/comments, I’m not sure if I would consider this fuzzing. The WIP patch does not seem to keep track of any internal state apart from the RNG, so to me it seems more like more general ‘random testing’. How about either something like -DMLIR_RANDOMIZE_GREEDY_REWRITER_WORKLIST + a new command line flag --mlir-rng-seed for reproducibility, or limit the scope a bit and plug into -DLLVM_REVERSE_ITERATION?

1 Like

The way it work is based on the make “configure_file” feature, see here: llvm-project/CMakeLists.txt at main · llvm/llvm-project · GitHub

configure_file(
  ${LLVM_MAIN_INCLUDE_DIR}/llvm/Config/llvm-config.h.cmake
  ${LLVM_INCLUDE_DIR}/llvm/Config/llvm-config.h)

Where the llvm-config.h.cmake contains configuration flags like this:

/* Define if threads enabled */
#cmakedefine01 LLVM_ENABLE_THREADS

and in your build folder you get a header “configured” based on the CMake variables, for example mine right now:

$ cat build/include/llvm/Config/llvm-config.h
...
/* Target triple LLVM will generate code for by default */
/* Doesn't use `cmakedefine` because it is allowed to be empty. */
#define LLVM_DEFAULT_TARGET_TRIPLE "arm64-apple-darwin22.3.0"

/* Define if threads enabled */
#define LLVM_ENABLE_THREADS 1

/* Has gcc/MSVC atomic intrinsics */
#define LLVM_HAS_ATOMICS 1

/* Host triple LLVM will be executed on */
#define LLVM_HOST_TRIPLE "arm64-apple-darwin22.3.0"
...