How is llvm-opt-fuzzer supposed to be built and used with a pass pipeline?

Hello List,

I'm currently writing my own little optimization pass (on LLVM 6.0) and
thought it would be neat to fuzz it with llvm-opt-fuzzer, which as far
as I can tell should be a ready-made tool for such jobs and could help
me find UB and Address issues in my pass.

So I followed the instructions in the build manual [1] and built LLVM's
llvm-opt-fuzzer as "RelWithDebInfo" with clang / clang++ on my Ubuntu
18.04.1 LTS instance (using its default clang, which is version 6.0).
When I ran llvm-opt-fuzzer, it complained that it wasn't linked against
libFuzzer and that no fuzzing would be performed. So I hacked the
Link.txt file for llvm-opt-fuzzer in my CMake build directory to add the
-fsanitize=fuzzer flag and remove the dummy object file from the link.
Now it actually looks at the corpus, but then immediately gives up with

"ERROR: no interesting inputs were found. Is the code instrumented for
coverage? Exiting."

At this point I'm lost: I lack experience with CMake and libFuzzer and
don't know how to build LLVM with the required instrumentation.

So my (first) question is:

What are the proper arguments to pass to CMake to actually get
llvm-opt-fuzzer to work as intended?

Additionally, my pass requires -loop-simplify to have been run
beforehand (which apparently can't be requested using
AnalysisUsage.addRequired<>()). So I tried passing
'-passes "loop-simplify mypass"' to llvm-opt-fuzzer, but it was rejected
with "./llvm-opt-fuzzer: can't parse pass pipeline". Naturally I looked
for documentation of this format, but my search only turned up the fact
that LLVM applies all passes to one function / module before moving on
to the next, for locality reasons.

So my (second) question is:

What are the proper arguments to pass to llvm-opt-fuzzer to have it run
more than one pass, e.g. first loop-simplify and then DCE?

Alternate (third?) question:

Is there any way to require the loops be in simplified form for your own
pass short of re-implementing loop-simplify yourself in your pass?

I hope somebody here can and is willing to help me.

Kind Regards

Jean-Pierre Münch

[1]: Building LLVM with CMake — LLVM 16.0.0git documentation

P.S.: While on my above "adventure" I noticed that building LLVM with
clang and -DLLVM_USE_SANITIZER="MemoryWithOrigins" fails to complete
because it apparently detects a bug in one of the build helper tools.

+Matt Morehouse +Justin Bogner

Jean-Pierre Münch via llvm-dev <llvm-dev@lists.llvm.org> writes:

So my (first) question is:

What are the proper arguments to pass to CMake to actually get
llvm-opt-fuzzer to work as intended?

There is some documentation about this, but it's admittedly easy to miss:

  https://llvm.org/docs/FuzzingLLVM.html#configuring-llvm-to-build-fuzzers

Most importantly, you'll want to configure your build with at least the
-DLLVM_USE_SANITIZE_COVERAGE=On flag, and you'll probably want to use
-DLLVM_USE_SANITIZER=Address as well.

Also do note that if you have compiler-rt checked out, it shouldn't be
built with coverage, so you'll want the -DLLVM_BUILD_RUNTIME=Off flag to
cmake too.
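
As a rough sketch, the flags above could be combined into one configure
step as shown below. The three -DLLVM_* options are the ones named in
this reply; the build type and compilers are taken from the original
question. SRC_DIR, BUILD_DIR and the helper name fuzzer_cmake_flags are
hypothetical and only there for illustration. The script prints the
commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: SRC_DIR and BUILD_DIR are hypothetical paths; adjust them
# to wherever your LLVM checkout and build directory live.
SRC_DIR="${SRC_DIR:-../llvm}"
BUILD_DIR="${BUILD_DIR:-llvm-fuzz-build}"

# Emit the configure flags, one per line, so they are easy to review.
fuzzer_cmake_flags() {
  printf '%s\n' \
    -DCMAKE_BUILD_TYPE=RelWithDebInfo \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DLLVM_USE_SANITIZE_COVERAGE=On \
    -DLLVM_USE_SANITIZER=Address \
    -DLLVM_BUILD_RUNTIME=Off
}

# Dry run: print the commands instead of executing them.
# Remove the leading 'echo' on each line to actually configure and build.
echo mkdir -p "$BUILD_DIR"
echo cd "$BUILD_DIR"
echo cmake $(fuzzer_cmake_flags) "$SRC_DIR"
echo cmake --build . --target llvm-opt-fuzzer
```

Building only the llvm-opt-fuzzer target is a choice made here to keep
the instrumented build short; a full build should work as well.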

So my (second) question is:

What are the proper arguments to pass to llvm-opt-fuzzer to have it run
more than one pass, e.g. first loop-simplify and then DCE?

For simple pass pipelines like this, you can list the passes separated
by commas, like -passes="loop-simplify,dce". There's some description of
the pass pipeline syntax in the doxygen for the function that parses
these:

  http://llvm.org/doxygen/classllvm_1_1PassBuilder.html#a31150d6cb0017e0a2ce8e6a85265d2c1

There may be more user oriented docs for this somewhere else, but I'm
not sure where.
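
A small sketch of how such an invocation might be assembled is shown
below. The comma-separated pipeline is the syntax described above. The
-ignore_remaining_args=1 and -mtriple arguments follow the usage I
recall from the FuzzingLLVM page and should be checked against it;
CORPUS_DIR and the helper join_passes are hypothetical names. The script
only prints the command.

```shell
#!/bin/sh
# Assumptions: CORPUS_DIR is a hypothetical directory of seed inputs, and
# the fuzzer binary is in the current directory, as in the question.
CORPUS_DIR="${CORPUS_DIR:-corpus}"

# Join pass names with commas, the separator -passes expects.
# The body runs in a subshell so the IFS change does not leak out.
join_passes() (
  IFS=,
  printf '%s\n' "$*"
)

PIPELINE=$(join_passes loop-simplify dce)

# Dry run: print the command; remove 'echo' to start fuzzing.
echo ./llvm-opt-fuzzer "$CORPUS_DIR" -ignore_remaining_args=1 \
  -mtriple x86_64 -passes "$PIPELINE"
```

To fuzz a custom pass after loop-simplify, the pass name registered for
it (the hypothetical "mypass" from the question) would replace dce in
the join_passes call.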

Alternate (third?) question:

Is there any way to require the loops be in simplified form for your own
pass short of re-implementing loop-simplify yourself in your pass?

I don't believe there's a way to do that currently.

I hope somebody here can and is willing to help me.

Happy to help! Let me know if anything still isn't clear.

Thanks Justin, that solved all my problems!

One minor nitpick, though: maybe the fuzzer-build documentation should
mention that the fuzzer doesn't work with the standard ld linker and
that lld or gold (I didn't test the latter) should be used instead. That
would save people from building LLVM for 2 hours only to get this error
from the binary:

ERROR: The size of coverage PC tables does not match the
number of instrumented PCs. This might be a compiler bug,
please contact the libFuzzer developers.
Also check https://bugs.llvm.org/show_bug.cgi?id=34636
for possible workarounds (tl;dr: don't use the old GNU ld)

Apparently this is still an issue with modern versions of ld (e.g.
Ubuntu's 2.30).

Thanks again!

Jean-Pierre Münch
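
One way to select the linker at configure time is LLVM's
LLVM_USE_LINKER CMake option, which adds -fuse-ld=<name> to the link
invocation. The sketch below assumes lld is installed and usable by the
host clang; the helper name linker_flag and the ../llvm source path are
hypothetical. It only prints the configure command.

```shell
#!/bin/sh
# Assumption: lld is installed. 'gold' is the alternative mentioned in
# the message above, which the poster did not test.
LINKER="${LINKER:-lld}"

# Map a linker name to the corresponding CMake option, accepting only
# the two linkers suggested in this thread.
linker_flag() {
  case "$1" in
    lld|gold) printf '%s\n' "-DLLVM_USE_LINKER=$1" ;;
    *) echo "not one of the linkers suggested in this thread: $1" >&2
       return 1 ;;
  esac
}

# Dry run: print the configure command; remove 'echo' to execute it.
echo cmake "$(linker_flag "$LINKER")" \
  -DLLVM_USE_SANITIZE_COVERAGE=On ../llvm
```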

Thanks. I made some minor edits to the FuzzingLLVM docs based on your
questions and feedback in r339949. Hope it helps the next person!
