llvm-mc-[dis]assemble-fuzzer status?

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not enclose LLVMFuzzerTestOneInput in extern "C".

Are those libraries maintained and/or used?

If yes, the code should be compilable, and ideally there should be a buildbot.
If no, maybe we should remove it, or move it to a separate repository.

Thanks,
George

(sorry for starting multiple threads, I believe this way it is more convenient to keep track of tasks)

George Karpenkov <ekarpenkov@apple.com> writes:

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not
enclose LLVMFuzzerTestOneInput in “extern ‘C’”.

Are those libraries maintained and/or used?

I believe both of these worked a couple of months back when I last tried
them.

If yes, the code should be compilable, and ideally there should be a buildbot.
If no, maybe we should remove it, or move it to a separate repository.

Now that libFuzzer is part of the clang toolchain it should be much
easier to get bots up that are building these tools. Previously it was a
bit awkward.

I think it makes sense to fix these ones.

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not enclose
LLVMFuzzerTestOneInput in “extern ‘C’”.

Are those libraries maintained and/or used?

If yes, the code should be compilable, and ideally there should be a
buildbot.

"there should be a buildbot" is actually two different questions.
1. There should be a bot that builds the fuzz targets and runs them on a
fixed set of inputs to ensure they don't bit-rot (and to use them as
regression tests).
This will require us to tweak the cmake machinery to allow building fuzz
target with regular flags (no coverage).
2. There should also be a bot that actually runs continuous fuzzing.
Our buildbots are not suitable for this, so I was planning to add the llvm
fuzzers to OSS-Fuzz (https://github.com/google/oss-fuzz).
We already run the cxa_demangler fuzzer there with quite a bit of success.
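
A minimal sketch of what the first kind of bot could run: a plain driver that feeds a fixed set of files to the fuzz target, with no coverage instrumentation and no libFuzzer runtime. The function name RunTargetOnFiles and the placeholder target body are assumptions made for illustration, not code from the LLVM tree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Placeholder fuzz target so the sketch is self-contained. A real bot would
// link the actual target (e.g. the llvm-mc fuzzers) instead.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  return 0;
}

// Hypothetical helper: runs the target once per readable file and returns
// the number of inputs that were executed. Unreadable paths are skipped.
size_t RunTargetOnFiles(const std::vector<std::string> &Paths) {
  size_t Executed = 0;
  for (const std::string &Path : Paths) {
    std::ifstream In(Path, std::ios::binary);
    if (!In)
      continue;
    std::vector<char> Bytes((std::istreambuf_iterator<char>(In)),
                            std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(reinterpret_cast<const uint8_t *>(Bytes.data()),
                           Bytes.size());
    ++Executed;
  }
  return Executed;
}
```

A small main() that passes argv[1..] to RunTargetOnFiles would turn this into a regression runner that any host compiler can build. A crash or assertion failure inside the target then fails the bot.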

I hope Daniel can answer the other questions.

George,

Thanks for doing the work to move libFuzzer to compiler-rt.

I probably touched these more recently than most and either didn’t deliver a complete patch or it’s rotted since then. In any case I haven’t gotten a chance to leverage it. But I like the idea of it arriving intact in the move to compiler-rt. If I’m able to get it back and running by the end of the week, would that be adequate?

Regarding a buildbot – I think that makes sense. I naively assumed that it would be a part of the default build set for the “all” target when building llvm. I figured that if there were ever any regression, that I or some other owner would be notified.

Hi Brian,

Great, thanks!
I’m not sure why it’s not built: maybe because we never run “all” when LLVM_USE_SANITIZE_COVERAGE is set, which is required to build that library.

I just meant building them, not even necessarily running them.
Then authors / people who make changes would notice, and it would get compiled.

I’m not sure why that would be necessary? We can have a checkout setup with LLVM_USE_SANITIZERS=ON.

Right, that would be great as well!

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not enclose
LLVMFuzzerTestOneInput in “extern ‘C’”.

Are those libraries maintained and/or used?

If yes, the code should be compilable, and ideally there should be a
buildbot.

"there should be a buildbot" is actually two different questions.
1. There should be a bot that builds the fuzz targets and runs them on a
fixed set of inputs to ensure they don't bit-rot (and to use them as
regression tests).

I’ve just meant building them, not even necessarily running.
Then authors / people who make changes would notice, and it would get
compiled.

This will require us to tweak the cmake machinery to allow building fuzz
target with regular flags (no coverage).

I’m not sure why that would be necessary? We can have a checkout setup
with LLVM_USE_SANITIZERS=ON.

And also -DLLVM_USE_SANITIZE_COVERAGE=YES
But that *almost* implies that the host compiler is fresh clang, i.e.
essentially we need a bootstrap bot.
It's possible (we have a few bootstrap bots, and
lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer used to be such
too),
but it complicates the set up and makes it much slower.
A non-bootstrap bot is much more likely to stay green most of the time.

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not enclose
LLVMFuzzerTestOneInput in “extern ‘C’”.

Are those libraries maintained and/or used?

If yes, the code should be compilable, and ideally there should be a
buildbot.

"there should be a buildbot" is actually two different questions.
1. There should be a bot that builds the fuzz targets and runs them on a
fixed set of inputs to ensure they don't bit-rot (and to use them as
regression tests).
This will require us to tweak the cmake machinery to allow building fuzz
target with regular flags (no coverage).
2. There should also be a bot that actually runs continuous fuzzing.
Our buildbots are not suitable for this, so I was planning to add the llvm
fuzzers to OSS-Fuzz (https://github.com/google/oss-fuzz).
We already run the cxa_demangler fuzzer there with quite a bit of success.

clang-fuzzer is now running on oss-fuzz, and here are two trophies so far:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3134
ASSERT: ParmVarDeclBits.ScopeDepthOrObjCQuals == scopeDepth && "truncation!"
(haven't seen before)

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3133
llvm: ASSERT: DelayedTypos.empty() && "Uncorrected typos!"
An old friend: 21905 – clang::Sema::~Sema(): Assertion `DelayedTypos.empty() && "Uncorrected typos!"' failed.

I'll add clang-proto-fuzzer soon.

Which other fuzz targets are worth adding to oss-fuzz?

Who else wants to be automatically CC-ed to all trophies?
(I'll need to add your e-mail here:
https://github.com/google/oss-fuzz/blob/master/projects/llvm/project.yaml)

Kostya Serebryany <kcc@google.com> writes:

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,
and for the second one I believe the reason is that it does not enclose
LLVMFuzzerTestOneInput in “extern ‘C’”.

Are those libraries maintained and/or used?

If yes, the code should be compilable, and ideally there should be a
buildbot.

"there should be a buildbot" is actually two different questions.
1. There should be a bot that builds the fuzz targets and runs them on a
fixed set of inputs to ensure they don't bit-rot (and to use them as
regression tests).
This will require us to tweak the cmake machinery to allow building fuzz
target with regular flags (no coverage).
2. There should also be a bot that actually runs continuous fuzzing.
Our buildbots are not suitable for this, so I was planning to add the llvm
fuzzers to OSS-Fuzz (https://github.com/google/oss-fuzz).
We already run the cxa_demangler fuzzer there with quite a bit of success.

clang-fuzzer is now running on oss-fuzz, and here are two trophies so far:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3134
ASSERT: ParmVarDeclBits.ScopeDepthOrObjCQuals == scopeDepth && "truncation!"
(haven't seen before)

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3133
llvm: ASSERT: DelayedTypos.empty() && "Uncorrected typos!"
An old friend: 21905 – clang::Sema::~Sema(): Assertion `DelayedTypos.empty() && "Uncorrected typos!"' failed.

I'll add clang-proto-fuzzer soon.

Which other fuzz targets are worth adding to oss-fuzz?

I'd like llvm-isel-fuzzer to be added once it's committed (which should
be as soon as LLVM fuzzers work in release builds again). One potential
issue is that llvm-isel-fuzzer is more of a collection of fuzzers, and
it needs some arguments to run (i.e., to choose the backend).

I'd like llvm-isel-fuzzer to be added once its committed

consider it done (once it's there)

(which should
be as soon as LLVM fuzzers work in release builds again). One potential
issue is that llvm-isel-fuzzer is more of a collection of fuzzers, and
it needs some arguments to run (ie, to choose the backend).

I have the same problem with clang-proto-fuzzer, which uses the same
approach with flags as llvm-isel-fuzzer.

The solution I was thinking about is (drum roll!) to encode the flags in
the binary name, e.g.
"./llvm-isel-fuzzer,-flag1,-flag2" and then read these flags from argv[0]
in LLVMFuzzerInitialize()

Then in oss-fuzz build.sh we will just do this:
for flags in -flag1a,-flag1b -flag2a,-flag2b; do
  cp llvm-isel-fuzzer $OUT/llvm-isel-fuzzer,$flags
done
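
A rough sketch of the argv[0] idea, assuming flags are separated by commas and contain neither commas nor spaces themselves. FlagsFromExecName and TargetFlags are hypothetical names; only the LLVMFuzzerInitialize signature comes from libFuzzer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits "dir/fuzzer,-flag1,-flag2" into
// {"-flag1", "-flag2"}. Only the file name is inspected, so commas in
// directory names are ignored.
std::vector<std::string> FlagsFromExecName(const std::string &Argv0) {
  std::vector<std::string> Flags;
  const size_t Slash = Argv0.find_last_of('/');
  const std::string Name =
      Slash == std::string::npos ? Argv0 : Argv0.substr(Slash + 1);
  size_t Pos = Name.find(',');
  while (Pos != std::string::npos) {
    const size_t Next = Name.find(',', Pos + 1);
    const std::string Flag = Name.substr(
        Pos + 1, Next == std::string::npos ? std::string::npos : Next - Pos - 1);
    if (!Flag.empty())
      Flags.push_back(Flag);
    Pos = Next;
  }
  return Flags;
}

// Flags recovered at startup, for the fuzz target to consult later.
static std::vector<std::string> TargetFlags;

// libFuzzer calls this once before fuzzing starts, if it is defined.
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  assert(argc && argv && *argc > 0);
  TargetFlags = FlagsFromExecName((*argv)[0]);
  return 0;
}
```

With the copies made by the loop above, a binary named "llvm-isel-fuzzer,-flag1a,-flag1b" would start with TargetFlags holding those two flags, and no wrapper script is involved.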

Kostya Serebryany <kcc@google.com> writes:

I'd like llvm-isel-fuzzer to be added once its committed

consider it done (once it's there)

(which should be as soon as LLVM fuzzers work in release builds
again). One potential issue is that llvm-isel-fuzzer is more of a
collection of fuzzers, and it needs some arguments to run (ie, to
choose the backend).

I have the same problem with clang-proto-fuzzer, which uses the same
approach with flags as llvm-isel-fuzzer.

The solution I was thinking about is (drum roll!) to encode the flags in
the binary name, e.g.
"./llvm-isel-fuzzer,-flag1,-flag2" and then read these flags from argv[0]
in LLVMFuzzerInitialize()

This is just horrible enough that it might work.

Then in oss-fuzz build.sh we will just do this:
for flags in -flag1a,-flag1b -flag2a,-flag2b; do
  cp llvm-isel-fuzzer $OUT/llvm-isel-fuzzer,$flags
done

Would it work to just create a simple shell script that forwards to the
"real" fuzzer binary? Ie,

  echo 'llvm-isel-fuzzer "$@" --ignore-remaining-flags=1 -mtriple=aarch64-apple-ios -global-isel -O0' > llvm-isel-fuzzer-aarch64-gisel

Then we could just tell OSS-Fuzz that llvm-isel-fuzzer-aarch64-gisel is
what we want to run. Depending on what OSS-Fuzz does with the binary I
could see this failing, of course.

Kostya Serebryany <kcc@google.com> writes:
>> I'd like llvm-isel-fuzzer to be added once its committed
>
> consider it done (once it's there)
>
>> (which should be as soon as LLVM fuzzers work in release builds
>> again). One potential issue is that llvm-isel-fuzzer is more of a
>> collection of fuzzers, and it needs some arguments to run (ie, to
>> choose the backend).
>
> I have the same problem with clang-proto-fuzzer, which uses the same
> approach with flags as llvm-isel-fuzzer.
>
> The solution I was thinking about is (drum roll!) to encode the flags in
> the binary name, e.g.
> "./llvm-isel-fuzzer,-flag1,-flag2" and then read these flags from argv[0]
> in LLVMFuzzerInitialize()

This is just horrible enough that it might work.

This is not unheard of, right?
clang++ is a link to clang, but they actually behave in different ways

> Then in oss-fuzz build.sh we will just do this:
> for flags in -flag1a,-flag1b -flag2a,-flag2b; do
> cp llvm-isel-fuzzer $OUT/llvm-isel-fuzzer,$flags
> done

Would it work to just create a simple shell script that forwards to the
"real" fuzzer binary? Ie,

  echo 'llvm-isel-fuzzer "$@" --ignore-remaining-flags=1 -mtriple=aarch64-apple-ios -global-isel -O0' > llvm-isel-fuzzer-aarch64-gisel

Then we could just tell OSS-Fuzz that llvm-isel-fuzzer-aarch64-gisel is
what we want to run. Depending on what OSS-Fuzz does with the binary I
could see this failing, of course.

This is unlikely to work with AFL and may complicate things for us in
the future.
I am reluctant to support this in case we have some other fuzzing
mechanisms that won't support this.

--kcc

Kostya Serebryany <kcc@google.com> writes:

Kostya Serebryany <kcc@google.com> writes:
>> I'd like llvm-isel-fuzzer to be added once its committed
>
> consider it done (once it's there)
>
>> (which should be as soon as LLVM fuzzers work in release builds
>> again). One potential issue is that llvm-isel-fuzzer is more of a
>> collection of fuzzers, and it needs some arguments to run (ie, to
>> choose the backend).
>
> I have the same problem with clang-proto-fuzzer, which uses the same
> approach with flags as llvm-isel-fuzzer.
>
> The solution I was thinking about is (drum roll!) to encode the flags in
> the binary name, e.g.
> "./llvm-isel-fuzzer,-flag1,-flag2" and then read these flags from argv[0]
> in LLVMFuzzerInitialize()

This is just horrible enough that it might work.

This is not unheard of, right?
clang++ is a link to clang, but they actually behave in different ways

Changing behaviour based on argv[0] is pretty common, yes. Literally
parsing arguments out of argv[0] is pretty novel ;)

This will probably work for the most part, as long as none of the
arguments we want to deal with have commas or spaces in them. The
biggest downside of this approach is that we have to implement the
splitting ourselves instead of letting the shell do it, and we really
don't want to have to implement something complicated and robust here.

(removed my @imgtec.com address since it no longer exists)

Sorry for the slow reply, it's a busy time for me right now.

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,

I don't seem to be able to build this with cmake+ninja yet (I'm having trouble recursing the compiler on macOS) but after manually building it... It seems this broke at the start of August when the CodeModel argument was removed from InitMCObjectFileInfo(). After removing that argument and adding an 'extern "C"' it at least compiles. I haven't had a chance to try running it yet.

and for the second one I believe the reason is that it does not enclose LLVMFuzzerTestOneInput in “extern ‘C’”.

I agree we need an "extern C" here. I'm not sure what changed to make it required though.
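
For reference, a minimal sketch of an entry point declared with C linkage. The helper CountNonZeroBytes is a hypothetical stand-in for the real work and is not taken from the llvm-mc fuzzers. This shows the shape of the fix only; it does not explain what changed to make it required.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real work, e.g. handing the bytes to a
// disassembler.
static int CountNonZeroBytes(const uint8_t *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  int N = 0;
  for (size_t I = 0; I < Size; ++I)
    if (Data[I] != 0)
      ++N;
  return N;
}

// extern "C" keeps the symbol name unmangled, so a driver that declares
// this function with C linkage can link against the definition.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)CountNonZeroBytes(Data, Size);
  return 0;
}
```

Without extern "C", a C++ compiler emits a mangled name for the function, and a driver that expects the plain C symbol fails at link time.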

Are those libraries maintained and/or used?

I haven't used it for quite a while now. My original motivator was the Mips assembler/disassembler being very buggy. I was using it to find crashes and generate interesting test cases for round-trip testing of the assembler/disassembler. Since then, the Mips MC layer has become much more stable and I've also changed jobs.

That said, I'd like to set up a bot to make use of these tools, it's mostly a matter of finding time for it. That's normally difficult but I should be able to do that in the next few weeks.

(removed my @imgtec.com address since it no longer exists)

Sorry for the slow reply, it's a busy time for me right now.

>
> Hi,
>
> As a part of a recent move of libFuzzer from LLVM to compiler-rt I am
> looking into updating the build code
> for the libraries which use libFuzzer.
>
> I have tried to compile llvm-mc-assemble-fuzzer, and
> llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
> For the first one, the reason is that it refers to a nonexistent enum,

I don't seem to be able to build this with cmake+ninja yet (I'm having
trouble recursing the compiler on macOS) but after manually building it...
It seems this broke at the start of August when the CodeModel argument was
removed from InitMCObjectFileInfo(). After removing that argument and
adding an 'extern "C"' it at least compiles. I haven't had chance to try
running it yet.

> and for the second one I believe the reason is that it does not enclose
> LLVMFuzzerTestOneInput in “extern ‘C’”.

I agree we need an "extern C" here. I'm not sure what changed to make it
required though.

> Are those libraries maintained and/or used?

I haven't used it for quite a while now. My original motivator was the
Mips assembler/disassembler being very buggy. I was using it to find
crashes and generate interesting test cases for round-trip testing of the
assembler/disassembler. Since then, the Mips MC layer has become much more
stable and I've also changed jobs.

That said, I'd like to set up a bot to make use of these tools,

As soon as these fuzz targets build, don't immediately crash, and have
someone who cares about them,
I can add them to OSS-Fuzz for automated continuous fuzzing.

(removed my @imgtec.com address since it no longer exists)

Sorry for the slow reply, it’s a busy time for me right now.

Hi,

As a part of a recent move of libFuzzer from LLVM to compiler-rt I am looking into updating the build code
for the libraries which use libFuzzer.

I have tried to compile llvm-mc-assemble-fuzzer, and llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent enum,

I don’t seem to be able to build this with cmake+ninja yet (I’m having trouble recursing the compiler on macOS) but after manually building it… It seems this broke at the start of August when the CodeModel argument was removed from InitMCObjectFileInfo(). After removing that argument and adding an ‘extern “C”’ it at least compiles. I haven’t had chance to try running it yet.

and for the second one I believe the reason is that it does not enclose LLVMFuzzerTestOneInput in “extern ‘C’”.

I agree we need an “extern C” here. I’m not sure what changed to make it required though.

Are those libraries maintained and/or used?

I haven’t used it for quite a while now. My original motivator was the Mips assembler/disassembler being very buggy. I was using it to find crashes and generate interesting test cases for round-trip testing of the assembler/disassembler. Since then, the Mips MC layer has become much more stable and I’ve also changed jobs.

That said, I’d like to set up a bot to make use of these tools,

As soon as these fuzz targets build, don’t immediately crash, and have someone who cares about them,
I can add them to OSS-Fuzz for automated continuous fuzzing.

I had an out-of-tree target in mind but it would be great to test the in-tree targets with OSS-Fuzz.

Kostya Serebryany <kcc@google.com> writes:

I have tried to compile llvm-mc-assemble-fuzzer, and
llvm-mc-disassemble-fuzzer, and I couldn’t build either of those.
For the first one, the reason is that it refers to a nonexistent
enum,

...

Are those libraries maintained and/or used?

I haven't used it for quite a while now. My original motivator was the
Mips assembler/disassembler being very buggy. I was using it to find
crashes and generate interesting test cases for round-trip testing of the
assembler/disassembler. Since then, the Mips MC layer has become much more
stable and I've also changed jobs.

That said, I'd like to set up a bot to make use of these tools,

As soon as these fuzz targets build, don't immediately crash, and have
someone who cares about them, I can add them to OSS-Fuzz for automated
continuous fuzzing.

These both compile and run again as of r312011, though I suspect they'll
need some small changes to play well in OSS Fuzz and the like. They use
an approach to command line arguments that won't work for features like
-merge or parallel fuzzing (they could pretty easily be updated to use
"-ignore_remaining_args=1" like llvm-isel-fuzzer does though).
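
A sketch of how a target might pick up its own options when "-ignore_remaining_args=1" is used, assuming everything after that marker belongs to the target rather than to libFuzzer. ArgsAfterIgnoreMarker and TargetArgs are hypothetical names, not the actual llvm-isel-fuzzer code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the arguments that follow the
// "-ignore_remaining_args=1" marker, or an empty list if it is absent.
std::vector<std::string> ArgsAfterIgnoreMarker(int Argc,
                                               const char *const *Argv) {
  assert(Argc >= 0 && Argv != nullptr);
  const std::string Marker = "-ignore_remaining_args=1";
  std::vector<std::string> Rest;
  bool Seen = false;
  for (int I = 1; I < Argc; ++I) {
    if (Seen)
      Rest.push_back(Argv[I]);
    else if (Marker == Argv[I])
      Seen = true;
  }
  return Rest;
}

// Target-specific options recovered at startup.
static std::vector<std::string> TargetArgs;

// libFuzzer calls this once before fuzzing starts, if it is defined.
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  TargetArgs = ArgsAfterIgnoreMarker(*argc, *argv);
  return 0;
}
```

Because libFuzzer's own flags stay in front of the marker, options such as -merge keep working while the target reads only what comes after it.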

Yep. I may not have time to update these fuzzers though. Volunteers?
Also, even with -ignore_remaining_args=1 we may not be able to use them
(and llvm-isel-fuzzer) on oss-fuzz.

I'd suggest at least changing llvm-isel-fuzzer (and others) to have
default flag values, such that running e.g.
./bin/llvm-isel-fuzzer # no flags
will work (and fuzz one default config).
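
One way the fallback could look, as a sketch only. EffectiveFlags is a hypothetical helper, and the default values are borrowed from the aarch64 example earlier in this thread purely for illustration; which configuration should be the default is not decided here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the flags given on the command line, or an
// assumed default configuration when none were given.
std::vector<std::string>
EffectiveFlags(const std::vector<std::string> &Given) {
  // Illustrative defaults only, not an agreed-upon configuration.
  static const std::vector<std::string> Defaults = {
      "-mtriple=aarch64-apple-ios", "-global-isel", "-O0"};
  assert(!Defaults.empty());
  return Given.empty() ? Defaults : Given;
}
```

With something like this in place, a bare "./bin/llvm-isel-fuzzer" run fuzzes the default configuration, and explicit flags still override it.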

If we like how it works on oss-fuzz, we may then extend llvm-isel-fuzzer
to parse the command arguments (or a config type, etc) from the
executable's name.

--kcc