Some feedback on Libfuzzer

Hi, I think I have a fairly nicely integrated Libfuzzer-based fuzzer in
Postgres now. I can run things like:

SELECT fuzz(100000,'select regexp_matches(''foo/bar/baz'',$1,''g'')')

Which makes it convenient to fuzz arbitrary public functions available
in SQL. (I haven't figured out what interface to make for fuzzing
internal functions which take char buffers that can have nuls. The SQL
interface will only be able to handle valid utf8 encoded strings which
contain no nuls.)
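One way to respect the SQL-interface restriction described above is to filter inputs before they reach the SQL layer. The following is only a minimal sketch in C, not code from libFuzzer or Postgres; the name `is_sql_text_input` is hypothetical. It accepts a buffer only if it has no NUL bytes and is well-formed UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the buffer could be passed as a
 * UTF-8 SQL text value (no NUL bytes, well-formed UTF-8), else 0. */
int is_sql_text_input(const uint8_t *d, size_t n) {
    size_t i = 0;
    while (i < n) {
        uint8_t b = d[i];
        size_t len;
        uint32_t cp, min;

        if (b == 0) return 0;                 /* embedded NUL */
        if (b < 0x80) { i++; continue; }      /* plain ASCII */

        if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; min = 0x10000; }
        else return 0;                        /* stray continuation or invalid lead byte */

        if (n - i < len) return 0;            /* truncated sequence */
        for (size_t k = 1; k < len; k++) {
            if ((d[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (uint32_t)(d[i + k] & 0x3F);
        }
        /* reject overlong encodings, surrogates, and out-of-range values */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
        i += len;
    }
    return 1;
}
```

A fuzz callback could return early for inputs that fail this check, so the mutator is not rewarded for producing encoding errors.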

I have some feedback on things that are a bit awkward or that I miss
from AFL. Some of this may actually be there and I'm just not using it
right.

1) One minor thing: it's a bit of a pain to construct the argv when
you're not invoking it on the command line. Not a big deal but it
would be nice to bypass that and just allow the caller to set the
variables directly. Some of the parameters are not entirely clear
either -- I'm not clear what the distinction is between -runs and
-iterations and I'm not clear whether the timeout is for the whole run
or individual tests (it's not doing anything in my case which is
probably due to Postgres having its own ALRM handler).
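Until such an interface exists, the argv construction can at least be kept in one place. This is a rough sketch under assumptions: the struct and function names (`fuzz_options`, `fuzz_argv`, `build_fuzzer_argv`) are hypothetical, and only the `-runs`, `-timeout` and `-only_ascii` flags mentioned in this message are emitted, using libFuzzer's `-flag=value` spelling. Which entry point the resulting argv is handed to depends on the libFuzzer version, so that call is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FUZZ_MAX_ARGS 6    /* program name + 4 arguments + NULL terminator */
#define FUZZ_ARG_LEN  128

/* Hypothetical option struct: the variables a caller would like to set directly. */
typedef struct {
    long runs;
    int timeout_seconds;
    int only_ascii;
    const char *corpus_dir;   /* may be NULL */
} fuzz_options;

typedef struct {
    char storage[FUZZ_MAX_ARGS][FUZZ_ARG_LEN];
    char *argv[FUZZ_MAX_ARGS];
    int argc;
} fuzz_argv;

/* Fills `out` with a NULL-terminated argv. Returns 0 on success, -1 if the
 * corpus directory name does not fit into one argument slot. */
int build_fuzzer_argv(const fuzz_options *opt, fuzz_argv *out) {
    int n = 0;
    if (opt->corpus_dir != NULL && strlen(opt->corpus_dir) >= FUZZ_ARG_LEN)
        return -1;
    snprintf(out->storage[n++], FUZZ_ARG_LEN, "fuzzer");
    snprintf(out->storage[n++], FUZZ_ARG_LEN, "-runs=%ld", opt->runs);
    snprintf(out->storage[n++], FUZZ_ARG_LEN, "-timeout=%d", opt->timeout_seconds);
    snprintf(out->storage[n++], FUZZ_ARG_LEN, "-only_ascii=%d", opt->only_ascii ? 1 : 0);
    if (opt->corpus_dir != NULL)
        snprintf(out->storage[n++], FUZZ_ARG_LEN, "%s", opt->corpus_dir);
    assert(n < FUZZ_MAX_ARGS);
    for (int i = 0; i < n; i++)
        out->argv[i] = out->storage[i];
    out->argv[n] = NULL;
    out->argc = n;
    return 0;
}
```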

2) I've caught a bug which causes enormous stack growth. AFL enforces
strict memory and time limits on the tests which is awfully
convenient. I can implement those myself in my fuzzer function (and in
fact in the Postgres context it would be best if I did) but some
simple AFL-style protection would be appreciated. As it is, it takes a
*looong* time to fail and doesn't leave behind the crashing test. It
would be nice if Libfuzzer took a page out of the sanitize code's
tricks and kept an mmapped file where it wrote the current test being
run. If the file is never synced then it shouldn't cause any syscalls
or I/O until the program crashes and the file descriptor is closed.
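The mmapped-file idea can be sketched roughly as follows. This is an illustration only, not how libFuzzer or the sanitizers do it; `scratch_open`, `scratch_record` and the 64 KiB capacity are assumptions made for the example. The file is mapped MAP_SHARED, so recording an input is a plain memory copy, and the kernel keeps the dirty pages associated with the file even if the process later dies.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SCRATCH_CAPACITY 65536   /* assumed upper bound on input size */

/* File layout: a 4-byte length followed by the raw input bytes. */
typedef struct {
    uint32_t length;
    uint8_t bytes[SCRATCH_CAPACITY];
} scratch_t;

static scratch_t *scratch = NULL;

/* Create the file and map it shared. Returns 0 on success, -1 on error. */
int scratch_open(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)sizeof(scratch_t)) != 0) { close(fd); return -1; }
    void *p = mmap(NULL, sizeof(scratch_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (p == MAP_FAILED) return -1;
    scratch = (scratch_t *)p;
    return 0;
}

/* Call right before running each test: no system call is made here. */
void scratch_record(const uint8_t *data, size_t size) {
    if (scratch == NULL) return;
    if (size > SCRATCH_CAPACITY) size = SCRATCH_CAPACITY;   /* truncate oversized inputs */
    memcpy(scratch->bytes, data, size);
    scratch->length = (uint32_t)size;
}
```

After a crash, the last recorded input can be recovered by reading the length prefix and that many bytes from the file.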

My thinking is I need to set an RLIMIT_STACK setting and then install
a SEGV handler which will longjmp back to the top level and return to
the fuzzer. That will be risky since it's in theory impossible to
restore any state the SEGV corrupted, but in practice, if it's always
caused by a stack overflow, it might be safe. I would also like to have an
ALRM handler but that requires calling alarm() on every call and I'm
not sure if the setitimer in Libfuzzer can be disabled or if it'll
interfere with that. Maybe there's a better approach, I could call
setitimer and if I see more than n ALRMs during the execution decide
it's a fault. Again it would be nice if Libfuzzer provided that
itself.

3) When it writes the minimal test corpus it seems to keep older tests
around too. I guess the intent is to pass two directories, one which
starts empty and is intended to receive the results and one which is
maintained as the working tree? I'm not sure how to use this mode.

4) The actual fuzzing seems to be less effective than AFL at finding
good cases. In particular I've found I have to use only_ascii mode or
else it spends all the time looking at encoding errors on random
binary inputs. Even in only_ascii mode it seems insistent on putting a
^L in a *lot* of tests even when the function being tested always ends
with the same error if one is present.

I'm hoping to try DFA mode and hoping it will help with this but all
the "experimental" warnings in the docs scare me. Is it just that
there's room for improvement or is there any downside to running in
that mode?

Another thing: I'm not clear whether the test for variable coverage is
not implemented yet or whether there's just no feedback for it yet. AFL
runs the same test repeatedly to check whether the coverage is
repeatable, which is important for knowing whether your
testing is actually well implemented or whether you're failing to
clean up state sufficiently between runs.

5) I'm currently running 1M iterations per call then calling it again
(in a new process). It would be convenient if I could call it again in
the same process and in fact it would be most convenient if I could
make my code call the fuzzer repeatedly for, say, 1k invocations. I
could check for C-c once every 1k calls and do any other cleanup,
checking for memory leaks, etc at that time.

It would also be nice to be able to ask for the minimal corpus back in
memory along with meta information like coverage, runtime, etc so I
could, say, store them in the database :)

6) The crashing and slow tests are written to the current directory.
It would be nice to be able to provide a directory for them to go
into. Also, it would be nice to provide a callback or some other way
to override this. I could generate the whole SQL reproduction instead
of just having the binary data to pass and having to remember what
function I was testing.
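If such a callback existed, generating the SQL reproduction could be as simple as the following sketch. The function names are hypothetical and the statement is hard-wired to the `regexp_matches` example from the top of this message. It base64-encodes the input and wraps it in Postgres's `decode(..., 'base64')` and `convert_from(..., 'UTF8')`, so the statement stays printable whatever bytes the input contains.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard base64 with '=' padding. Returns the encoded length, or -1 if
 * `cap` is too small (the terminating NUL is included in the check). */
int base64_encode(const uint8_t *in, size_t n, char *out, size_t cap) {
    size_t need = 4 * ((n + 2) / 3) + 1;
    size_t o = 0;
    if (cap < need) return -1;
    for (size_t i = 0; i < n; i += 3) {
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < n) v |= (uint32_t)in[i + 1] << 8;
        if (i + 2 < n) v |= (uint32_t)in[i + 2];
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = (i + 1 < n) ? B64[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < n) ? B64[v & 63] : '=';
    }
    out[o] = '\0';
    return (int)o;
}

/* Writes a self-contained SQL statement that replays `data` as the pattern
 * argument. Returns 0 on success, -1 if `out` is too small. */
int format_sql_repro(const uint8_t *data, size_t size, char *out, size_t cap) {
    static const char prefix[] =
        "SELECT regexp_matches('foo/bar/baz', convert_from(decode('";
    static const char suffix[] = "', 'base64'), 'UTF8'), 'g');";
    size_t plen = strlen(prefix), slen = strlen(suffix);
    size_t need = plen + 4 * ((size + 2) / 3) + slen + 1;
    int n;

    if (cap < need) return -1;
    memcpy(out, prefix, plen);
    n = base64_encode(data, size, out + plen, cap - plen);
    if (n < 0) return -1;
    memcpy(out + plen + (size_t)n, suffix, slen + 1);   /* copies the NUL too */
    return 0;
}
```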

In general the feedback is a bit unclear. It seems to print binary
strings in several different escape styles, sometimes using \x (though
it's not clear how many hex digits follow) sometimes using 0x and
sometimes using base64:

#755 NEW cov: 14667 bits: 476 units: 6 exec/s: 20 L: 4 \xa\x5\xcb*
0xa,0x5,0xcb,0x2a,

Test unit written to crash-b0f4bc53c8f72fd53ef0a6c1f46115bd7bd8fe50
Base64: IZqM9rA71To7KDonOlb8pCEoJ3Mn2sAnO1I3XwYoITtxO0exSjwo7u4nKZ8hnilHeQo6GDshTI4pKipWLa8KXg==

Of these only the base64 is convenient for writing reproductions
(though a callback would be most convenient) but it's not so
convenient for watching the progress. And for many lines it seems to
print no test data which is definitely not helpful for watching
progress:

#2028804 NEW cov: 15330 bits: 6511 units: 127 exec/s: 5880 L: 39
#2045447 NEW cov: 15330 bits: 6512 units: 128 exec/s: 5877 L: 47

Also, all this feedback is currently going into the server log. I
would like to capture it and report it to the client. I'm currently
basically just doing my own progress feedback this way but it's
missing the information about coverage and number of units found. It
can only show the number of tests done and things like memory usage
etc.

7) If I open up the corpus files in emacs and accidentally hit any key
then emacs saves an autosave file but then deletes it when I undo the
accidental edit -- which causes Libfuzzer to pretty much immediately
crash with:

Can not stat: /var/tmp/corpus/.#16813d894b330e26fdf4520793501dfffc830eb9;
exiting

I would suggest ignoring auto-save and backup files (.#* and *~) but
in any case this doesn't seem like it should be a fatal error. Just
warn about the disappearing file and move on to the next one.
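Both suggestions are small enough to sketch. The helper names below are hypothetical and this is not libFuzzer's corpus-reading code: one function recognises the editor file names mentioned above (plus Emacs's `#name#` auto-save form, which is an addition), the other treats a file that vanished between directory listing and stat() as a warning rather than a fatal error.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 for names that look like editor lock, auto-save or backup files. */
int is_editor_temp_file(const char *name) {
    size_t len = strlen(name);
    if (len == 0) return 0;
    if (strncmp(name, ".#", 2) == 0) return 1;                       /* .#name  */
    if (len >= 2 && name[0] == '#' && name[len - 1] == '#') return 1; /* #name#  */
    if (name[len - 1] == '~') return 1;                              /* name~   */
    return 0;
}

/* Returns 0 and sets *size_out on success, 1 if the file disappeared
 * (caller should skip it), -1 on any other stat() error. */
int corpus_entry_size(const char *path, long long *size_out) {
    struct stat st;
    if (stat(path, &st) != 0) {
        if (errno == ENOENT) {
            fprintf(stderr, "warning: %s disappeared, skipping\n", path);
            return 1;
        }
        return -1;
    }
    *size_out = (long long)st.st_size;
    return 0;
}
```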

It occurs to me that this is silly as Libfuzzer is currently not
capable of continuing once it finds one crash. There's no way for my
FuzzOne() call to report that the test "failed" but that it's still
prepared to continue fuzzing more inputs. This seems like the
fundamental problem I'm missing.

Also, one more thing, currently Libfuzzer does not catch SIGABRT and
treat it as a fatal event. I've added a SIGABRT handler to my own code
and moved StaticDeathCallback to public so I can call it from there.

Greg,
This is lots of useful feedback! I’ll reply to individual bullets when time permits (mostly after the holidays).

If you find a bug in Postgres with libFuzzer, please let us know so that we
can add it to http://llvm.org/docs/LibFuzzer.html#trophies

This is more like a limitation of asan, not libFuzzer.
By design, asan does not recover from the first crash.
This feature has been criticized quite a lot, but I am still convinced this
is a feature, not a bug.
IMHO, recovery mode will be misused/abused too often to be useful, besides
it adds complexity to the code.
(There is a patch under review right now to implement recovery mode for
asan,
but I am not sure if or when this patch will be committed)

Arguably a fuzzer changes the game somewhat. It's one thing to have a
crash during interactive testing but fuzzing is most useful if left to
run on a server unattended.

However I think this isn't the whole story here. There are many
different kinds of errors and not all are unrecoverable memory
failures. I may just get an internal error from my own code that I
know should never happen but doesn't represent major corruption. I
would like a way for the callback to return an error condition to the
fuzzer driver to have the test case set aside as a crash or perhaps as
a separate category (or perhaps to allow it to specify the name of the
category).

> Also, one more thing, currently Libfuzzer does not catch SIGABRT and
> treat it as a fatal event. I've added a SIGABRT handler to my own code
> and moved StaticDeathCallback to public so I can call it from there.

Again, this is asan, not libFuzzer.
You need ASAN_OPTIONS=handle_abort=1
I hope to make it the default soon-ish.

Ah, that would have made this work a bit simpler. Thanks.

I have yet to really experiment with the sanitizers so I don't know if
asan is really doing anything for me given Postgres's internal memory
management. I couldn't get msan running but it sounds more promising
to me. I'll have to try again.

Oh. And enjoy your holidays!

More replies below.

If you feel some of your questions were left unanswered, please ping or file a bug.

> This is more like a limitation of asan, not libFuzzer.
> By design, asan does not recover from the first crash.
> This feature has been criticized quite a lot, but I am still convinced this
> is a feature, not a bug.
> IMHO, recovery mode will be misused/abused too often to be useful, besides
> it adds complexity to the code.
> (There is a patch under review right now to implement recovery mode for
> asan, but I am not sure if or when this patch will be committed)

> Arguably a fuzzer changes the game somewhat. It's one thing to have a
> crash during interactive testing but fuzzing is most useful if left to
> run on a server unattended.
>
> However I think this isn't the whole story here. There are many
> different kinds of errors and not all are unrecoverable memory
> failures. I may just get an internal error from my own code that I
> know should never happen but doesn't represent major corruption. I
> would like a way for the callback to return an error condition to the
> fuzzer driver to have the test case set aside as a crash or perhaps as
> a separate category (or perhaps to allow it to specify the name of the
> category).

For that you don't need libFuzzer support, right?
You can intercept your specific type of bug in the target function.

>> Also, one more thing, currently Libfuzzer does not catch SIGABRT and
>> treat it as a fatal event. I've added a SIGABRT handler to my own code
>> and moved StaticDeathCallback to public so I can call it from there.
>>
> Again, this is asan, not libFuzzer.
> You need ASAN_OPTIONS=handle_abort=1
> I hope to make it the default soon-ish.

> Ah, that would have made this work a bit simpler. Thanks.
>
> I have yet to really experiment with the sanitizers so I don't know if
> asan is really doing anything for me given Postgres's internal memory
> management.

That might be an interesting separate topic to discuss.

> I couldn't get msan running but it sounds more promising
> to me.

If you have custom memory management, msan will be as tricky to use as
asan.
Also, try ubsan for other kinds of bugs.

Sorry, I forgot to check in on this thread. I've made a lot of
progress on my side. I've worked around most of my issues, though
better interfaces to avoid workarounds would be great.

My main problem is the timeouts. Postgres uses ALRM and in fact I want
to specifically invoke that logic to detect cases where it's not
working (i.e. where we don't check for signals for too long). So what
I really want is something independent. I'm actually thinking of
implementing a side process that sends some other signal like SIGQUIT
or SIGFPE periodically.

What I did instead for now is replace the setitimer call with
setrlimit(RLIMIT_CPU) and install the AlarmHandler on SIGXCPU, which Linux
fires once a second after the soft limit is reached. That happens to be
exactly what I want.

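void SetTimer(int Seconds) {
  (void)Seconds;  // unused: the soft CPU limit below is fixed at 1 second

  // Soft limit of 1 CPU-second, no hard limit, so SIGXCPU takes the
  // place of SIGALRM and Postgres's own ALRM handler is left alone.
  struct rlimit limit = {1, RLIM_INFINITY};
  int Res = setrlimit(RLIMIT_CPU, &limit);
  assert(Res == 0);

  struct sigaction sigact;
  memset(&sigact, 0, sizeof(sigact));
  sigact.sa_sigaction = AlarmHandler;
  Res = sigaction(SIGXCPU, &sigact, 0);
  assert(Res == 0);
}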

>> However I think this isn't the whole story here. There are many
>> different kinds of errors and not all are unrecoverable memory
>> failures. I may just get an internal error from my own code that I
>> know should never happen but doesn't represent major corruption. I
>> would like a way for the callback to return an error condition to the
>> fuzzer driver to have the test case set aside as a crash or perhaps as
>> a separate category (or perhaps to allow it to specify the name of the
>> category).
>
> For that you don't need libFuzzer support, right?
> You can intercept your specific type of bug in the target function.

That's exactly what I'm doing. I'm catching internal errors now and
call abort() so the fuzzer logs that test case as a crash.
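The catch-and-abort approach can be illustrated with a small sketch. It assumes the target wrapper can see the SQLSTATE of a failed call and relies on the fact that Postgres reserves SQLSTATE class XX for internal errors. The function names are hypothetical, and how the SQLSTATE is obtained (for example from a PG_CATCH block) is left out.

```c
#include <stdlib.h>
#include <string.h>

/* SQLSTATE class "XX" is Postgres's internal-error class (XX000 and friends).
 * A NULL sqlstate means the call succeeded. */
int is_internal_error(const char *sqlstate) {
    return sqlstate != NULL && strncmp(sqlstate, "XX", 2) == 0;
}

/* Call after each fuzzed statement: ordinary user-facing errors, such as an
 * invalid regular expression, are expected and ignored; internal errors are
 * turned into an abort() so the fuzzer records the input as a crash. */
void report_query_outcome(const char *sqlstate) {
    if (is_internal_error(sqlstate))
        abort();
}
```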

What I'm doing now is (man I wish gmail was more useful for code):

diff --git a/FuzzerLoop.cpp b/FuzzerLoop.cpp
index dd81616..aa53046 100644
--- a/FuzzerLoop.cpp
+++ b/FuzzerLoop.cpp
@@ -77,6 +77,28 @@ void Fuzzer::AlarmCallback() {
   }
}

+void Fuzzer::StaticErrorCallback(const char *errorname) {
+ assert(F);
+ F->ErrorCallback(errorname);
+}

Er, yeah. Even a trivial test case doesn't work:

$ cat foo.c
int main(int argc, char **argv, char **envp) {
  return 1;
}

$ clang -o foo -fsanitize=memory -fPIE -pie foo.c

$ sysctl kernel.randomize_va_space
kernel.randomize_va_space = 2

$ ./foo
FATAL: Code 0x55873d194390 is out of application range. Non-PIE build?
FATAL: MemorySanitizer can not mmap the shadow memory.
FATAL: Make sure to compile with -fPIE and to link with -pie.
FATAL: Disabling ASLR is known to cause this error.
FATAL: If running under GDB, try 'set disable-randomization off'.
==25950==Process memory map follows:
0x55873d177000-0x55873d216000 /tmp/foo
0x55873d415000-0x55873d419000 /tmp/foo
0x55873d419000-0x55873f88c000
0x7f276d5cf000-0x7f276d921000
0x7f276d921000-0x7f276dac0000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f276dac0000-0x7f276dcc0000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f276dcc0000-0x7f276dcc4000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f276dcc4000-0x7f276dcc6000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f276dcc6000-0x7f276dcca000
0x7f276dcca000-0x7f276dce0000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f276dce0000-0x7f276dedf000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f276dedf000-0x7f276dee0000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f276dee0000-0x7f276dee3000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f276dee3000-0x7f276e0e2000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f276e0e2000-0x7f276e0e3000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f276e0e3000-0x7f276e0e4000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f276e0e4000-0x7f276e1e4000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f276e1e4000-0x7f276e3e3000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f276e3e3000-0x7f276e3e4000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f276e3e4000-0x7f276e3e5000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f276e3e5000-0x7f276e3ec000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f276e3ec000-0x7f276e5eb000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f276e5eb000-0x7f276e5ec000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f276e5ec000-0x7f276e5ed000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f276e5ed000-0x7f276e605000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f276e605000-0x7f276e804000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f276e804000-0x7f276e805000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f276e805000-0x7f276e806000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f276e806000-0x7f276e80a000
0x7f276e80a000-0x7f276e82a000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f276ea03000-0x7f276ea08000
0x7f276ea1e000-0x7f276ea2a000
0x7f276ea2a000-0x7f276ea2b000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f276ea2b000-0x7f276ea2c000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f276ea2c000-0x7f276ea2d000
0x7ffd99d31000-0x7ffd99d52000 [stack]
0x7ffd99d73000-0x7ffd99d75000 [vvar]
0x7ffd99d75000-0x7ffd99d77000 [vdso]
0xffffffffff600000-0xffffffffff601000 [vsyscall]
==25950==End of process memory map.

> I get that even if I put -fPIE in CFLAGS.

> Er, yeah. Even a trivial test case doesn't work:

What's the version of Linux and Clang?

Checked out a few days ago. It looks like r246697. I suppose I could
try updating and rebuilding.

$ uname -a
Linux pixel 4.2.0-trunk-amd64 #1 SMP Debian 4.2-1~exp1 (2015-08-31)
x86_64 GNU/Linux

Sorry, svn log in the tools/clang directory shows r246702.

clang revision is good, but the kernel is probably too new.
Evgenii can comment on that.

Yes, the kernel is too new.
This bug has a patch set that's compatible with the new kernel and
does not even require -pie:
https://llvm.org/bugs/show_bug.cgi?id=24155
It breaks MSan ABI though, so we can not apply it upstream yet.

Hm, that bug has been closed as resolved but I still see the problem:

$ clang --version
clang version 3.8.0 (trunk 250848) (llvm/trunk 250846)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin

configure:4042: ./conftest
FATAL: Code 0x5615faea43f0 is out of application range. Non-PIE build?
FATAL: MemorySanitizer can not mmap the shadow memory.
FATAL: Make sure to compile with -fPIE and to link with -pie.
FATAL: Disabling ASLR is known to cause this error.
FATAL: If running under GDB, try 'set disable-randomization off'.
==14645==Process memory map follows:
0x5615fae87000-0x5615faf26000 /home/stark/src/pg/postgresql-master/conftest
0x5615fb126000-0x5615fb12a000 /home/stark/src/pg/postgresql-master/conftest
0x5615fb12a000-0x5615fd59d000
0x7f86a64a3000-0x7f86a67f5000
0x7f86a67f5000-0x7f86a6994000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f86a6994000-0x7f86a6b94000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f86a6b94000-0x7f86a6b98000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f86a6b98000-0x7f86a6b9a000 /lib/x86_64-linux-gnu/libc-2.19.so
0x7f86a6b9a000-0x7f86a6b9e000
0x7f86a6b9e000-0x7f86a6bb4000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f86a6bb4000-0x7f86a6db3000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f86a6db3000-0x7f86a6db4000 /lib/x86_64-linux-gnu/libgcc_s.so.1
0x7f86a6db4000-0x7f86a6db7000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f86a6db7000-0x7f86a6fb6000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f86a6fb6000-0x7f86a6fb7000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f86a6fb7000-0x7f86a6fb8000 /lib/x86_64-linux-gnu/libdl-2.19.so
0x7f86a6fb8000-0x7f86a70b8000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f86a70b8000-0x7f86a72b7000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f86a72b7000-0x7f86a72b8000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f86a72b8000-0x7f86a72b9000 /lib/x86_64-linux-gnu/libm-2.19.so
0x7f86a72b9000-0x7f86a72c0000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f86a72c0000-0x7f86a74bf000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f86a74bf000-0x7f86a74c0000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f86a74c0000-0x7f86a74c1000 /lib/x86_64-linux-gnu/librt-2.19.so
0x7f86a74c1000-0x7f86a74d9000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f86a74d9000-0x7f86a76d8000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f86a76d8000-0x7f86a76d9000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f86a76d9000-0x7f86a76da000 /lib/x86_64-linux-gnu/libpthread-2.19.so
0x7f86a76da000-0x7f86a76de000
0x7f86a76de000-0x7f86a76fe000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f86a78d7000-0x7f86a78dc000
0x7f86a78f2000-0x7f86a78fe000
0x7f86a78fe000-0x7f86a78ff000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f86a78ff000-0x7f86a7900000 /lib/x86_64-linux-gnu/ld-2.19.so
0x7f86a7900000-0x7f86a7901000
0x7fff98977000-0x7fff98998000 [stack]
0x7fff989b2000-0x7fff989b4000 [vvar]
0x7fff989b4000-0x7fff989b6000 [vdso]
0xffffffffff600000-0xffffffffff601000 [vsyscall]
==14645==End of process memory map.

Can you open a separate bug with exact repro instructions?

Well the bug tracker seems to require an account.

But in any case I don't see anything specific to reproduce. I did an
svn update of llvm and clang and built and installed (I even tried
make clean and removed the old install but it didn't change anything).
Then any program I compile with -fsanitize=memory behaves just like it
did before the patch:

$ uname -a
Linux pixel 4.2.0-trunk-amd64 #1 SMP Debian 4.2-1~exp1 (2015-08-31)
x86_64 GNU/Linux

$ cat foo.c
int main(int argc, char **argv, char **env) {
  return 0;
}

$ clang -fPIE -pie -fsanitize=memory foo.c

$ ./a.out
FATAL: Code 0x562988448390 is out of application range. Non-PIE build?
FATAL: MemorySanitizer can not mmap the shadow memory.
FATAL: Make sure to compile with -fPIE and to link with -pie.
FATAL: Disabling ASLR is known to cause this error.
FATAL: If running under GDB, try 'set disable-randomization off'.
==3120==Process memory map follows:
0x56298842b000-0x5629884ca000 /tmp/a.out
0x5629886ca000-0x5629886ce000 /tmp/a.out
0x5629886ce000-0x56298ab41000
0x7f387d560000-0x7f387d8b2000
...

I've got no explanation for this. I've verified that 0x5555555daf34 is
not considered "out of application range" on linux/x86_64/3.13.0.
Reading the current source code,
  {0x510000000000ULL, 0x600000000000ULL, MappingDesc::APP, "app-2"},
which makes the following line impossible:
  FATAL: Code 0x562988448390 is out of application range.

Could you double-verify that you are using ToT clang? Does anything
change if you run in gdb (without set disable-randomization off)?
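The arithmetic behind that claim is easy to check in isolation. The sketch below copies only the two bounds of the "app-2" entry quoted above; treating the end bound as exclusive is an assumption, and `in_app2_range` is a made-up helper, not MSan code.

```c
#include <stdint.h>

/* Bounds taken from the "app-2" mapping line quoted above. */
static const uint64_t APP2_START = 0x510000000000ULL;
static const uint64_t APP2_END   = 0x600000000000ULL;   /* assumed exclusive */

/* Returns 1 if addr lies inside the app-2 range, else 0. */
int in_app2_range(uint64_t addr) {
    return addr >= APP2_START && addr < APP2_END;
}
```

With these bounds, both the reported code address 0x562988448390 and the address 0x5555555daf34 fall inside the range, which is why the FATAL line should not be possible with a matching runtime.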

I think I finally figured out what I had done wrong here. I had done
an svn update for llvm and tools/clang but not for
projects/compiler-rt.