libFuzzer: add an option to always null-terminate?

Hi all,
While playing with libFuzzer, it’s a little cumbersome to have to copy the buffer just to null-terminate it.
Is a null-terminated buffer a common enough usage scenario to warrant a libFuzzer command-line configuration switch to always generate a null-terminated test case?

Thanks,
Johan

Hi all,
  While playing with libFuzzer, it's a little cumbersome to have to copy
the buffer just to null-terminate it.

It's just one line, isn't it?
(Well, in C++; in C this would be 3 lines)

Is a null-terminated buffer a common enough

It's somewhat frequent, yes.

usage scenario to warrant a libFuzzer command-line configuration switch to
always generate a null-terminated test case?

Such an option would need to be *off* by default, because there are lots of
cases where we must not null-terminate the input (otherwise we'll hide some
bugs).
And when an option is off by default and some targets *require* it to be on
in order to function properly, it becomes a very bad idea, IMHO.
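
To illustrate how a terminator can hide a bug, here is a minimal sketch. `KeyLength` is a hypothetical buggy function invented for this example, not part of libFuzzer or any real library. It ignores its length argument and relies on a `'\0'` sentinel that its contract never promises:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical buggy API (an assumption for illustration only):
// scans for ':' but never checks `size`, so it depends on a '\0'
// sentinel being present somewhere after the data.
static size_t KeyLength(const uint8_t *data, size_t size) {
  (void)size;  // the bug: the length is ignored
  size_t i = 0;
  while (data[i] != ':' && data[i] != '\0') ++i;
  return i;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // With an unterminated buffer of exactly `Size` bytes and no ':' in it,
  // the loop above reads past the end of the buffer.
  KeyLength(Data, Size);
  return 0;
}
```

If the engine always appended a terminator, the scan would stop at it and the over-read would go unnoticed; with an unterminated buffer of exactly `Size` bytes, a tool such as AddressSanitizer has a chance to report it.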

Besides, LLVMFuzzerTestOneInput is supposed to be a general interface
between the APIs under test and any fuzzing engine (AFL, honggfuzz, SAGE,
KLEE, etc.), and we should not expect all of them to implement the flag.

--kcc

Hi all,
  While playing with libFuzzer, it's a little cumbersome to have to
copy the buffer just to null-terminate it.

It's just one line, isn't it?
(Well, in C++; in C this would be 3 lines)

One? I know how to do it in two. Teach me :) (Unfortunately, in D it's 4 lines.)

Is a null-terminated buffer a common enough

It's somewhat frequent, yes.

usage scenario to warrant a libFuzzer command-line configuration switch to
always generate a null-terminated test case?

Such an option would need to be *off* by default,

definitely

because there are lots of cases where we must not null-terminate the input
(otherwise we'll hide some bugs).
And when an option is off by default and some targets *require* it to be
on in order to function properly, it becomes a very bad idea, IMHO.

That's a good argument. I had not realized that none of the other options
are requirements (although I've been abusing -only_ascii for that a
little bit). Adding `if (data[size-1]) return 0;` to remove the requirement
probably doesn't work well with the mutation algorithm.
I was hoping I could avoid the buffer allocation and copy.

Besides, LLVMFuzzerTestOneInput is supposed to be a general interface
between the APIs under test and any fuzzing engine (AFL, honggfuzz, SAGE,
KLEE, etc.), and we should not expect all of them to implement the flag.

I was quite surprised not to find an option to null-terminate :)

-Johan

Hi all,
  While playing with libFuzzer, it's a little cumbersome to have to
copy the buffer just to null-terminate it.

It's just one line, isn't it?
(Well, in C++; in C this would be 3 lines)

One? I know how to do it in two. Teach me :) (Unfortunately, in D it's 4 lines.)

std::string s(reinterpret_cast<const char*>(Data), Size);

Then use s.c_str() instead of Data.
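
As a minimal sketch of how that line fits into a complete fuzz target (`CountVowels` is a hypothetical stand-in for an API that expects a C string; it is an assumption for this example, not something from libFuzzer):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical API under test: expects a null-terminated C string.
static size_t CountVowels(const char *text) {
  size_t count = 0;
  for (; *text != '\0'; ++text) {
    if (std::strchr("aeiou", *text) != nullptr) ++count;
  }
  return count;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Copy the input into a std::string; c_str() is guaranteed to be
  // null-terminated, while Data itself is left untouched.
  std::string s(reinterpret_cast<const char *>(Data), Size);
  (void)CountVowels(s.c_str());
  return 0;
}
```

Note that if the input contains an embedded `'\0'`, a C-string consumer only sees the bytes before it.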

Is a null-terminated buffer a common enough

It's somewhat frequent, yes.

usage scenario to warrant a libFuzzer command-line configuration switch
to always generate a null-terminated test case?

Such an option would need to be *off* by default,

definitely

because there are lots of cases where we must not null-terminate the
input (otherwise we'll hide some bugs).
And when an option is off by default and some targets *require* it to be
on in order to function properly, it becomes a very bad idea, IMHO.

That's a good argument. I had not realized that none of the other options
are requirements (although I've been abusing -only_ascii for that a
little bit). Adding `if (data[size-1]) return 0;` to remove the requirement
probably doesn't work well with the mutation algorithm.

It may actually work surprisingly well.
Yes, libFuzzer will spend some extra time creating mutations that are not
null-terminated, but it won't spend time executing them (thanks to the early exit).
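
A minimal sketch of the early-exit variant, assuming a hypothetical `TextLength` API under test and a hypothetical helper `IsNullTerminated` (both invented for this example). It adds a `Size == 0` check, since `data[size-1]` would read out of bounds on an empty input:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical API under test: expects a null-terminated C string.
static size_t TextLength(const char *text) { return std::strlen(text); }

// True only if the input is non-empty and its last byte is '\0'.
static bool IsNullTerminated(const uint8_t *Data, size_t Size) {
  return Size > 0 && Data[Size - 1] == 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Early exit: skip inputs that cannot be used as a C string in place.
  if (!IsNullTerminated(Data, Size)) return 0;
  // No allocation or copy: the buffer already ends in '\0'.
  (void)TextLength(reinterpret_cast<const char *>(Data));
  return 0;
}
```

The trade-off is the one described above: the engine still generates inputs that fail the check, but they are rejected before any real work is done.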

I was hoping I could avoid the buffer allocation and copy.

For performance reasons?
It makes sense to worry about it only if your target is super-fast (e.g. >
100000 exec/s) and you want to make it even faster.