Fuzzing complex programs

I have a project I want to do based on Libfuzzer. Is there a separate
list for it or should I bring up any ideas for it here?

What I have in mind is to fuzz Postgres. Trying to fuzz the SQL
interpreter in general is not very productive because traditional
fuzzers try to execute the entire program repeatedly and it has a
fairly high startup and shutdown cost. Also the instrumentation-guided
approach has limitations due to the way lexing and parsing work as
well as the large amount of internal state causing non-deterministic
internal behaviour (garbage collecting persistent data structures,
etc).

However there are a number of internal functions that would be very
feasible to fuzz. Things like the datatype input/output functions (I'm
particularly thinking of the datetime parser), regular expression
library, etc.

To do this effectively I think it would be best to invoke the fuzzer
from inside Postgres. Essentially provide bindings for Libfuzzer so
I can have Libfuzzer provide all the test cases to repeatedly call
the internal functions on.
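
For illustration, a fuzz target for a single parsing function might look
roughly like the sketch below. LLVMFuzzerTestOneInput is libFuzzer's
entry point; parse_date_stub is a hypothetical stand-in written only for
this example, not a Postgres function. A real harness would call the
server's own input function in its place and would need whatever
environment that function depends on.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an internal parser (an assumption of this
 * sketch). Accepts strictly "YYYY-MM-DD"; returns 1 on success, 0 on a
 * parse error. */
static int parse_date_stub(const char *s, int *year, int *month, int *day)
{
    if (strlen(s) != 10 || s[4] != '-' || s[7] != '-')
        return 0;
    for (int i = 0; i < 10; i++)
        if (i != 4 && i != 7 && !isdigit((unsigned char) s[i]))
            return 0;
    *year = atoi(s);
    *month = atoi(s + 5);
    *day = atoi(s + 8);
    return *month >= 1 && *month <= 12 && *day >= 1 && *day <= 31;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int year, month, day;

    /* The parser expects a NUL-terminated string, so copy the raw bytes. */
    char *text = malloc(size + 1);
    if (text == NULL)
        return 0;
    if (size > 0)
        memcpy(text, data, size);   /* data may be NULL when size is 0 */
    text[size] = '\0';

    (void) parse_date_stub(text, &year, &month, &day);

    free(text);
    return 0;
}
```

The copy is there because fuzzer inputs are arbitrary byte buffers with
no terminator, while most text parsers expect a C string.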

Is there any example of doing something like this already? Am I taking
a crazy approach?

There are other approaches possible. It would be nice if I could run
afl or libfuzzer on a client program and have the client program tell
afl or libfuzzer the pid of the server to watch and then request test
cases to feed to the server. That seems like it would be a more
flexible approach for a lot of use cases where the server requires
setting up a complex environment.

> I have a project I want to do based on Libfuzzer. Is there a separate
> list for it or should I bring up any ideas for it here?
>
> What I have in mind is to fuzz Postgres. Trying to fuzz the SQL
> interpreter in general is not very productive because traditional
> fuzzers try to execute the entire program repeatedly and it has a
> fairly high startup and shutdown cost. Also the instrumentation-guided
> approach has

One challenge in leaving the daemon up while testing is knowing how well
isolated the test cases are from one another. It may be the case that the
test cases somehow accumulate some global state (test case N triggers heap
corruption, N + 23 crashes as a result of that earlier corruption). At
least that specific failure mode can probably be mitigated by using one or
more of the sanitizers though.

> limitations due to the way lexing and parsing work as well as the
> large amount of internal state causing non-deterministic internal
> behaviour (garbage collecting persistent data structures, etc).
>
> However there are a number of internal functions that would be very
> feasible to fuzz. Things like the datatype input/output functions (I'm
> particularly thinking of the datetime parser), regular expression
> library, etc.
>
> To do this effectively I think it would be best to invoke the fuzzer
> from inside Postgres. Essentially provide bindings for Libfuzzer so
> I can have Libfuzzer provide all the test cases to repeatedly call
> the internal functions on.
>
> Is there any example of doing something like this already? Am I taking
> a crazy approach?

I don't have enough experience to say if it's crazy or not. But if
your LLVMFuzzerTestOneInput() queues some work for the server and pends on
a response -- that seems like a sane approach.

> There are other approaches possible. It would be nice if I could run
> afl or libfuzzer on a client program and have the client program tell
> afl or libfuzzer the pid of the server to watch and then request test
> cases to feed to the server. That seems like it would be a more
> flexible approach for a lot of use cases where the server requires
> setting up a complex environment.

Great idea, but it seems tricky to get the execution coverage feedback in
this case.

Let me know if you're interested in collaborating, it sounds interesting.
Though at first glance, I'd prefer the "not very productive" brute force
option and just toss more resources at it.

> > I have a project I want to do based on Libfuzzer. Is there a separate
> > list for it or should I bring up any ideas for it here?

No separate list so far, this one should be good.

> > What I have in mind is to fuzz Postgres. Trying to fuzz the SQL
> > interpreter in general is not very productive because traditional
> > fuzzers try to execute the entire program repeatedly and it has a
> > fairly high startup and shutdown cost. Also the instrumentation-guided
> > approach has

> One challenge in leaving the daemon up while testing is knowing how well
> isolated the test cases are from one another. It may be the case that the
> test cases somehow accumulate some global state (test case N triggers heap
> corruption, N + 23 crashes as a result of that earlier corruption). At
> least that specific failure mode can probably be mitigated by using one or
> more of the sanitizers though.

This is true, however accumulating global state increases the chances of
finding complex bugs (at the price of making those bugs harder to analyze).
We have seen a few such cases, e.g.
https://sourceware.org/bugzilla/show_bug.cgi?id=18043#c11

> > limitations due to the way lexing and parsing work as well as the
> > large amount of internal state causing non-deterministic internal
> > behaviour (garbage collecting persistent data structures, etc).
> >
> > However there are a number of internal functions that would be very
> > feasible to fuzz. Things like the datatype input/output functions (I'm
> > particularly thinking of the datetime parser), regular expression
> > library, etc.

In my (biased) opinion libFuzzer is particularly well suited for this task
(fuzzing individual libraries, as opposed to fuzzing the whole of Postgres).
I've played with a dozen regular expression libs and found bugs in all
of them
(e.g. search for "Fuzzer" in
http://vcs.pcre.org/pcre2/code/trunk/ChangeLog?view=markup&pathrev=360)

> > To do this effectively I think it would be best to invoke the fuzzer
> > from inside Postgres.

Never tried this.
Can't you just link libFuzzer with the part of the code you want to test?

> > Essentially provide bindings for Libfuzzer so
> > I can have Libfuzzer provide all the test cases to repeatedly call
> > the internal functions on.
> >
> > Is there any example of doing something like this already? Am I taking
> > a crazy approach?

> I don't have enough experience to say if it's crazy or not. But if
> your LLVMFuzzerTestOneInput() queues some work for the server and pends on
> a response -- that seems like a sane approach.

> > There are other approaches possible. It would be nice if I could run
> > afl or libfuzzer on a client program and have the client program tell
> > afl or libfuzzer the pid of the server to watch and then request test
> > cases to feed to the server. That seems like it would be a more
> > flexible approach for a lot of use cases where the server requires
> > setting up a complex environment.

> Great idea, but it seems tricky to get the execution coverage feedback in
> this case.

Not very tricky, but less efficient.
The major benefit of libFuzzer is that it gets the coverage feedback from
inside the process, avoiding any kind of inter-process communication
(no syscalls even).
So for things like simple parsers you can get 50K executions per second
(unless the fuzzer finds exponential algorithms in the parser).

The problem I’m specifically trying to tackle is that the code in question can use any of the internal postgres APIs and might have dependencies on anything in the environment.

Even the simplest cases like the date/time parser depend on the timezone library, which is initialised at startup, on the server session state, which specifies the current timezone and default date format, etc.

The more interesting cases like arrays and other compound objects will depend on the internal caches of the database schema which is where it finds things like meta information about the data types stored within.

In Fuzzer::RunOneMaximizeTotalCoverage, replace the calls to __sanitizer_get_total_unique_coverage and __sanitizer_update_counter_bitset_and_clear_counters, and pass the result to the client app.

If you can make the change general enough, the patch would be more than welcome.

> To do this effectively I think it would be best to invoke the fuzzer
> from inside Postgres. Essentially provide bindings for Libfuzzer so
> I can have Libfuzzer provide all the test cases to repeatedly call
> the internal functions on.
>
> Is there any example of doing something like this already? Am I taking
> a crazy approach?

So on further inspection it seems the API I want, at least for the
in-process plan is mostly there in LLVMFuzzerNoMain. It would be nice
if I could call the driver with a function pointer and void* and it
would call my callback passing that closure along with the fuzzed
input. But I can probably work around that with a global variable.
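
The global-variable workaround could look something like the sketch
below. The struct, its fields and fuzz_set_context are hypothetical
names made up for this example; only LLVMFuzzerTestOneInput is
libFuzzer's. It relies on the driver not being re-entered, so a single
global is enough.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-run settings that the callback needs but that the
 * fuzzer driver has no way to pass through (names are illustrative). */
struct fuzz_context {
    unsigned function_id;   /* which function to exercise */
    size_t   calls;         /* how many inputs have been delivered */
    size_t   bytes;         /* total input bytes seen so far */
};

/* One global instance stands in for the missing void* closure argument. */
static struct fuzz_context *current_context = NULL;

/* Call this before handing control to the fuzzer driver. */
static void fuzz_set_context(struct fuzz_context *ctx)
{
    current_context = ctx;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    struct fuzz_context *ctx = current_context;

    (void) data;
    if (ctx == NULL)
        return 0;           /* nothing configured, ignore the input */

    /* A real callback would dispatch on ctx->function_id here. */
    ctx->calls++;
    ctx->bytes += size;
    return 0;
}
```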

I'm actually kind of frustrated by a more basic problem. The build
system. It seems LibFuzzer is meant to be compiled as part of LLVM but
it didn't get compiled when I built LLVM because I didn't build it
with sanitize-coverage enabled. Now I can't get it to build because I
get errors like:

$ for i in *.cpp ; do clang -c -std=c++11 $i ; done
$ clang -std=c++11 *.o
FuzzerDriver.o: In function `fuzzer::ReadTokensFile(char const*)':
FuzzerDriver.cpp:(.text+0x56): undefined reference to
`std::allocator<char>::allocator()'
FuzzerDriver.cpp:(.text+0x6d): undefined reference to
`std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(char const*, std::allocator<char>
const&)'
FuzzerDriver.cpp:(.text+0x8d): undefined reference to
`std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::~basic_string()'
FuzzerDriver.cpp:(.text+0x96): undefined reference to
`std::allocator<char>::~allocator()'
FuzzerDriver.cpp:(.text+0xab): undefined reference to
`std::__cxx11::basic_istringstream<char, std::char_traits<char>,
std::allocator<char>
>::basic_istringstream(std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&,
std::_Ios_Openmode)'
FuzzerDriver.cpp:(.text+0x14c): undefined reference to
`std::allocator<char>::allocator()'
FuzzerDriver.cpp:(.text+0x166): undefined reference to
`std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(char const*, std::allocator<char>
const&)'
FuzzerDriver.cpp:(.text+0x18f): undefined reference to
`std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::~basic_string()'

And I get similar errors if I try to build it using the LLVM CMake
generated makefiles (after running "cmake
-DLLVM_USE_SANITIZE_COVERAGE=1" in the LibFuzzer directory), in fact I
get errors that I need -std=c++11. Do I need to recompile *all* of
llvm as if I was going to fuzz LLVM just to get libfuzzer built?

I’m fairly sure your compiler (or rather linker) errors are coming from the fact that you are not linking to the C++ runtime library. Use clang++ -std=c++11 *.o, and I’m reasonably sure it will do what you want.

> I'm fairly sure your compiler (or rather linker) errors are coming from
> the fact that you are not linking to the C++ runtime library. Use `clang++
> -std=c++11 *.o`, and I'm reasonably sure it will do what you want.

Correct.

> --
> Mats

> To do this effectively I think it would be best to invoke the fuzzer
> from inside Postgres. Essentially provide bindings for Libfuzzer so
> I can have Libfuzzer provide all the test cases to repeatedly call
> the internal functions on.
>
> Is there any example of doing something like this already? Am I taking
> a crazy approach?

> So on further inspection it seems the API I want, at least for the
> in-process plan is mostly there in LLVMFuzzerNoMain. It would be nice
> if I could call the driver with a function pointer and void* and it
> would call my callback passing that closure along with the fuzzed
> input. But I can probably work around that with a global variable.

Not sure I understood this correctly.
Example?

I've made a Postgres module which is dynamically loaded by Postgres as
a shared library from which I can call the fuzzer on the SQL function
of my choice. Postgres has enough meta information about the functions
that I think the eventual interface might be pretty flexible and be
able to specify which argument to fuzz and what other constant
arguments to pass etc. So I would want to pass the function's id and
these other arguments and so on through the fuzzer to the fuzz-one
callback. As I said I think I can just use a global variable since
there's no reason the fuzzer needs to be reentrant.

However I have run into a problem I'm stumped on. I'm not sure if it's
the dynamic linker or something in Postgres that's interfering with
the coverage feedback but it's exiting after one call thinking the
new coverage isn't increasing over the previous coverage.

The test that's causing it to exit is at FuzzerLoop.cpp:250:
  if (NewCoverage > OldCoverage || NumNewBits)
    return NewCoverage;

250 if (NewCoverage > OldCoverage || NumNewBits)
(gdb) p NewCoverage
$3 = 14422
(gdb) p OldCoverage
$4 = 14422
(gdb) p NumNewBits
$5 = 0

And after that it just returns.
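
To spell out why those values end the run, here is the quoted condition
as a small standalone function. The function name and the choice to
return 0 when nothing is new are assumptions of this sketch, not
libFuzzer's actual code; only the condition itself comes from the
snippet above.

```c
#include <stddef.h>

/* An input counts as progress only if total coverage grew or some
 * counter bits are new. Returning 0 otherwise is an assumption made
 * for this illustration. */
static size_t coverage_if_new(size_t old_coverage, size_t new_coverage,
                              size_t num_new_bits)
{
    if (new_coverage > old_coverage || num_new_bits)
        return new_coverage;
    return 0;
}
```

With the values from the gdb session (14422, 14422, 0) neither half of
the condition holds, so the input is treated as adding nothing.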

In fact the only call it makes to my test function is with Data=NULL
and Size=0, which isn't a valid input to the function so I just return.
I'm not clear why it's passing NULL for the data at all but even so
that should still cause at least one bit of coverage.

I do have a second longer term problem. I would really want to call
the fuzzer for some limited number of iterations, say 1,000 or so,
then do some other housekeeping (including checking for query
cancellation). Then continue the fuzzing. However even if I specify
-iterations or -runs AIUI it isn't possible to call the fuzzer a
second time. It tests if it's already been called and if so aborts.
Maybe there's some internal function I could call instead but I
haven't read through all the source thoroughly yet.

> Not sure I understood this correctly.
> Example?

> I've made a Postgres module which is dynamically loaded by Postgres as
> a shared library from which I can call the fuzzer on the SQL function
> of my choice. Postgres has enough meta information about the functions
> that I think the eventual interface might be pretty flexible and be
> able to specify which argument to fuzz and what other constant
> arguments to pass etc. So I would want to pass the function's id and
> these other arguments and so on through the fuzzer to the fuzz-one
> callback. As I said I think I can just use a global variable since
> there's no reason the fuzzer needs to be reentrant.

You can use a global, you can use C++:
Like here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/test/UserSuppliedFuzzerTest.cpp

> However I have run into a problem I'm stumped on. I'm not sure if it's
> the dynamic linker or something in Postgres that's interfering with
> the coverage feedback but it's exiting after one call thinking the
> new coverage isn't increasing over the previous coverage.

Did you build the Postgres code with -fsanitize-coverage=... ?

Yes:

CC = clang
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -g -O0 -fsanitize=address
-fsanitize-coverage=edge,indirect-calls,8bit-counters

What I'm now wondering is I saw somewhere that it was important to use
clang to link. I think the build might have used ld to link.

Is there a way I can test the binary to see what's up?

Hm. No, the final link command still uses clang (even though the
config info seems to indicate otherwise):

clang -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -g -O0 -fsanitize=address
-fsanitize-coverage=edge,indirect-calls,8bit-counters -L../../src/port
-L../../src/common -Wl,--as-needed
-Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags -Wl,-E
access/brin/brin.o access/brin/brin_pageops.o
access/brin/brin_revmap.o access/brin/brin_tuple.o
access/brin/brin_xlog.o access/brin/brin_minmax.o
access/brin/brin_inclusion.o access/common/heaptuple.o
access/common/indextuple.o access/common/printtup.o
access/common/reloptions.o access/common/scankey.o
access/common/tupconvert.o access/common/tupdesc.o
access/gin/ginutil.o access/gin/gininsert.o access/gin/ginxlog.o
access/gin/ginentrypage.o access/gin/gindatapage.o
access/gin/ginbtree.o access/gin/ginscan.o access/gin/ginget.o
access/gin/ginvacuum.o access/gin/ginarrayproc.o access/gin/ginbulk.o
access/gin/ginfast.o access/gin/ginpostinglist.o access/gin/ginlogic.o
access/gist/gist.o access/gist/gistutil.o access/gist/gistxlog.o
access/gist/gistvacuum.o access/gist/gistget.o access/gist/gistscan.o
access/gist/gistproc.o access/gist/gistsplit.o access/gist/gistbuild.o
access/gist/gistbuildbuffers.o access/hash/hash.o
access/hash/hashfunc.o access/hash/hashinsert.o access/hash/hashovfl.o
access/hash/hashpage.o access/hash/hashscan.o access/hash/hashsearch.o
access/hash/hashsort.o access/hash/hashutil.o access/heap/heapam.o
access/heap/hio.o access/heap/pruneheap.o access/heap/rewriteheap.o
access/heap/syncscan.o access/heap/tuptoaster.o
access/heap/visibilitymap.o access/index/genam.o
access/index/indexam.o access/nbtree/nbtcompare.o
access/nbtree/nbtinsert.o access/nbtree/nbtpage.o
access/nbtree/nbtree.o access/nbtree/nbtsearch.o
access/nbtree/nbtutils.o access/nbtree/nbtsort.o
access/nbtree/nbtxlog.o access/rmgrdesc/brindesc.o
access/rmgrdesc/clogdesc.o access/rmgrdesc/committsdesc.o
access/rmgrdesc/dbasedesc.o access/rmgrdesc/gindesc.o
access/rmgrdesc/gistdesc.o access/rmgrdesc/hashdesc.o
access/rmgrdesc/heapdesc.o access/rmgrdesc/mxactdesc.o
access/rmgrdesc/nbtdesc.o access/rmgrdesc/relmapdesc.o
access/rmgrdesc/replorigindesc.o access/rmgrdesc/seqdesc.o
access/rmgrdesc/smgrdesc.o access/rmgrdesc/spgdesc.o
access/rmgrdesc/standbydesc.o access/rmgrdesc/tblspcdesc.o
access/rmgrdesc/xactdesc.o access/rmgrdesc/xlogdesc.o
access/spgist/spgutils.o access/spgist/spginsert.o
access/spgist/spgscan.o access/spgist/spgvacuum.o
access/spgist/spgdoinsert.o access/spgist/spgxlog.o
access/spgist/spgtextproc.o access/spgist/spgquadtreeproc.o
access/spgist/spgkdtreeproc.o access/tablesample/bernoulli.o
access/tablesample/system.o access/tablesample/tablesample.o
access/transam/clog.o access/transam/commit_ts.o
access/transam/multixact.o access/transam/parallel.o
access/transam/rmgr.o access/transam/slru.o access/transam/subtrans.o
access/transam/timeline.o access/transam/transam.o
access/transam/twophase.o access/transam/twophase_rmgr.o
access/transam/varsup.o access/transam/xact.o access/transam/xlog.o
access/transam/xlogarchive.o access/transam/xlogfuncs.o
access/transam/xloginsert.o access/transam/xlogreader.o
access/transam/xlogutils.o bootstrap/bootparse.o bootstrap/bootstrap.o
catalog/catalog.o catalog/dependency.o catalog/heap.o catalog/index.o
catalog/indexing.o catalog/namespace.o catalog/aclchk.o
catalog/objectaccess.o catalog/objectaddress.o catalog/pg_aggregate.o
catalog/pg_collation.o catalog/pg_constraint.o catalog/pg_conversion.o
catalog/pg_depend.o catalog/pg_enum.o catalog/pg_inherits.o
catalog/pg_largeobject.o catalog/pg_namespace.o catalog/pg_operator.o
catalog/pg_proc.o catalog/pg_range.o catalog/pg_db_role_setting.o
catalog/pg_shdepend.o catalog/pg_type.o catalog/storage.o
catalog/toasting.o parser/analyze.o parser/gram.o parser/keywords.o
parser/kwlookup.o parser/parser.o parser/parse_agg.o
parser/parse_clause.o parser/parse_coerce.o parser/parse_collate.o
parser/parse_cte.o parser/parse_expr.o parser/parse_func.o
parser/parse_node.o parser/parse_oper.o parser/parse_param.o
parser/parse_relation.o parser/parse_target.o parser/parse_type.o
parser/parse_utilcmd.o parser/scansup.o commands/aggregatecmds.o
commands/alter.o commands/analyze.o commands/async.o
commands/cluster.o commands/comment.o commands/collationcmds.o
commands/constraint.o commands/conversioncmds.o commands/copy.o
commands/createas.o commands/dbcommands.o commands/define.o
commands/discard.o commands/dropcmds.o commands/event_trigger.o
commands/explain.o commands/extension.o commands/foreigncmds.o
commands/functioncmds.o commands/indexcmds.o commands/lockcmds.o
commands/matview.o commands/operatorcmds.o commands/opclasscmds.o
commands/policy.o commands/portalcmds.o commands/prepare.o
commands/proclang.o commands/schemacmds.o commands/seclabel.o
commands/sequence.o commands/tablecmds.o commands/tablespace.o
commands/trigger.o commands/tsearchcmds.o commands/typecmds.o
commands/user.o commands/vacuum.o commands/vacuumlazy.o
commands/variable.o commands/view.o executor/execAmi.o
executor/execCurrent.o executor/execGrouping.o executor/execIndexing.o
executor/execJunk.o executor/execMain.o executor/execProcnode.o
executor/execQual.o executor/execScan.o executor/execTuples.o
executor/execUtils.o executor/functions.o executor/instrument.o
executor/nodeAppend.o executor/nodeAgg.o executor/nodeBitmapAnd.o
executor/nodeBitmapOr.o executor/nodeBitmapHeapscan.o
executor/nodeBitmapIndexscan.o executor/nodeCustom.o
executor/nodeHash.o executor/nodeHashjoin.o executor/nodeIndexscan.o
executor/nodeIndexonlyscan.o executor/nodeLimit.o
executor/nodeLockRows.o executor/nodeMaterial.o
executor/nodeMergeAppend.o executor/nodeMergejoin.o
executor/nodeModifyTable.o executor/nodeNestloop.o
executor/nodeFunctionscan.o executor/nodeRecursiveunion.o
executor/nodeResult.o executor/nodeSamplescan.o executor/nodeSeqscan.o
executor/nodeSetOp.o executor/nodeSort.o executor/nodeUnique.o
executor/nodeValuesscan.o executor/nodeCtescan.o
executor/nodeWorktablescan.o executor/nodeGroup.o
executor/nodeSubplan.o executor/nodeSubqueryscan.o
executor/nodeTidscan.o executor/nodeForeignscan.o
executor/nodeWindowAgg.o executor/tstoreReceiver.o executor/spi.o
foreign/foreign.o lib/binaryheap.o lib/bipartite_match.o
lib/hyperloglog.o lib/ilist.o lib/pairingheap.o lib/rbtree.o
lib/stringinfo.o libpq/be-fsstubs.o libpq/be-secure.o libpq/auth.o
libpq/crypt.o libpq/hba.o libpq/ip.o libpq/md5.o libpq/pqcomm.o
libpq/pqformat.o libpq/pqmq.o libpq/pqsignal.o main/main.o
nodes/nodeFuncs.o nodes/nodes.o nodes/list.o nodes/bitmapset.o
nodes/tidbitmap.o nodes/copyfuncs.o nodes/equalfuncs.o
nodes/makefuncs.o nodes/outfuncs.o nodes/readfuncs.o nodes/print.o
nodes/read.o nodes/params.o nodes/value.o optimizer/geqo/geqo_copy.o
optimizer/geqo/geqo_eval.o optimizer/geqo/geqo_main.o
optimizer/geqo/geqo_misc.o optimizer/geqo/geqo_mutation.o
optimizer/geqo/geqo_pool.o optimizer/geqo/geqo_random.o
optimizer/geqo/geqo_recombination.o optimizer/geqo/geqo_selection.o
optimizer/geqo/geqo_erx.o optimizer/geqo/geqo_pmx.o
optimizer/geqo/geqo_cx.o optimizer/geqo/geqo_px.o
optimizer/geqo/geqo_ox1.o optimizer/geqo/geqo_ox2.o
optimizer/path/allpaths.o optimizer/path/clausesel.o
optimizer/path/costsize.o optimizer/path/equivclass.o
optimizer/path/indxpath.o optimizer/path/joinpath.o
optimizer/path/joinrels.o optimizer/path/pathkeys.o
optimizer/path/tidpath.o optimizer/plan/analyzejoins.o
optimizer/plan/createplan.o optimizer/plan/initsplan.o
optimizer/plan/planagg.o optimizer/plan/planmain.o
optimizer/plan/planner.o optimizer/plan/setrefs.o
optimizer/plan/subselect.o optimizer/prep/prepjointree.o
optimizer/prep/prepqual.o optimizer/prep/prepsecurity.o
optimizer/prep/preptlist.o optimizer/prep/prepunion.o
optimizer/util/clauses.o optimizer/util/joininfo.o
optimizer/util/orclauses.o optimizer/util/pathnode.o
optimizer/util/placeholder.o optimizer/util/plancat.o
optimizer/util/predtest.o optimizer/util/relnode.o
optimizer/util/restrictinfo.o optimizer/util/tlist.o
optimizer/util/var.o port/atomics.o port/dynloader.o port/pg_sema.o
port/pg_shmem.o port/pg_latch.o postmaster/autovacuum.o
postmaster/bgworker.o postmaster/bgwriter.o postmaster/checkpointer.o
postmaster/fork_process.o postmaster/pgarch.o postmaster/pgstat.o
postmaster/postmaster.o postmaster/startup.o postmaster/syslogger.o
postmaster/walwriter.o regex/regcomp.o regex/regerror.o
regex/regexec.o regex/regfree.o regex/regprefix.o regex/regexport.o
replication/logical/decode.o replication/logical/logical.o
replication/logical/logicalfuncs.o replication/logical/reorderbuffer.o
replication/logical/origin.o replication/logical/snapbuild.o
replication/walsender.o replication/walreceiverfuncs.o
replication/walreceiver.o replication/basebackup.o
replication/repl_gram.o replication/slot.o replication/slotfuncs.o
replication/syncrep.o rewrite/rewriteRemove.o rewrite/rewriteDefine.o
rewrite/rewriteHandler.o rewrite/rewriteManip.o
rewrite/rewriteSupport.o rewrite/rowsecurity.o
storage/buffer/buf_table.o storage/buffer/buf_init.o
storage/buffer/bufmgr.o storage/buffer/freelist.o
storage/buffer/localbuf.o storage/file/fd.o storage/file/buffile.o
storage/file/copydir.o storage/file/reinit.o
storage/freespace/freespace.o storage/freespace/fsmpage.o
storage/freespace/indexfsm.o storage/ipc/dsm_impl.o storage/ipc/dsm.o
storage/ipc/ipc.o storage/ipc/ipci.o storage/ipc/pmsignal.o
storage/ipc/procarray.o storage/ipc/procsignal.o storage/ipc/shmem.o
storage/ipc/shmqueue.o storage/ipc/shm_mq.o storage/ipc/shm_toc.o
storage/ipc/sinval.o storage/ipc/sinvaladt.o storage/ipc/standby.o
storage/large_object/inv_api.o storage/lmgr/lmgr.o storage/lmgr/lock.o
storage/lmgr/proc.o storage/lmgr/deadlock.o storage/lmgr/lwlock.o
storage/lmgr/spin.o storage/lmgr/s_lock.o storage/lmgr/predicate.o
storage/page/bufpage.o storage/page/checksum.o storage/page/itemptr.o
storage/smgr/md.o storage/smgr/smgr.o storage/smgr/smgrtype.o
tcop/dest.o tcop/fastpath.o tcop/postgres.o tcop/pquery.o
tcop/utility.o tsearch/ts_locale.o tsearch/ts_parse.o
tsearch/wparser.o tsearch/wparser_def.o tsearch/dict.o
tsearch/dict_simple.o tsearch/dict_synonym.o tsearch/dict_thesaurus.o
tsearch/dict_ispell.o tsearch/regis.o tsearch/spell.o
tsearch/to_tsany.o tsearch/ts_selfuncs.o tsearch/ts_typanalyze.o
tsearch/ts_utils.o utils/adt/acl.o utils/adt/arrayfuncs.o
utils/adt/array_expanded.o utils/adt/array_selfuncs.o
utils/adt/array_typanalyze.o utils/adt/array_userfuncs.o
utils/adt/arrayutils.o utils/adt/ascii.o utils/adt/bool.o
utils/adt/cash.o utils/adt/char.o utils/adt/date.o
utils/adt/datetime.o utils/adt/datum.o utils/adt/dbsize.o
utils/adt/domains.o utils/adt/encode.o utils/adt/enum.o
utils/adt/expandeddatum.o utils/adt/float.o utils/adt/format_type.o
utils/adt/formatting.o utils/adt/genfile.o utils/adt/geo_ops.o
utils/adt/geo_selfuncs.o utils/adt/inet_cidr_ntop.o
utils/adt/inet_net_pton.o utils/adt/int.o utils/adt/int8.o
utils/adt/json.o utils/adt/jsonb.o utils/adt/jsonb_gin.o
utils/adt/jsonb_op.o utils/adt/jsonb_util.o utils/adt/jsonfuncs.o
utils/adt/like.o utils/adt/lockfuncs.o utils/adt/mac.o
utils/adt/misc.o utils/adt/nabstime.o utils/adt/name.o
utils/adt/network.o utils/adt/network_gist.o
utils/adt/network_selfuncs.o utils/adt/numeric.o utils/adt/numutils.o
utils/adt/oid.o utils/adt/oracle_compat.o utils/adt/orderedsetaggs.o
utils/adt/pg_locale.o utils/adt/pg_lsn.o
utils/adt/pg_upgrade_support.o utils/adt/pgstatfuncs.o
utils/adt/pseudotypes.o utils/adt/quote.o utils/adt/rangetypes.o
utils/adt/rangetypes_gist.o utils/adt/rangetypes_selfuncs.o
utils/adt/rangetypes_spgist.o utils/adt/rangetypes_typanalyze.o
utils/adt/regexp.o utils/adt/regproc.o utils/adt/ri_triggers.o
utils/adt/rowtypes.o utils/adt/ruleutils.o utils/adt/selfuncs.o
utils/adt/tid.o utils/adt/timestamp.o utils/adt/trigfuncs.o
utils/adt/tsginidx.o utils/adt/tsgistidx.o utils/adt/tsquery.o
utils/adt/tsquery_cleanup.o utils/adt/tsquery_gist.o
utils/adt/tsquery_op.o utils/adt/tsquery_rewrite.o
utils/adt/tsquery_util.o utils/adt/tsrank.o utils/adt/tsvector.o
utils/adt/tsvector_op.o utils/adt/tsvector_parser.o utils/adt/txid.o
utils/adt/uuid.o utils/adt/varbit.o utils/adt/varchar.o
utils/adt/varlena.o utils/adt/version.o utils/adt/windowfuncs.o
utils/adt/xid.o utils/adt/xml.o utils/cache/attoptcache.o
utils/cache/catcache.o utils/cache/evtcache.o utils/cache/inval.o
utils/cache/plancache.o utils/cache/relcache.o utils/cache/relmapper.o
utils/cache/relfilenodemap.o utils/cache/spccache.o
utils/cache/syscache.o utils/cache/lsyscache.o utils/cache/typcache.o
utils/cache/ts_cache.o utils/error/assert.o utils/error/elog.o
utils/fmgr/dfmgr.o utils/fmgr/fmgr.o utils/fmgr/funcapi.o
utils/hash/dynahash.o utils/hash/hashfn.o utils/hash/pg_crc.o
utils/init/globals.o utils/init/miscinit.o utils/init/postinit.o
utils/mb/encnames.o utils/mb/conv.o utils/mb/mbutils.o
utils/mb/wchar.o utils/mb/wstrcmp.o utils/mb/wstrncmp.o
utils/misc/guc.o utils/misc/help_config.o utils/misc/pg_rusage.o
utils/misc/ps_status.o utils/misc/rls.o utils/misc/sampling.o
utils/misc/superuser.o utils/misc/timeout.o utils/misc/tzparser.o
utils/mmgr/aset.o utils/mmgr/mcxt.o utils/mmgr/portalmem.o
utils/resowner/resowner.o utils/sort/logtape.o
utils/sort/sortsupport.o utils/sort/tuplesort.o
utils/sort/tuplestore.o utils/time/combocid.o utils/time/tqual.o
utils/time/snapmgr.o utils/fmgrtab.o ../../src/timezone/localtime.o
../../src/timezone/strftime.o ../../src/timezone/pgtz.o
../../src/port/libpgport_srv.a ../../src/common/libpgcommon_srv.a
-lcrypt -lm -o postgres

Looks correct.
Can you post the output of libFuzzer here?
Something like

#0 READ cov: 0 bits: 0 units: 97701 exec/s: 0
#1 pulse cov: 732 bits: 0 units: 97701 exec/s: 0
#2 pulse cov: 737 bits: 0 units: 97701 exec/s: 1
#4 pulse cov: 858 bits: 0 units: 97701 exec/s: 2
#8 pulse cov: 880 bits: 0 units: 97701 exec/s: 4

> Looks correct.

Ah! With a fresh pair of eyes it's obvious what was wrong. I had
compiled everything with sanitize-coverage except the Fuzzer code
itself but that included the file with the wrapper function which
calls the target function. And with the NULL data argument it wasn't
getting past the wrapper function. So no coverage. I'm still puzzled
about the NULL argument but compiling that file with coverage checking
has made it proceed.

> Can you post the output of libFuzzer here?
> Something like

I haven't looked into why yet, this is probably something simple but
for the sake of it this is what I'm getting now with the above fixed:

/usr/local/pgsql/bin/psql -c 'select fuzz()'
Flag: verbosity 9
Flag: iterations 100
Flag: runs 10
Flag: save_minimized_corpus 1
Seed: 3416380570
SetTimer 601
Tokens: {}
PreferSmall: 1
#0 READ cov: 0 bits: 0 units: 1 exec/s: 0
Called with Data=(nil) size=0
#1 pulse cov: 13790 bits: 21 units: 1 exec/s: 0
NEW0: 13790 L 0
#1 INITED cov: 13790 bits: 21 units: 1 exec/s: 0
Written corpus of 1 files to /var/tmp/corpus
Reload: read 1 new units.
Called with Data=0x60600000e480 size=64
#2 pulse cov: 14202 bits: 252 units: 1 exec/s: 0
#2 NEW cov: 14202 bits: 252 units: 2 exec/s: 0 L: 64
Written to /var/tmp/corpus/67ffe57491b2903668530b6182e5aeb6113d3f28
Called with Data=0x60600000e480 size=64
#3 NEW cov: 14278 bits: 257 units: 3 exec/s: 0 L: 64
Written to /var/tmp/corpus/67ffe57491b2903668530b6182e5aeb6113d3f28
Called with Data=0x60600000e480 size=64
#4 pulse cov: 14298 bits: 262 units: 3 exec/s: 0
#4 NEW cov: 14298 bits: 262 units: 4 exec/s: 0 L: 64
Written to /var/tmp/corpus/1ae4df94333696e5bba164df9cf5e93df7a72e20
Called with Data=0x60600000e480 size=64
#5 NEW cov: 14311 bits: 267 units: 5 exec/s: 0 L: 64
Written to /var/tmp/corpus/c167e6439183f0df3ea25fcd30da80b27293e737
Called with Data=0x60600000e480 size=64
#6 NEW cov: 14311 bits: 271 units: 6 exec/s: 0 L: 64
Written to /var/tmp/corpus/21e9212a20031de685b5b20d5d7752b17780303a
Reload: read 0 new units.
Called with Data=0x60600000e480 size=64
PANIC: ERRORDATA_STACK_SIZE exceeded
STATEMENT: select fuzz()
LOG: server process (PID 8650) was terminated by signal 6: Aborted
DETAIL: Failed process was running: select fuzz()
PANIC: ERRORDATA_STACK_SIZE exceeded
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

> Looks correct.

> Ah! With a fresh pair of eyes it's obvious what was wrong. I had
> compiled everything with sanitize-coverage except the Fuzzer code
> itself but that included the file with the wrapper function which
> calls the target function. And with the NULL data argument it wasn't
> getting past the wrapper function. So no coverage. I'm still puzzled
> about the NULL argument but compiling that file with coverage checking
> has made it proceed.

> Can you post the output of libFuzzer here?
> Something like

> I haven't looked into why yet, this is probably something simple but
> for the sake of it this is what I'm getting now with the above fixed:
>
> /usr/local/pgsql/bin/psql -c 'select fuzz()'
> Flag: verbosity 9
> Flag: iterations 100
> Flag: runs 10
> Flag: save_minimized_corpus 1
> Seed: 3416380570
> SetTimer 601
> Tokens: {}
> PreferSmall: 1
> #0 READ cov: 0 bits: 0 units: 1 exec/s: 0
> Called with Data=(nil) size=0
> #1 pulse cov: 13790 bits: 21 units: 1 exec/s: 0
> NEW0: 13790 L 0
> #1 INITED cov: 13790 bits: 21 units: 1 exec/s: 0
> Written corpus of 1 files to /var/tmp/corpus
> Reload: read 1 new units.
> Called with Data=0x60600000e480 size=64
> #2 pulse cov: 14202 bits: 252 units: 1 exec/s: 0
> #2 NEW cov: 14202 bits: 252 units: 2 exec/s: 0 L: 64

Ok, so now you are at least getting the coverage feedback.

Yes. Is it intentional that the fuzzer calls the function with Data=NULL once?

It was certainly a surprise for me. I wonder if it's related to
anything I've done that's unusual.

For what it's worth the above crash was because I wasn't resetting the
state well enough between calls. It's a bit of a tradeoff though --
the higher the level at which I reset the state, the more memory
allocation and other bookkeeping the server has to do between calls,
and the slower it'll be. It looks like I'll have to start a new
transaction (or subtransaction) for each call, which I was hoping to
avoid but had a feeling was going to be necessary. Certainly it'll be
necessary if the function being fuzzed does any database access, but
in this case that wasn't going on so I thought I might be able to get
away without it.
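
As a toy model of that failure, the sketch below uses plain
setjmp/longjmp in place of the server's error handling. Everything in
it apart from LLVMFuzzerTestOneInput is hypothetical: the limit of 5,
report_error, target_function and run_one are stand-ins for this
example, not Postgres code. The point is only that state left behind by
a failed call has to be discarded before the next input, otherwise it
piles up until some limit is hit.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

#define ERROR_STACK_SIZE 5          /* illustrative limit for this sketch */

static jmp_buf recover_point;       /* where a reported error unwinds to */
static int error_stack_depth = 0;   /* stand-in for leftover error state */

/* Hypothetical error report: records the error, then unwinds. */
static void report_error(void)
{
    error_stack_depth++;
    longjmp(recover_point, 1);
}

/* Hypothetical function under test: rejects inputs starting with 'x'. */
static void target_function(const uint8_t *data, size_t size)
{
    if (size > 0 && data[0] == 'x')
        report_error();
}

/* Runs one input. Returns -1 if leftover error state has reached the
 * limit (the analogue of the PANIC), 0 otherwise. */
static int run_one(const uint8_t *data, size_t size, int reset_after_error)
{
    if (error_stack_depth >= ERROR_STACK_SIZE)
        return -1;
    if (setjmp(recover_point) == 0)
        target_function(data, size);
    else if (reset_after_error)
        error_stack_depth = 0;      /* discard state from the failed call */
    return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    (void) run_one(data, size, 1);  /* always reset between inputs */
    return 0;
}
```

Resetting only this one counter is the cheap end of the tradeoff; a
per-call transaction would be the expensive but thorough end.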