RFC: Extend UBSan with qsort checks

Hi all,

UndefinedBehaviorSanitizer currently does not check for undefined behaviors which result from improper usage of standard library functions.

One notorious instance of such errors is invalid usage of qsort or bsearch routines (or std::sort and friends in case of C++):
* using comparison function that violates ordering axioms (reflexivity, symmetry, transitivity)
* returning unstable results from comparison function
* passing unsorted array to bsearch
* etc.

Errors like this will usually result in slightly invalid output (not fully sorted arrays, rare failed bsearches, etc.) but may as well cause aborts on some systems (3959 – SEGV crash due to random libads/dns.c qsort() comparison function).

I've recently developed a simple proof-of-concept tool SortChecker (GitHub - yugr/sortcheck: Tool for detecting violations of ordering axioms in qsort/bsearch callbacks.) which intercepts calls to qsort and friends (via LD_PRELOAD) and performs various sanity checks before passing control to libc e.g.
* sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements
* etc.

Results were quite inspiring: I've found several errors in popular open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also pretty sure most C++ developers have failed to write correct comparison function at least once.

Could SortChecker functionality be considered for integration into UBSan? Extending SortChecker to C++ land would require significant work which has already been done in sanitizers (compiler instrumentation, portable parallel code, etc.). On the other hand, ability to detect new classes of bugs should benefit UBSan and community.

-Y

Hi Yuri,

Hi all,

UndefinedBehaviorSanitizer currently does not check for undefined
behaviors which result from improper usage of standard library functions.

It's not really an undefined behavior per se: you just violate the contract
provided by library functions (at least I haven't found a clause in C++
standard
that specifies that the behavior of smth. like std::sort() with invalid
operator< is invalid). The actual behavior of these functions is
determined by library authors, and can be anything from UB to unspecified
result, to nicely formatted error report (if you're using a debug version
of system library):
I think I actually encountered all these cases.

That said, the benefit of checking the sorting routines is clear: I'm not
surprised you were able to catch numerous bugs with your tool.

UBSan currently doesn't have interceptors, and all of its checks [1] are
designed to work without runtime library support, so that it can be used as
a mitigation
tool (with -fsanitize-trap=undefined). It's challenging to add UBSan
interceptors at this point: this would limit the tool portability (at least
portability of some functionality),
and complicate interoperation of UBSan with another tools. There is an
opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check
arguments there.
This functionality can be controlled by a runtime flag.

Why do you need compiler instrumentation? Do you want to automatically
inject hooks to SortChecker into std::sort() functions, so that you don't
need to annotate C++ standard library code?
We submitted some annotations to libc++ code (e.g. to report containter
overflow bugs in sanitizers).

[1] -fsanitize=vptr is an only notable exception

One notorious instance of such errors is invalid usage of qsort or bsearch

(+correct cfe-dev list)

FTR, here is one way to implement this in the library:
https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc+±v3/include/bits/stl_algo.h

Search for “check sort predicate for strict weak ordering”

(+correct cfe-dev list)

Hi Yuri,

Hi all,

UndefinedBehaviorSanitizer currently does not check for undefined
behaviors which result from improper usage of standard library functions.

It's not really an undefined behavior per se: you just violate the
contract provided by library functions (at least I haven't found a clause
in C++ standard
that specifies that the behavior of smth. like std::sort() with invalid
operator< is invalid).

Well, C11 which says that "That is, for qsort they shall define a total ordering on the array, and for bsearch the same object shall always compare the same way with the key." ("total ordering" is of course a misnomer, AFAIK they meant weak ordering). AFAIK violation of "shall" usually means UB.

As for C++11, it has for e.g. srtd::sort:

"Requires: operator< (for the version with no arguments) or comp (for the version with a comparison argument) defines a strict weak ordering (25.4)."

which also sounds like UB.

The actual behavior of these functions is

determined by library authors, and can be anything from UB to unspecified
result, to nicely formatted error report (if you're using a debug version
of system library):
I think I actually encountered all these cases.

That said, the benefit of checking the sorting routines is clear: I'm not
surprised you were able to catch numerous bugs with your tool.

UBSan currently doesn't have interceptors, and all of its checks [1] are
designed to work without runtime library support, so that it can be used as
a mitigation
tool (with -fsanitize-trap=undefined). It's challenging to add UBSan
interceptors at this point: this would limit the tool portability (at least
portability of some functionality),
and complicate interoperation of UBSan with another tools.

Could you give some hints about portability problems? E.g. I thought that ASan (which has a ton of interceptors) has no problems on Windows.

In general, not being able to detect UB in stdlibs sounds like a limitation.

>> There is an

opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check
arguments there.
This functionality can be controlled by a runtime flag.

It's certainly doable but somewhat "ugly" as [ATM]San are not about UB. On the other hand TSan already has non-thread-related checks anyway. I wonder what other folks think about this.

Why do you need compiler instrumentation? Do you want to automatically
inject hooks to SortChecker into std::sort() functions, so that you don't
need to annotate C++ standard library code?

Exactly, LD_PRELOAD does not work for inline STL functions (actually it does not fully work even for Glibc which has inline bsearch).

We submitted some annotations to libc++ code (e.g. to report containter
overflow bugs in sanitizers).

Annotating all popular stdlibs rather than doing a single wrapper would be a challenge. On the other hand, it may have an additional benefit of being extendable to other popular libraries which provide qsort-like APIs (Glib2, Berkeley DB and many others). E.g. we could have some kind of __attribute__((strict_order)) for callbacks which would then be handled by compiler.

FTR, here is one way to implement this in the library:
https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is probably the most important type of bug).

FTR, here is one way to implement this in the library:

https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is
probably the most important type of bug).

I thought it would...
Can you give a test?

FTR, here is one way to implement this in the library:

https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is
probably the most important type of bug).

I thought it would...

But debug operator() only knows about 2 inputs whereas for transitivity you'll need to consider at least 3.

FTR, here is one way to implement this in the library:

https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h

Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is
probably the most important type of bug).

I thought it would...

But debug operator() only knows about 2 inputs whereas for transitivity
you'll need to consider at least 3.

Can you give a test?

Here's an example of transitivity violation in GCC: 68988 – reload_pseudo_compare_func violates qsort requirements

Inviting Paul to the party (he wrote the libstdc++ sort checker).

Inviting Paul to the party (he wrote the libstdc++ sort checker).

FTR, here is one way to implement this in the library:

https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h

Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is
probably the most important type of bug).

I thought it would...

But debug operator() only knows about 2 inputs whereas for transitivity
you'll need to consider at least 3.

As I privately replied to Kostya, I was only willing to add constant overhead.

SortChecker adds N*N overhead:
https://github.com/yugr/sortcheck/blob/master/src/sortchecker.c#L427
except he caps it at N=32 on line 413 (and thus will not detect all
violations either).
Adding up to 1024 comparisons to large sort()s maybe isn't so bad.

Can you give a test?

Here's an example of transitivity violation in GCC:
68988 – reload_pseudo_compare_func violates qsort requirements

Here is one from GDB:
https://sourceware.org/bugzilla/show_bug.cgi?id=19361

Inviting Paul to the party (he wrote the libstdc++ sort checker).

FTR, here is one way to implement this in the library:

https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h

Search for "check sort predicate for strict weak ordering"

Nice, although this wouldn't catch violations of transitivity (which is
probably the most important type of bug).

I thought it would...

But debug operator() only knows about 2 inputs whereas for transitivity
you'll need to consider at least 3.

As I privately replied to Kostya, I was only willing to add constant overhead.

Sure, there are various conflicting tradeoffs. SortChecker started as
an experiment so I was as agressive as possible.

SortChecker adds N*N overhead:
https://github.com/yugr/sortcheck/blob/master/src/sortchecker.c#L427
except he caps it at N=32 on line 413

I'm afraid it's even N^3 (with N=32 cap). This indeed sounds scary but
I have not seen significant slowdowns when running instrumented
distro.

(and thus will not detect all
violations either).

Right, I thought about improving this with testing random 32 elements
instead of the first ones.

Adding up to 1024 comparisons to large sort()s maybe isn't so bad.

Can you give a test?

Here's an example of transitivity violation in GCC:
68988 – reload_pseudo_compare_func violates qsort requirements

Here is one from GDB:
19361 – Invalid comparison functions

Yeah but this one is still in discussion so I didn't bother to mention it.

I'm afraid it's even N^3 (with N=32 cap). This indeed sounds scary but
I have not seen significant slowdowns when running instrumented
distro.

Have you looked at how many sort invocations there are, and what their
size distribution is?

Perhaps naively, I expect very few sort()s to be performed in day to
day distro operation, and even fewer sort()s with more than 20
elements.

Right, I thought about improving this with testing random 32 elements
instead of the first ones.

That has its own disadvantages: it makes the failure detection
non-repeatable, and makes it harder to understand the conditions under
which the predicate is broken.

I'm afraid it's even N^3 (with N=32 cap). This indeed sounds scary but
I have not seen significant slowdowns when running instrumented
distro.

Have you looked at how many sort invocations there are, and what their
size distribution is?

Perhaps naively, I expect very few sort()s to be performed in day to
day distro operation, and even fewer sort()s with more than 20
elements.

That's my feeling as well but I never had time to really profile this.

Right, I thought about improving this with testing random 32 elements
instead of the first ones.

That has its own disadvantages: it makes the failure detection
non-repeatable, and makes it harder to understand the conditions under
which the predicate is broken.

Absolutely (although could be mitigated by printing the seed together
with error message).

Exactly correct.
If your comparison operator does not define a SWO, then you have no
guarantees of the results. Crashes are common.

In principle, I am very much in favor of a checker like this.
Passing an invalid comparison operator is one of the most common misuses of
std::sort.

I worry about the cost, though.

-- Marshall

I don’t have the full thread for context.

There is no practical way to test if a given comparison function satisfies a strict-weak ordering or not. The issue is the requirement of transitivity:

a < b && b < c => a < c

Any test would have to ensure that this holds for all triplets in the sequence being sorted (and you would have to verify the other axioms as well).

The reason why sort can crash if this is violated is because it establishes a sentinel at a partition point and assumes nothing will cross the sentinel - you can bounds check with some additional cost but the ordering is still going to be undefined if this axiom is violated and there is no guarantee a bounds check will catch it.

There are some tests that can be done for common failures though - for example a common failure of sort is trying to sort a floating point sequence with float or doubles that contains NaNs. You can test for NaNs on such a sequence with an additional linear pass. But it doesn’t help if you have a custom comparison that is passed in, that does something like compare the first member of a struct that is of type double.

There are many pre-conditions in STL (and most any other library interface) that cannot easily be validated.

Sean

I don't have the full thread for context.

There you go: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093835.html

There is no practical way to test if a given comparison function satisfies
a strict-weak ordering or not. The issue is the requirement of transitivity:

a < b && b < c => a < c

Any test would have to ensure that this holds for all triplets in the
sequence being sorted (and you would have to verify the other axioms as
well).

Right, that's why I have to intercept qsort to check transitivity.

The reason why sort can crash if this is violated is because it establishes
a sentinel at a partition point and assumes nothing will cross the sentinel
- you can bounds check with some additional cost but the ordering is still
going to be undefined if this axiom is violated and there is no guarantee a
bounds check will catch it.

Ah, good to know, thanks.