Hi all,
We’d like to suggest adding a -memeq-lib-function flag to allow the user to specify a **memeq()**
function to improve string equality check performance.
Right now, when LLVM encounters a string equality check, e.g. if (memcmp(a, b, s) == 0), it tries to expand it to an inline equality comparison if s is a small compile-time constant, and falls back on calling memcmp() otherwise.
This is sub-optimal because memcmp has to compute much more than equality.
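To make the small-constant case concrete, here is a rough sketch of what the existing expansion amounts to for s == 8. The function names eq8_memcmp and eq8_expanded are made up for illustration, and the actual code LLVM emits depends on the target and size:

```c
#include <stdint.h>
#include <string.h>

/* What the user writes: an equality check with a small constant size. */
static int eq8_memcmp(const void *a, const void *b) {
    return memcmp(a, b, 8) == 0;
}

/* Roughly what the inline expansion amounts to: two 8-byte loads and a
   single compare. memcpy is used for the loads so the sketch stays free
   of alignment and aliasing problems. */
static int eq8_expanded(const void *a, const void *b) {
    uint64_t x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return x == y;
}
```

When s is only known at run time, no such expansion is possible, which is the case this proposal targets.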
We propose adding a way for the user to specify a memeq library function (e.g. -memeq-lib-function=user_memeq), which will be called instead of memcmp() when the result of the memcmp call is only used for equality comparison.
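As an illustration of which call sites would qualify, consider the sketch below. The signature and return convention of user_memeq (zero if and only if the buffers are equal, so it can stand in for memcmp in an equality context) are assumptions made for this example, as are the names same and less:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-provided function. Assumed convention: returns 0 if
   and only if the two buffers are equal; any nonzero value otherwise. */
int user_memeq(const void *a, const void *b, size_t n) {
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; ++i)
        if (p[i] != q[i])
            return 1;
    return 0;
}

/* Qualifies: the memcmp result is only compared against zero, so the
   call could be redirected to user_memeq. */
int same(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}

/* Does not qualify: the sign of the result is used, so the full
   lexicographic memcmp is still needed. */
int less(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) < 0;
}
```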
memeq can be made much more efficient than memcmp because equality comparison is trivially parallel while lexicographic ordering has a chain dependency.
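A minimal sketch of that difference, in portable C. This is not our optimized implementation; ordered_cmp and sketch_memeq are illustrative names, and a real version would likely use wider vector loads:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ordering: the result is decided by the FIRST differing byte, so each
   step depends on all earlier bytes having compared equal. */
static int ordered_cmp(const unsigned char *a, const unsigned char *b,
                       size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Equality: every 8-byte chunk can be XORed independently and the
   differences OR-ed together in any order. Returns 0 iff equal. */
static int sketch_memeq(const void *a, const void *b, size_t n) {
    const unsigned char *p = a, *q = b;
    uint64_t diff = 0;
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t x, y;
        memcpy(&x, p + i, 8);
        memcpy(&y, q + i, 8);
        diff |= x ^ y; /* no need to know where the difference is */
    }
    for (; i < n; ++i) /* remaining tail bytes */
        diff |= (uint64_t)(p[i] ^ q[i]);
    return diff != 0;
}
```

The equality version never needs to locate the differing byte or compare its value, which is what leaves room for wide, independent loads.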
We measured a very large improvement from this approach on our internal codebase. A significant portion of this improvement comes from the STL, typically std::string::operator==().
Note that this is a backend-only change. Because the C family of languages does not have a standard memeq() (POSIX used to have bcmp(), but it was marked legacy in POSIX.1-2001 and removed in POSIX.1-2008), C/C++ code cannot communicate the equality comparison semantics to the compiler.
We did not add an RTLIB entry for memeq because the user environment is not guaranteed to contain a memeq() function, as libc has no such concept.
If there is interest, we could also contribute our optimized memeq to compiler-rt.
A proof-of-concept patch for this RFC can be found here: https://reviews.llvm.org/D56248
Comments and suggestions welcome!
Thanks,
Clement