Improvements to std::find and std::count

Hi,

I recently posted a question on StackOverflow regarding the performance of std::find and std::count when provided with char* input: http://stackoverflow.com/questions/43483378/why-arent-stdcount-and-stdfind-optimised-to-use-memchr.

I would propose adding overloads of these functions for char* and const char* inputs that delegate to memchr, something like:

inline const char* find(const char* first, const char* const last, const char value)
{
const auto result = std::memchr(first, value, last - first);
return result != nullptr ? static_cast<const char*>(result) : last;
}

inline typename std::iterator_traits<const char*>::difference_type
count(const char* first, const char* const last, const char value)
{
typename std::iterator_traits<const char*>::difference_type result {0};
while (first && first != last) {
if ((first = static_cast<const char*>(std::memchr(first, value, last - first)))) {
++result;
++first;
}
}
return result;
}

I’ve never contributed to LLVM, so I’m not sure how to proceed, if this is a change that is likely to be accepted?

Best,
Dan

Adding cfe-dev. Removing llvm-dev. Frontend and libary discussions are better represented on that list.

Adding cfe-dev. Removing llvm-dev. Frontend and libary discussions are
better represented on that list.

~Craig

Hi,

I recently posted a question on StackOverflow regarding the performance
of std::find and std::count when provided with char* input:
c++ - Why aren't std::count and std::find optimised to use memchr? - Stack Overflow
-stdcount-and-stdfind-optimised-to-use-memchr.

I would propose adding overloads of these functions for char* and const
char* inputs that delegate to memchr, something like:

inline const char* find(const char* first, const char* const last, const
char value)
{
const auto result = std::memchr(first, value, last - first);
return result != nullptr ? static_cast<const char*>(result) : last;
}

inline typename std::iterator_traits<const char*>::difference_type
count(const char* first, const char* const last, const char value)
{
typename std::iterator_traits<const char*>::difference_type result {0};
while (first && first != last) {
if ((first = static_cast<const char*>(std::memchr(first, value, last -
first)))) {
++result;
++first;
}
}
return result;
}

I’ve never contributed to LLVM, so I’m not sure how to proceed, if this
is a change that is likely to be accepted?

If the change *measurably* improves performance while maintaining
correctness then yes, it is very likely to be accepted.

The first step would be to write a benchmark that demonstrates the
performance difference by adding a test in
`libcxx/benchmarks/algorithms.bench.cpp`.
Personally I would write two different benchmarks, one that uses
`std::find` and another that uses `std::memchr` on the same set of input.

/Eric

True, but we might also want to consider whether llvm’s loop idiom recognition pass should be able to catch this.