[ADT] is_splat and empty ranges

Hi folks,

Recently I touched a bit of code related to llvm::is_splat (introduced in D49958) and found it surprising that empty ranges are not considered splat. Currently, is_splat is implemented as follows:

/// Wrapper function around std::equal to detect if all elements
/// in a container are same.
template <typename R>
bool is_splat(R &&Range) {
  size_t range_size = size(Range);
  return range_size != 0 && (range_size == 1 ||
         std::equal(adl_begin(Range) + 1, adl_end(Range), adl_begin(Range)));

Going by the comment, intuitively, I’d say that all elements are same when we can’t find a counterexample, i.e., two elements that are not equal. This seems consistent with llvm::all_of that returns true when the input range is empty.

I haven’t heard of the concept of splat before, so I’m not sure if it follows some well-established definition that makes it false for empty ranges.

I went ahead and changed the definition to consider empty ranges splat and both check-llvm and check-mlir passed for me. I turned the zero-size case into an assertion and there were no failures either. With that, would it make sense to remove this special case if it is not intentional?

cc: @shchenz, @LebedevRI, @dblaikie

Got any details of the history of the change? Who committed it (pointers to the revision/review might be helpful)? Where it was first used?

It was introduced in D49958 to fix misues of std::equal.

Cool - yeah, I’d be in favor of/happy to approve changing the semantics to make the empty list return true here. Be nice to have some unit tests while you’re there, if you want to add them.

& I wonder if “splat” is sufficiently idiomatic/known term or whether this perhaps should have some other name - like all_same or all_equal or something?

I have similar doubts and like all_equal a lot.

I propose the following:

  1. Add all_equal to STLExtras.h and make is_splat check for zero and call all_equal otherwise.
  2. Mass-replace all uses of is_splat with all_equal.
  3. Mark is_splat as deprecated.
  4. Wait a couple of weeks and remove is_splat.

Does this sound reasonable to folks?

Sounds good to me - thanks!

I don’t know about this. “Splat” is a known term that is already used in other places in LLVM, so I’m not sure if renaming it is a good idea. It’s a pretty standard name for the operation (alongside with “broadcast”).

Edit: on the other hand, C++ library already has all_of, so maybe all_equal is good name.

Is there a definition somewhere that covers the empty case?

Splat takes a value and produces a sequence where this value is repeated a number of times. It’s typically a vector operation, where you produce a vector with the same value in each lane, so the case of the empty sequence doesn’t usually apply. After thinking about it some more, I think that changing the name is probably a good plan after all.

“splat” or “broadcast” - they’re the same - are commonly used names for this operation on vectors.

However, I think you’re mixing two things here. A splat or broadcast specifically is only defined when there is a single value. The operation which is either a splat or an empty vector is something else. It might be reasonable to call that all_equal, or something else, but it’s not a splat in the typical sense.

As such, by renaming splat to all_equal you’re actually loosing information conveyed by the name. Assuming the current is_splat interface actually checks for a splat operation, please leave it as is.

I think it would be useful to decouple the meaning of splat in vector operations and as a general C++ utility function used in the broader context of llvm-project. I did an experiment and added an assertion to see if any existing test relies on the existing behavior for empty ranges, and there is none that passes an empty range at all (modulo the unit test for is_splat itself). This makes me think that the old name may be somewhat accidental and not strictly related to the context of vector operations, and from a perspective of being useful/familiar to C++ developers, having an all_equal predicate would be better.

If some vector-related code needs to check for splat operations, it could easily define their own predicate or directly check !empty(R) && all_equal(R). If we introduce all_equal, I would rather not have it alongside the existing is_splat in STLExtras.h, because folks will probably sometimes use is_splat out of habit not realizing there is a semantic difference between the two.

1 Like

Thanks for all the inputs. I started a revision stack to gradually replace is_splat with all_equal:

  1. ⚙ D132334 [ADT] Add all_equal predicate
  2. ⚙ D132335 [ADT] Deprecate is_splat and replace all uses with all_equal
  3. ⚙ D132336 [ADT] Remove is_splat