[RFC] Extend ranges infrastructure to better match C++20

I’d like to propose some minor extensions to enable functionality resembling C++20 ranges. We already have a bunch of the heavy coding done in STLExtras.h, but we are missing some features which make ranges a little more accessible and consistent with C++20.

It seems all the current range-like code generates common_ranges (i.e. ranges with the same iterator type representing both begin+end). We could continue this practice into new infrastructure to reduce complexity + compile-time and maximize compatibility with existing code.

I think it makes sense to adopt an API which mirrors C++20 ranges. That is, view adapters are expressed as constexpr inline variables with call + pipe operator overloads (likely in a new namespace to avoid name collisions… call it llvm::views). View adapters implement both functional (views::reverse(A)) and a pipe syntax (A | views::reverse). The pipe syntax enables more ergonomic code when closures get complex.

Example:

auto R = reverse(drop(take(Vec, 2), 3));

vs

auto R = Vec | take(2) | drop(3) | reverse;

Additionally, users should be able to store a series of pipe-syntax adapters to a named variable or return the collection of adapters from a function:

auto AdapterSeq = take(2) | drop(3) | reverse;
auto R = Vec | AdapterSeq;

If you’d like to have an idea of the complexity of the code involved, I created a Phabricator review which implements the pipe syntax described above. It uses a single template class (well two specializations of one class) to implement all the view adapters, which makes adding new view adapters very simple once you have your custom iterator / range implemented. It also implements for named variable storage of an adapter sequence (although I realize now that I could use a std::tuple and a fold expression to simplify this code somewhat). This does not modify llvm::iterator_range as described above since I would like community approval before moving forward with that implementation.

The following are the steps in the patch series as I see them:

Bug fix(es) in existing range code:

  1. Iterator types which carry a callable object are not currently copy-assignable (llvm::mapped_iterator, llvm::filter_iterator, etc.).
    • Example compile failure.
    • See this explanation as to why the ranges library uses std::optional to encapsulate the callable object.
    • Can easily make these iterators default constructible with std::optional encapsulation.
    • No need to use std::optional if std::is_function_v is true… just store as a pointer (or std::reference_wrapper if default construction is not a priority.
  2. Audit existing eager evaluation to ensure behavior matches C++20 spec.
    • For example, llvm::drop_begin() is evaluated eagerly while std::ranges::views::drop is evaluated lazily.
  3. Use llvm::adl_begin() and llvm::adl_end() when forming ranges instead of member functions.

Create a place to house concept-like code.

  1. Particularly any utilities used to determine whether a type is a forward iterable range and iterator types associated with a container-like type.

Proposed llvm::iterator_range-related updates:

  1. Create select different base class / implementation layer depending on category of the iterator type.
    • For example, a iterator of category std::forward_iterator_tag selects a base class of llvm::forward_range.
    • More member functions are layered into the selected base class as the category advances (operator[] for random-access, etc).
  2. Add an untemplated tag class as the base class for all iterator_range classes.
    • Helps to identify any class derived from iterator_range or one of its bases in SFINAE.

Implement pipe syntax:

  1. Define the view adapter implementation template.
  2. Adopt new view adapter implementation for existing range adapter functions.
    • Possibly define adapters in a new namespace to allow correlation between LLVM view adapter names and C++20 names (i.e. llvm::views::transform instead of llvm::map_range).
  3. Define the adapter sequence template to enable storage of adapters to a named variable.
    • Example:
auto Adaters = reverse | elements<2> | drop_while([ ](auto X){ return X<10; });

Implement to() function from C++23 spec (at least overloads which create a container).

  1. Enable simple copy of range elements to a specified container type.
  2. Example:
std::list<char> L{ 'a', 'B', 'c', 'D', 'e', 'F };
auto IsLower = [](char X){ return X > 'a' && X <= 'z'; };
auto Str = L | filter(IsLower) | transform(::toupper) | to<SmallString<10>>();

Benefits of more C++20-like ranges infrastructure:

  1. More useful ranges with additional category-specific member functions.
  2. Parity with C++20 spec makes eventual C++20 adoption simpler.
  3. Potential to write more ergonomic + efficient code.

Concerns mentioned in Phabricator

  1. Compile-time with range-heavy code. Might make it easier for devs to inflate build times.

I figured I could kick off this work with some of the (hopefully) less controversial items.

[ADT] Make mapped_iterator copy assignable

  • fixes compilation errors when using this iterator type with certain STL code.
  • Example compile error: Compiler Explorer