Motivation
The `fold` method of an operation is likely one of the most used and fundamental methods within MLIR. Ergonomically, however, its API has some drawbacks.
As a quick reminder, the `fold` function signature of an op may be either:

- `OpFoldResult fold(ArrayRef<Attribute> operands)` for a single-result op
- `LogicalResult fold(ArrayRef<Attribute> operands, SmallVectorImpl<OpFoldResult> &results)` otherwise
The issue lies in the first parameter of each signature. The `ArrayRef<Attribute>` is an array whose size equals the number of operands of the op being folded. Each attribute is accessed by indexing into the array with the index of the operand whose (potentially constant) value should be read.
API-wise, this has the following drawbacks from the perspective of a `fold` implementor:

- Indexing into the `ArrayRef` requires knowledge of how the op was declared in TableGen, namely the operand's position/index within the input list. This is further complicated by the existence of `Variadic` and `Optional`, which can lead to odd-looking indexing expressions (see the sketch below).
- Indexing into the `ArrayRef` is not resilient to refactoring and extension of the op. Adding a new operand in TableGen at any position but the last requires going through the `fold` method and reevaluating all indexing expressions, as the indices of the operands have changed.
- The code is not very descriptive. From TableGen-generated ops we are used to convenient getters with descriptive names derived from the parameter names given in TableGen; `getLhs()` is a lot more descriptive than `operands[0]`.
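To illustrate the first two points, here is a minimal sketch (the op, its TableGen declaration, and its getters are hypothetical and not taken from the proposal) of the index arithmetic that optional and variadic operands force on a `fold` implementation:

```cpp
OpFoldResult MyOp::fold(ArrayRef<Attribute> operands) {
  // Hypothetical op declared roughly as
  //   (ins AnyInteger:$value, Optional<Index>:$offset, Variadic<Index>:$indices)
  // Whether $offset is present shifts the index of everything after $value.
  bool hasOffset = static_cast<bool>(getOffset());
  auto value = dyn_cast_if_present<IntegerAttr>(operands[0]);
  auto offset = hasOffset ? dyn_cast_if_present<IntegerAttr>(operands[1])
                          : IntegerAttr();
  ArrayRef<Attribute> indexAttrs = operands.drop_front(hasOffset ? 2 : 1);
  // ... folding logic elided ...
  return {};
}
```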
Proposed changes
The solution to the above problem is a wrapper class that maps getter methods to the precise indices used for accessing operands.

We happen to already generate such a class! The Adaptor classes generated with every op do essentially just that: they provide the various getter methods and, using the attributes and codegen from TableGen, calculate the correct indices for returning an operand or operand range. The only thing required is to generalize the concept to work on any range, not just `mlir::ValueRange`, and most importantly on `ArrayRef<Attribute>`.
The implementation in this proposal splits the Adaptor class into three separate classes, connected by inheritance:

- A `{OpName}GenericAdaptorBase` within a `detail` namespace. This class contains all value-range-independent methods and fields, that is, the `DictionaryAttr`, the `RegionRange`, the optional operation name, and all attribute getters.
- A `{OpName}GenericAdaptor` class with a template parameter `typename RangeT`. This inherits from the aforementioned class and adds getters for ALL operands, returning the element type of the range passed in. The requirements on `RangeT`, as of my current implementation, are simply being iterable, having a `.size()` method, and being constructible from an iterator pair (of its own iterator type at the very least). These lax requirements are met by pretty much all containers, but most importantly by `ArrayRef` and `ValueRange` (a sketch of a conforming custom range follows further below).
- The `{OpName}Adaptor` class, i.e. the same class name as previously used for the Adaptors. It inherits from `{OpName}GenericAdaptor<ValueRange>` and adds constructors from `{OpName}` as well as the `verify` method implementation.
The last class, having the same name and, through inheritance, the exact same API surface as before, achieves source compatibility with the old Adaptor class.

The rationale for the `GenericAdaptorBase` class is partly compile time, as these methods do not need to be template-instantiated, but it is also required because the adaptors' index-calculating methods have a circular dependency on ops if they have the `AttrSizedOperandSegments` trait.
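For illustration, here is a minimal sketch of a custom range type that would satisfy the `RangeT` requirements stated above; the type is purely illustrative and not part of the proposal:

```cpp
// Iterable, has .size(), and is constructible from a pair of its own iterators.
struct AttrVectorRange {
  using iterator = const mlir::Attribute *;

  AttrVectorRange(iterator begin, iterator end) : storage(begin, end) {}

  iterator begin() const { return storage.begin(); }
  iterator end() const { return storage.end(); }
  size_t size() const { return storage.size(); }

  llvm::SmallVector<mlir::Attribute> storage;
};
```

Such a type could then be used as `{OpName}GenericAdaptor<AttrVectorRange>`, just like `ArrayRef<Attribute>` or `ValueRange`.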
TableGen Output Example of affine::PrefetchOpAdaptor before and after
Before:

```cpp
class AffinePrefetchOpAdaptor {
public:
AffinePrefetchOpAdaptor(::mlir::ValueRange values, ::mlir::DictionaryAttr attrs = nullptr, ::mlir::RegionRange regions = {});
AffinePrefetchOpAdaptor(AffinePrefetchOp op);
std::pair<unsigned, unsigned> getODSOperandIndexAndLength(unsigned index);
::mlir::ValueRange getODSOperands(unsigned index);
::mlir::Value getMemref();
::mlir::ValueRange getIndices();
::mlir::ValueRange getOperands();
::mlir::DictionaryAttr getAttributes();
::mlir::BoolAttr getIsWriteAttr();
bool getIsWrite();
::mlir::IntegerAttr getLocalityHintAttr();
uint32_t getLocalityHint();
::mlir::BoolAttr getIsDataCacheAttr();
bool getIsDataCache();
::mlir::LogicalResult verify(::mlir::Location loc);
private:
::mlir::ValueRange odsOperands;
::mlir::DictionaryAttr odsAttrs;
::mlir::RegionRange odsRegions;
::llvm::Optional<::mlir::OperationName> odsOpName;
};
```
After:

```cpp
namespace detail {
class AffinePrefetchOpGenericAdaptorBase {
protected:
::mlir::DictionaryAttr odsAttrs;
::mlir::RegionRange odsRegions;
::std::optional<::mlir::OperationName> odsOpName;
public:
AffinePrefetchOpGenericAdaptorBase(::mlir::DictionaryAttr attrs = nullptr, ::mlir::RegionRange regions = {});
std::pair<unsigned, unsigned> getODSOperandIndexAndLength(unsigned index, unsigned odsOperandsSize);
::mlir::DictionaryAttr getAttributes();
::mlir::BoolAttr getIsWriteAttr();
bool getIsWrite();
::mlir::IntegerAttr getLocalityHintAttr();
uint32_t getLocalityHint();
::mlir::BoolAttr getIsDataCacheAttr();
bool getIsDataCache();
};
} // namespace detail
template <typename RangeT>
class AffinePrefetchOpGenericAdaptor : public detail::AffinePrefetchOpGenericAdaptorBase {
using ValueT = ::llvm::detail::ValueOfRange<RangeT>;
using Base = detail::AffinePrefetchOpGenericAdaptorBase;
public:
AffinePrefetchOpGenericAdaptor(RangeT values, ::mlir::DictionaryAttr attrs = nullptr, ::mlir::RegionRange regions = {}) : Base(attrs, regions), odsOperands(values) {}
std::pair<unsigned, unsigned> getODSOperandIndexAndLength(unsigned index) {
return Base::getODSOperandIndexAndLength(index, odsOperands.size());
}
RangeT getODSOperands(unsigned index) {
auto valueRange = getODSOperandIndexAndLength(index);
return {std::next(odsOperands.begin(), valueRange.first),
std::next(odsOperands.begin(), valueRange.first + valueRange.second)};
}
ValueT getMemref() {
return *getODSOperands(0).begin();
}
RangeT getIndices() {
return getODSOperands(1);
}
RangeT getOperands() {
return odsOperands;
}
private:
RangeT odsOperands;
};
class AffinePrefetchOpAdaptor : public AffinePrefetchOpGenericAdaptor<::mlir::ValueRange> {
public:
using AffinePrefetchOpGenericAdaptor::AffinePrefetchOpGenericAdaptor;
AffinePrefetchOpAdaptor(AffinePrefetchOp op);
::mlir::LogicalResult verify(::mlir::Location loc);
};
```
Generated ops would then simply get an additional typedef, which I tentatively call `FoldAdaptor`, and which is an alias for `{OpName}GenericAdaptor<::llvm::ArrayRef<::mlir::Attribute>>`. For other code wanting to make use of an op's `GenericAdaptor`, a templated `using` declaration is added to the op as well.
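Concretely, the additions to a generated op class could look roughly like the following sketch (the exact spelling of the generated code may differ):

```cpp
class AffinePrefetchOp : public ::mlir::Op<AffinePrefetchOp /*, traits... */> {
public:
  // Templated alias so other code can plug in any suitable range type.
  template <typename RangeT>
  using GenericAdaptor = AffinePrefetchOpGenericAdaptor<RangeT>;
  // Alias used by the new fold signature.
  using FoldAdaptor = GenericAdaptor<::llvm::ArrayRef<::mlir::Attribute>>;
  // ...
};
```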
The new fold API would then be:

- `OpFoldResult fold(FoldAdaptor adaptor)` for a single-result op
- `LogicalResult fold(FoldAdaptor adaptor, SmallVectorImpl<OpFoldResult> &results)` otherwise
Example fold of index::CmpOp before and after
Before:

```cpp
OpFoldResult CmpOp::fold(ArrayRef<Attribute> operands) {
assert(operands.size() == 2 && "compare expected 2 operands");
auto lhs = dyn_cast_if_present<IntegerAttr>(operands[0]);
auto rhs = dyn_cast_if_present<IntegerAttr>(operands[1]);
if (!lhs || !rhs)
return {};
...
}
```
After:

```cpp
OpFoldResult CmpOp::fold(FoldAdaptor adaptor) {
auto lhs = dyn_cast_if_present<IntegerAttr>(adaptor.getLhs());
auto rhs = dyn_cast_if_present<IntegerAttr>(adaptor.getRhs());
if (!lhs || !rhs)
return {};
...
}
```
Compatibility and Opt-in mechanism
For the purpose of initial source compatibility, the new API would be disabled by default.

This is the part I am most unsure about, but my idea for an opt-in mechanism is to repurpose the `hasFolder` field of `Op` by making it an `int` (effectively an enum) with one of three values:
- `kEmitNoFolder` (0): does not emit any fold method
- `kEmitRawAttributesFolder` (1): emits the old `ArrayRef<Attribute>` fold
- `kEmitFoldAdaptorFolder` (2): emits the new `FoldAdaptor` fold
The choice of values ensures source compatibility. Upstream and downstream `fold` methods could then be gradually migrated to the `FoldAdaptor` fold method. If the old API is deemed strictly worse, it could be removed at a future time.

For reference, `git grep "::fold(" | wc -l` yields approximately 256 methods upstream that would require a signature change if fully migrated.
Additional Pros
- The generic adaptor is designed to take any range with any default-constructible/nullable element type. Any kind of code requiring a mapping of an op's operands to some other kind of value would strongly benefit from this nicer API. An example would be the LLVM exporter, which could make use of `{OpName}::GenericAdaptor<llvm::ArrayRef<llvm::Value*>>` to easily map getter names to LLVM values (see the sketch below).
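As a hedged sketch of what that could look like (the op `MyAddOp`, its getters, and the surrounding translation function are hypothetical, not part of the proposal):

```cpp
llvm::Value *exportAdd(MyAddOp op, llvm::ArrayRef<llvm::Value *> llvmOperands,
                       llvm::IRBuilderBase &builder) {
  // Reuse the TableGen-computed operand layout over already-translated
  // llvm::Value*s instead of indexing llvmOperands manually.
  MyAddOp::GenericAdaptor<llvm::ArrayRef<llvm::Value *>> adaptor(
      llvmOperands, op->getAttrDictionary());
  return builder.CreateAdd(adaptor.getLhs(), adaptor.getRhs());
}
```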
Notes
- The extra generated code, by itself, seems to have little to no impact on compile time. As a benchmark, I compiled the `TestDialect` with `clang-cl` in release mode on my Windows machine (i9-13900KF), both with and without the TableGen-related code for this proposal. Using hyperfine as the benchmarking tool I got:
  - Before: 42.881s +/- 0.479s
  - After: 42.532s +/- 0.471s

  As you can see, the difference is statistically insignificant and essentially equal (within noise levels). That said, making additional uses of the generic adaptors would certainly incur the compile-time cost of instantiating the template.
TLDR
A new opt-in fold API using newly TableGen-generated generic adaptors, allowing getters to be used instead of indexing, with no initial compile-time impact and full source compatibility with existing code. The new generic adaptors are also useful for other op operand mapping tasks, such as abstract interpretation or exporters like the LLVM exporter.
Implementation Status
I have an initial implementation that fully realizes the semantics and APIs described above; it just needs more tests and general code cleanup. I will post a link to the review here as soon as it is on Phabricator. Until then, I thought I'd post this RFC for more visibility, discussion, and feedback.