[RFC] Introduce OpAsm{Type,Attr}Interface for pretty-print in AsmPrinter

TLDR: similar to OpAsm{Op,Dialect}Interface, OpAsm{Type,Attr}Interface is for pretty-printing ASM names/alias based on Type/Attribute.

Demo implementation in [mlir] Add OpAsmType/AttrInterface · ZenithalHourlyRate/llvm-project@c220063 · GitHub

Example

func.func @result_name_from_op_asm_type_interface() {
  // CHECK: %op_asm_type_interface =
  %0 = "test.default_result_name"() : () -> !test.op_asm_type_interface
  return
}

Recap on OpAsm{Op,Dialect}Interface

struct OpAsmOpInterface {
  void getAsmResultNames(...);
  void getAsmBlockArgumentNames(...);
};

struct OpAsmDialectInterface {
  AliasResult getAlias(Attribute attr, raw_ostream &os);
  AliasResult getAlias(Type type, raw_ostream &os);
};

Dialect and Op all have their own ability to hook into AsmPrinter to get custom name/alias. Yet the default implementation is “owned” by the Op/Dialect such that outside user could not affect the behavior.

Proposed Change

struct OpAsmTypeInterface {
  StringRef getAsmName();
};

struct OpAsmAttrInterface {
  StringRef getAsmName();
};

// Default impl of OpAsmOpInterface use the Interface to generate name
// Default impl of OpAsmDialectInterface use the Interface to generate alias,
//   if the Dialect owns the Type/Attr

Motivation

Op does not always recognize its operands/results/block arguments (from other Dialect), so it might be better to provide an interface for it to know how to generate name for them.

For example, for a generic op like func::FuncOp, we would want its argument to be meaningful instead of always being argX. With this change, we can get ASM output like the belowing (taken from bgv-to-lattigo: lower client-interface and plain op by ZenithalHourlyRate · Pull Request #1226 · google/heir · GitHub)

func.func @dot_product(
  %evaluator: !evaluator,
  %params: !params,
  %encoder: !encoder,
  %ct: !ct,
  %ct_0: !ct) -> !ct

For Dialect, instead of TypeSwitch on Type/Attr in dialect implemenentaion for OpAsmDialectInterface (see heir/lib/Dialect/LWE/IR/LWEDialect.cpp at 276bc7c53ad9b074ad3b8dde2756b1980e4c2707 · google/heir · GitHub), we can directly define these names in .td together with the Type/Attr definition so it is more manageable.

Impact

This is opt-in as the user has to add the Interface to its own Type/Attr. User could also implement their own {Op,Dialect}Interface to avoid/extend the default impl behavior. Existing user/downstream will not see change if they did not encounter any Type/Attr with this interface.

Review

Cc @j2kun, @AlexanderViand-Intel and @asraa for downstream HEIR project. Cc @River707 @mehdi_amini for relevent commit history.

3 Likes

I decide to separate it into the following parts, for ease of reviewing/merging.

To respect the this separation, the API design now becomes the following, which is aligned with the getAsmResultNames and getAlias API design in OpAsm{Op/Dialect}Interface

struct OpAsmTypeInterface {
  void getAsmName(OpAsmSetNameFn);
  AliasResult getAlias(raw_ostream &os);
};

struct OpAsmAttrInterface {
  AliasResult getAlias(raw_ostream &os);
};

The first part is now in [mlir] Add OpAsmTypeInterface for pretty-print by ZenithalHourlyRate · Pull Request #121187 · llvm/llvm-project · GitHub

I’m supportive of this change, but let’s give folks time to react. Given the season, we should think O(weeks) rather than O(days).

If the change documenting the existing interfaces is separable, let’s land that first.

Downstream has an implementation of TypeAsmInterface in use type interface for result asm suggestions by copybara-service[bot] · Pull Request #1241 · google/heir · GitHub, similar to OpAsmTypeInterface proposed here.

One improvement of it is using tablegen like let nameSuggestion = "ct"; to define simple getAsmName. Though there are hacks to make OpAsmOpInterface working with TypeAsmInterface for func::FuncOp.

As for the documentation part, I opened a PR on OpAsmDialectInterface in [mlir][docs] Guide on generating alias for type/attribute by ZenithalHourlyRate · Pull Request #121698 · llvm/llvm-project · GitHub as a starting point.

I’m not sure TBH. So would argue both sides :slight_smile:

I feel the list in descending order of who could provide the most meaningful names are 1) the user (using the NamedLoc feature Maks just added - which I believe is able to allow you to specify func arg names), 2) the op producing the value (which the dialect of the op has knowledge of), 3) the type. But the type is rather low level knowledge (vector or tensor vs accumulating register or weight matrix).

One has the type printed already and the editor plugins (like those driven by the LSP) also shows the type on hover. The prefix is something non-unique that type could dictate. It does probably make it easier locally if you prefixed with (say) t tensors, and for uncommon types it makes things very obvious. In the tensor case it only helps if one has non-tensors too, else its just a uniform prefixing.

Although as you did, one could also do that with a dialect interface & result names (for everything except args, where you’d need something like (1)). Now that wouldn’t help if you have different dialects with different naming conventions, although I don’t think this proposal would help there either as dialect/op has priority on naming of result. So yes then it is just making a TypeSwitch into a type interface.

1 Like

descending order 1) the user 2) the op 3) the type

I acknowledge the order. I assume the user includes 1) end-user and 2) compiler passes manipulating the IR.

The inconvenience with the NamedLoc approach I would encounter is that 1) not all MLIR input starts with NamedLoc info, so the compiler has no information 2) for newly created value, the NamedLoc then should be provided by the pass in some way, then the pass becomes responsible, but we should not rewrite all passes to be aware of this.

I assume the AsmPrinter would get the name in the following order: NamedLoc → Op → Dialect → Type, so having both NamedLoc/Type does not conflict.

After all, the TypeInterface is only meant for providing default prefix when there is no information at all, and I think a prefix is always better than a plain %0.

editor plugins (like those driven by the LSP) also shows the type on hover

When there is a ton of %0 and there are various types, having a prefix would clearly distinguish them.

So yes then it is just making a TypeSwitch into a type interface.

The whole point of making a type interface is “one dialect is not fully aware of another dialect when they encounter foreign values”. It is not equivalent to a TypeSwitch where the caller (e.g. FuncOp) needs to enumerate all possible types.

(for everything except args, where you’d need something like (1))

NamedLoc does not solve the problem as when the compiler has no information, we have to get it in some way, and the type nterface is a safe place to ask.

In general supportive of adding the interfaces for attribute and types, definitely a missing need here.

The one part of the API I don’t quite understand is the void getAsmName(OpAsmSetNameFn); API. Couldn’t the OpAsmPrinter just check to see if the type has an alias, and then use that for the value? The intermixing of value and type naming is a bit weird to me in the interface, it would be good to disconnect them as if we can.

Couldn’t the OpAsmPrinter just check to see if the type has an alias, and then use that for the value?

This is derived from downstream need, where a more verbose naming for type is needed and a simple shorthand for SSA name is wanted. Check the following example

!ct_L0_ = !lwe.new_lwe_ciphertext...
!ct_L1_ = !lwe.new_lwe_ciphertext...
!ct_L2_ = !lwe.new_lwe_ciphertext...
  func.func @simple_ckks_bootstrapping(%cc: !openfhe.crypto_context, %ct: !ct_L2_) -> !ct_L2_ {
    %ct_0 = openfhe.mod_reduce %cc, %ct : (!openfhe.crypto_context, !ct_L2_) -> !ct_L1_
    %ct_1 = openfhe.mod_reduce %cc, %ct_0 : (!openfhe.crypto_context, !ct_L1_) -> !ct_L0_
    %ct_2 = openfhe.bootstrap %cc, %ct_1 : (!openfhe.crypto_context, !ct_L0_) -> !ct_L2_
    return %ct_2 : !ct_L2_
  }

Agreed (highlighting mine). But I’d also add that the various types are probably from the same application wherein unique. Some letter may get more contention (for example, you couldn’t use c as prefix in your code as it could look like convention followed for constant), but that’s not an implementation issue. And in some domains I think almost every op will produce same type (your example below has only contexts, so its only a consistent prefix, same for most tensor programs). That’s not a blocker, just what I consider scoping of value (so I’d see less readability gains where there is only one type or multiple types with same prefix; in the latter case it may even add an extra disambiguation step for reader).

In the case where the ops/dialect don’t yes. One are sets of conventions baked into compiler and the other is from user input/compiler fronten. And it is indeed convenient where useful.

This was again where we don’t have ops produce naming too. In your example in your repo you had the ops producing names, and in doing so passes need not be modified. So the argument of decoupling I find stronger motivator.

Goal being, having consistent naming based purely on type across multiple dialects with common type system with low implementation effort. I was going to say no coupling additional needed, but that is only true for ops that produce those values rather than propagate (bb args etc). And for this goal, this would seem to have low implementation/compile time side impact while enabling users. Which SGTM.