Announcing `eudsl v0.0.1`

Merry Christmas, Happy New Year, Happy Hanukkah, and all the other holidays to all the fine MLIR people :slight_smile:

TL;DR

I’d like to announce the first release of eudsl[1].

Currently, the project is the sum total of three components:

  1. eudsl-tblgen: Python bindings to libLLVMTableGen;
  2. eudsl-nbgen: A source-to-source translator that translates MLIR headers[2] into direct nanobind bindings;
  3. eudsl-py: Direct Python bindings to MLIR, generated using eudsl-nbgen.

Before I say a little about why the heavy emphasis on Python, despite previously aspiring to support all/more languages, here’s a colab that demos eudsl-py, which I suppose is the most intriguing part (or the most confusing :person_shrugging:):

Why Python


The stated goals of eudsl were (and continue to be) enabling all language frontends to target MLIR.

My initial idea for achieving that goal was to extend upstream’s libMLIRTableGen. But who wants to write C++ just to munge strings when there are so many better string-munging languages? Hence eudsl-tblgen, a fairly complete binding, direct against the C++ API, of libLLVMTableGen. Note, these first bindings were for libLLVMTableGen (rather than libMLIRTableGen) because one needs to be able to build and manage the actual llvm::RecordKeeper to pass to the various functions in mlir-tblgen.

In order to bootstrap those bindings (i.e., because I’m lazy), I wrote a little source-to-source translator that just blindly emitted stuff like

nb::class_<Record>(m, "Record")
  .def_prop_ro("id", &Record::getID)
  .def_prop_ro("name", &Record::getName)

which is of course made possible by nanobind’s very nice templates. I then massaged those bindings by hand; at that point the plan wasn’t to build out a full source-to-source translator[3].
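To give a flavor of the string munging involved, here is a minimal sketch (not the actual eudsl-nbgen implementation, just an illustration of the naming convention visible in the snippet above): given a C++ class and getter name, derive a snake_case Python property name and emit the corresponding `.def_prop_ro` line.

```python
import re

def emit_prop_binding(class_name: str, method_name: str) -> str:
    """Derive a snake_case property name from a C++ getter
    (e.g. getName -> name, getID -> id) and emit a nanobind
    .def_prop_ro line for it."""
    stem = re.sub(r"^get", "", method_name)
    # Insert underscores at lower->upper boundaries, then lowercase;
    # all-caps runs like "ID" collapse to "id".
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", stem).lower()
    return f'.def_prop_ro("{snake}", &{class_name}::{method_name})'

print(emit_prop_binding("Record", "getName"))
# .def_prop_ro("name", &Record::getName)
print(emit_prop_binding("Record", "getID"))
# .def_prop_ro("id", &Record::getID)
```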

But having done all of this binding, one immediately runs into a problem with the approach (enabling writing ODS backends in Python): ODS isn’t actually a spec, and much of its semantics is buried in the implementation of mlir-tblgen and libMLIRTableGen; e.g., when exactly are InferTypeOpInterface traits emitted? :thinking: So you end up having not only to bind lots of stuff from libMLIRTableGen but also to rewrite lots of stuff in Python against those new bindings. Not fun, and probably all the way at the right end of the spectrum between “high impact” and “vanishing/diminishing returns”.

So what to do?

So here’s what I did: I wrote a clang::ASTFrontendAction to crawl the ODS-generated headers and emit nanobind bindings. And, shockingly enough (primarily because nanobind is so nice), it works pretty well. This post is already pretty long, so I won’t go into the weedy details, but roughly 90% of the methods in 90% of the classes have working generated bindings. The obvious/known absences are templated things like the adaptors

template <typename RangeT>
class AddFOpGenericAdaptor : public detail::AddFOpGenericAdaptorBase

But dialect ops, attributes, types, and enums all work[4]; e.g., if you go to the colab above you will see

shape = SmallVector[np.int64]([10, 10])
f32_ty = Float32Type.get(ctx)
memref_ty = MemRefType.Builder(ArrayRef(shape), f32_ty).memref_type()
td = nvgpu.TensorMapDescriptorType.get(
    ctx,
    memref_ty,
    nvgpu.TensorMapSwizzleKind.SWIZZLE_64B,
    nvgpu.TensorMapL2PromoKind.L2PROMO_64B,
    nvgpu.TensorMapOOBKind.OOB_NAN,
    nvgpu.TensorMapInterleaveKind.INTERLEAVE_16B,
)

# !nvgpu.tensormap.descriptor<
#   tensor = memref<10x10xf32>, 
#   swizzle = swizzle_64b, 
#   l2promo = l2promo_64b, 
#   oob = nan, 
#   interleave = interleave_16b
# >
print(td)

which I’m happy about, because a perennial sore spot with our upstream bindings is that one has to bind all of these by hand.

A few technical notes

  1. Yes, this does require doing the unthinkable: building LLVM with -frtti. Definitely not upstreamable;
  2. This uses libLLVM.so and libMLIR.so and has the nice side-effect that multiple downstream users (of these bindings) could conceivably use the same base set of bindings (assuming same compile flags etc etc etc);
  3. The bindings are generated at build time of the host project, so the compile flags correctly match the flags required by the LLVM distro (I forward LLVM_DEFINITIONS at both parse and build time);
  4. nanobind says it compiles 4x faster, and that might be true (I didn’t compare against pybind11), but it’s still an egregiously long compile by default: a single TU for all the bindings will hit the 6-hour timeout on GHA and takes hours even on my M1 mbp. To compensate, I had to implement a similar sort of sharding as upstream;
  5. Windows isn’t currently supported but could/might be in the future.
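The sharding mentioned in point 4 amounts to splitting the generated per-class bindings across N translation units so they can compile in parallel. A hypothetical sketch of the bucketing step (the names and entry-point scheme here are illustrative, not the actual implementation):

```python
def shard_bindings(class_names, num_shards):
    """Round-robin the per-class binding snippets into num_shards
    buckets, one bucket per translation unit, so the shards can be
    compiled in parallel."""
    shards = [[] for _ in range(num_shards)]
    for i, name in enumerate(class_names):
        shards[i % num_shards].append(name)
    return shards

# Each shard would become its own .cpp with an entry point the main
# module calls at init time (hypothetical layout).
shards = shard_bindings(
    ["AddFOp", "MemRefType", "Float32Type", "TensorMapDescriptorType"], 2
)
print(shards)
# [['AddFOp', 'Float32Type'], ['MemRefType', 'TensorMapDescriptorType']]
```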

Fringe benefits

In “binding all the things” I discovered a lot of dangling header decls from extraClassDeclaration and elsewhere:

[mlir][arith] DCE getPredicateByName

[mlir][scf] DCE unimplemented decls in TDs

[mlir][llvmir] implement missing attrs getChecked

[mlir][xegpu] DCE decl in TD

[mlir][emitc] DCE unimplemented decls

[mlir][linalg] DCE unimplemented extra decl

[mlir][shape] DCE unimplemented extra decl

So at least that’s good.


  1. See this post from last year if you have no idea what eudsl is. ↩︎

  2. Yes, C++ headers… ↩︎

  3. Because for C++ that’s a fool’s errand right? ↩︎

  4. YMMV! ↩︎
