[npcomp] Next steps for Torch IR -> ATen dialect

stellaraccident · September 11, 2020, 7:16pm

We’ve been having parts of a running design discussion on the #mlir-npcomp discord channel, which is a bad habit that I’m breaking here. Including some excerpts for context and to continue.

@powderluv:

Folks - just catching up on the discussions here and trying to understand the current and future design, so heads up on some noob questions. We currently have a path down from TF2XLA → MLIR → Our Custom Dialects → “Our Distributed / fine grained parallel” runtime → Hardware. We also have a functional TorchScript → Hacks → runtime path that works well. To get rid of the hacks we have been playing with npcomp → MLIR (ATen) with the PyTorch 1.3 frontend → … → Our Custom Dialects → Runtime → Hardware. However, we don’t particularly want to expose a pseudo-device, since our runtime tries to abstract the algorithm from the concept of a device; that is a late binding runtime construct. Reading the comments earlier in this channel, there was some interest in TorchScript IR <> ATen MLIR Dialect → . Is that still an option?
Also, playing with TorchScript IR we have hit a few limitations, like the one in “Convert dlrm to torch.jit.script model · Issue #69 · facebookresearch/dlrm · GitHub”, because the supported types are a subset (see “TorchScript Language Reference — PyTorch master documentation”).

@_sean_silva

@powderluv yep, that is “4b” that I mentioned in Discord (AST Extraction is kind of a synonym for TorchScript in this context). My/Stella’s team are very interested in that direction, since it is important for mobile deployment. I really like how you described the device concept as “late binding runtime construct” – I’m going to steal that from now on

Yes, going directly from TorchScript IR to MLIR entirely at the C++ level is an option (and a good one in my opinion). The TorchScript IR is actually very, very similar to MLIR structurally – a straightforward traversal / conversion is very possible.
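To make the structural similarity concrete, here is a minimal sketch in plain Python. It does not use the real TorchScript or MLIR APIs: `Node`, `Block`, and `emit_block` are hypothetical stand-ins, and the emitted text only imitates MLIR's generic op form. The point is just that nodes with SSA inputs/outputs and nested blocks map one-to-one onto ops with operands/results and regions, so a single recursive walk is enough.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """Toy stand-in for a TorchScript block: an ordered list of nodes."""
    nodes: List["Node"] = field(default_factory=list)


@dataclass
class Node:
    """Toy stand-in for a TorchScript node (hypothetical, not the torch API)."""
    kind: str                 # e.g. "aten::relu" or "prim::If"
    inputs: List[str]         # names of the SSA values consumed
    outputs: List[str]        # names of the SSA values produced
    blocks: List[Block] = field(default_factory=list)


def emit_block(block: Block, indent: int = 0) -> List[str]:
    """Walk a block in order and emit generic-form-like text for each node."""
    pad = "  " * indent
    lines: List[str] = []
    for node in block.nodes:
        results = ", ".join(f"%{name}" for name in node.outputs)
        operands = ", ".join(f"%{name}" for name in node.inputs)
        prefix = f"{results} = " if results else ""
        # "aten::relu" -> "aten.relu": an illustrative renaming only.
        op_name = node.kind.replace("::", ".")
        head = f'{pad}{prefix}"{op_name}"({operands})'
        if not node.blocks:
            lines.append(head)
            continue
        # Nested blocks become regions attached to the op.
        lines.append(head + " ({")
        for index, inner in enumerate(node.blocks):
            if index > 0:
                lines.append(f"{pad}}}, {{")
            lines.extend(emit_block(inner, indent + 1))
        lines.append(f"{pad}}})")
    return lines


# A relu followed by a two-armed conditional, as a toy graph.
example = Block([
    Node("aten::relu", ["x"], ["0"]),
    Node("prim::If", ["cond"], ["1"], [
        Block([Node("aten::neg", ["0"], ["2"])]),
        Block([Node("aten::abs", ["0"], ["3"])]),
    ]),
])
print("\n".join(emit_block(example)))
```

A real converter would also have to carry types, attributes, and block arguments/returns, which this sketch leaves out.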

@powderluv

About the conversion, just FYI: here is a Python version of a TorchScript to Relay IR converter, tvm/python/tvm/relay/frontend/pytorch.py at main · apache/tvm · GitHub. It may be a good example to follow for generating MLIR with the MLIR Python bindings as a proof of concept.

stellaraccident · September 11, 2020, 7:19pm

Thanks @powderluv - I had indeed referred to the TVM/Relay lowering

My current thinking is that I am doing some python work to introspect the Torch API/op-set with the goal of:

Upgrading the ATen dialect code generation (currently a python script that scrapes some obsolete exports from the C API).
Generating a C++ data structure that can be used to systematically drive Torch IR->ATen dialect conversion for most of the ops.
I had toyed with providing a reference Python version of the converter, but will probably just jump to C++. I would like it to be mostly table driven, though.
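As a rough illustration of what "table driven" could look like, here is a small Python sketch (the real converter is planned in C++). Everything in it is an assumption for illustration: `OpSpec`, `OP_TABLE`, and `convert` are hypothetical names, the table is hand-written rather than generated from introspection, and the emitted strings are not real ATen dialect syntax.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OpSpec:
    """One row of the hypothetical conversion table."""
    target_op: str       # illustrative target op name
    num_operands: int    # expected operand count, used as a sanity check


# Hypothetical table. Under the plan above it would be generated by
# introspecting the Torch op set instead of being written by hand.
OP_TABLE: Dict[str, OpSpec] = {
    "aten::add": OpSpec("aten.add", 3),    # self, other, alpha
    "aten::relu": OpSpec("aten.relu", 1),  # self
    "aten::mm": OpSpec("aten.mm", 2),      # self, mat2
}

# (kind, operand names, result name) for each node, in program order.
ToyNode = Tuple[str, List[str], str]


def convert(nodes: List[ToyNode]) -> Tuple[List[str], List[str]]:
    """Convert nodes by table lookup; collect the kinds the table cannot handle."""
    converted: List[str] = []
    unsupported: List[str] = []
    for kind, operands, result in nodes:
        spec = OP_TABLE.get(kind)
        if spec is None or len(operands) != spec.num_operands:
            unsupported.append(kind)
            continue
        args = ", ".join(f"%{name}" for name in operands)
        converted.append(f'%{result} = "{spec.target_op}"({args})')
    return converted, unsupported


ops, missing = convert([
    ("aten::mm", ["a", "b"], "0"),
    ("aten::relu", ["0"], "1"),
    ("aten::frobnicate", ["1"], "2"),  # made-up op, not in the table
])
print(ops)
print(missing)
```

The appeal of this shape is that the per-op logic lives in data: adding coverage for another op means adding a row, and only ops with unusual semantics need hand-written conversion code.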

powderluv · September 11, 2020, 7:37pm

That sounds great. I don’t think we necessarily need anything in python and we will just use the C++ interface. Please let us know if we can help the process along. We can switch from the python TVM implementation to the C++ version you are using and get behind it once it is in a form for others to help.

stellaraccident · September 11, 2020, 8:06pm

Good concrete milestone/use case that is motivating for me to put some real time into this over the next ~days. Let me try to get it past the point where enough of it exists for others to contribute to and use.

_sean_silva · September 11, 2020, 9:18pm

Dumb question, but do the TorchScript folks already have some sort of declarative specification of their IR? If possible we should use that.

stellaraccident · September 11, 2020, 9:45pm

Not that I’ve found, but it could be hiding in a corner somewhere that I missed. Assuming no one else speaks up, I could use a second set of eyes looking for such a thing.

powderluv · September 11, 2020, 10:29pm

Posted where all the Torchscript eyes exist

stellaraccident · September 11, 2020, 10:45pm

Lol, I’m used to this question being such a quagmire of reverse-engineer-it-yourself on the TensorFlow side that it didn’t even occur to me to simply ask.

stellaraccident · September 12, 2020, 2:07am

From the response, sounds like not much in the form of op specs.

stellaraccident · September 24, 2020, 5:14am

I figured out how to get what I need: https://github.com/llvm/mlir-npcomp/pull/55

https://pastebin.pl/view/fb1f6b9f
