[RFC] New dialect for modelling asynchronous execution at a higher-level

jpienaar · July 5, 2020, 5:40pm

Depending on the platform targeted, that is unavoidable (e.g., HW that can’t communicate when a part of the program has completed, or where we can only flag N events and so can support only some number of partial results). It is not necessarily bad either, as performance is the goal, not concurrency; the best performance may have zero concurrency. What I like about the approach of splitting async.region is that one then has explicit control and can incur the overhead of tracking completions where needed, rather than accidentally or always.

async.regions express what can execute asynchronously from one another. Forming them seems simple, as it is a property of the program (coming from a higher-level data flow abstraction simplifies this too). Merging them is also simple in terms of how, given that the execution is already expressed, but it adds a constraint, so one has to consider when to do it. Splitting would also require some analysis (side-effect-free regions would be easy).
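To illustrate why merging is mechanically simple yet adds a constraint, here is a minimal Python sketch. It is not MLIR: regions are modeled as plain names and dependencies as (producer, consumer) pairs, and `merge_regions` is a hypothetical helper invented for this example.

```python
def merge_regions(deps, x, y, merged):
    """Merge regions x and y into a single region named `merged`.

    deps is a set of (producer, consumer) region-name pairs.
    Edges are renamed to point at the merged region, and any edge
    that ends up internal to it is dropped.
    """
    def rename(region):
        return merged if region in (x, y) else region

    return {
        (rename(p), rename(c))
        for p, c in deps
        if rename(p) != rename(c)
    }


# Before merging: c waits only on a, and d waits only on b.
deps = {("a", "c"), ("b", "d")}

# After merging a and b: both c and d wait on all of "ab".
merged_deps = merge_regions(deps, "a", "b", "ab")
print(sorted(merged_deps))
```

The rewrite itself is a renaming, but afterwards `c` cannot start until the work that used to be in `b` has finished as well, which is the added constraint. A real pass would also have to reject merges that create a cycle (for example when `a` feeds some region that in turn feeds `b`); that check is omitted here.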

How would that work if only some of the values produced are used? E.g., say we have an async.region that consumes results from multiple async.regions, but only some of their values. Then it would seem we lose use-def tracking here and have to jump through extra hoops to find out whether a produced value is actually used. Being explicit seems easier. Of course one could consider eliding it in the pretty-printed form …
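The use-def point can be made concrete with a small Python sketch. It is not MLIR: the `Region` class, its fields, and `unused_results` are hypothetical names used only to show that, when consumed values are listed explicitly, finding unused results is a direct lookup.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for an async.region (not an MLIR API)."""
    name: str
    results: list[str]                                   # values this region produces
    operands: list[str] = field(default_factory=list)    # values it explicitly consumes


def unused_results(regions):
    """Return produced values that no region lists as an explicit operand."""
    used = {v for r in regions for v in r.operands}
    return sorted(v for r in regions for v in r.results if v not in used)


a = Region("a", results=["a0", "a1"])
b = Region("b", results=["b0"])
# c consumes results from both a and b, but only some of a's values.
c = Region("c", results=["c0"], operands=["a0", "b0"])

print(unused_results([a, b, c]))  # ['a1', 'c0']
```

If `c` instead depended on "everything `a` produced" without naming values, the same question would require inspecting the body of `c` to see which values it actually touches.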

That sounds like a good idea.
