[RFC] New dialect for modelling asynchronous execution at a higher-level

Depending on the platform targeted, that is unavoidable (e.g., hardware that can’t communicate when a part of the program has completed, or where we can only flag N events and so can support only a limited number of partial results). It is also not necessarily bad, since performance is the goal, not concurrency: the best performance may come with zero concurrency. What I like about the approach with respect to splitting async.region is that one then has explicit control and can incur the overhead of tracking completions where needed, rather than accidentally or always.

async.regions express what can execute asynchronously with respect to one another. Forming them seems simple, as it is a property of the program (coming from a higher-level dataflow abstraction simplifies this too). Merging them is also simple, but it adds a constraint, so one has to consider when to do it; the how seems simple given the execution expressed. Splitting would also require some analysis (side-effect-free regions would be easy).
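To illustrate what I mean by merging adding a constraint, here is a toy Python model. It is not MLIR and not the proposed dialect: `Region`, `deps`, and `merge` are hypothetical names, and it assumes the caller passes the regions in a valid execution order and that merging does not create a cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Toy stand-in for an async region (hypothetical, not the dialect's API)."""
    name: str
    ops: tuple                      # operation names, in program order
    deps: frozenset = frozenset()   # names of regions that must complete first


def merge(a: Region, b: Region) -> Region:
    """Merge b into a, assuming a may run before b.

    The merged region waits on everything either region waited on,
    which is exactly the extra constraint merging introduces.
    """
    merged_deps = (a.deps | b.deps) - {a.name, b.name}
    return Region(f"{a.name}+{b.name}", a.ops + b.ops, frozenset(merged_deps))


# "load" had no dependencies; "compute" waited on some region "x".
load = Region("load", ("read_a",))
compute = Region("compute", ("mul",), frozenset({"x"}))

merged = merge(load, compute)
# After merging, the ops that came from "load" also wait on "x".
print(merged.name, merged.ops, sorted(merged.deps))
```

The mechanics are trivial, which is the point: the decision of when to merge is the hard part, because `read_a` could previously start without waiting on `x` and now cannot.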

How would that work if only some of the values produced are used? E.g., say we have an async.region that consumes results from multiple async.regions, but only some of their values. Then it would seem we lose use-def tracking here and have to jump through extra hoops to find out whether a produced value is actually used. Being explicit seems easier. Of course, one could consider eliding it in the pretty-printed form …
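A rough sketch of why explicit operands seem easier to me, again as a toy Python model rather than anything in MLIR (`AsyncRegion`, `results`, `operands`, and `unused_results` are all hypothetical names): when each consumer lists the individual values it uses, finding unused results is a plain use-def lookup.

```python
from dataclasses import dataclass, field


@dataclass
class AsyncRegion:
    """Toy stand-in for an async region with explicit results and operands."""
    name: str
    results: list = field(default_factory=list)    # value names produced
    operands: list = field(default_factory=list)   # (producer name, value name) pairs


def unused_results(regions):
    """Return the (producer, value) pairs that no region lists as an operand."""
    used = {operand for region in regions for operand in region.operands}
    return {
        (region.name, value)
        for region in regions
        for value in region.results
        if (region.name, value) not in used
    }


a = AsyncRegion("a", results=["x", "y"])
b = AsyncRegion("b", results=["z"])
# c consumes only some of the values produced by a and b.
c = AsyncRegion("c", operands=[("a", "x"), ("b", "z")])

unused = unused_results([a, b, c])
print(unused)  # {('a', 'y')}
```

If `c` instead depended implicitly on "everything `a` and `b` produce", this query would need to inspect the body of `c` to discover that `y` is never read.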

That sounds like a good idea.