Proposal
I propose adding an Async/Await dialect to MLIR to support the task-based asynchronous programming model. The dialect is inspired by the async/await programming model in C# and Scala. Its goal is to express asynchronous, multi-threaded, or concurrent programs in a sequential style, and to use LLVM compiler passes to convert asynchronous functions (functions that contain async operations) into LLVM coroutines.
Example
func @computation_1() { … }
func @computation_2() { … }
func @computation_3() { … }
func @async_compute(%arg0: !async.runtime) -> !async.handle {
%hdl1 = async.call %arg0 : !async.runtime
@computation_1() : () -> !async.handle
%hdl2 = async.call %arg0 : !async.runtime
@computation_2() : () -> !async.handle
async.await %hdl1 : !async.handle
async.await %hdl2 : !async.handle
call @computation_3() : () -> ()
%hdl3 = async.ret_handle : !async.handle
return %hdl3 : !async.handle
}
In this example @async_compute starts two asynchronous compute tasks, #1 and #2, and awaits their completion. When both tasks have completed, it runs a third compute task.
Although this function looks like a simple sequential function, it is converted into an asynchronous state machine. The caller immediately gets a handle to the asynchronous computation, which becomes ready when all compute tasks are completed.
In the example above, compute functions #1 and #2 run on separate threads, and compute function #3 runs on the thread that completed the last pending task.
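To make the "thread that completed the last pending task" behaviour concrete, here is a minimal C++ sketch. It is not what the dialect would generate: `AsyncComputeState`, `run_task` and `run_async_compute_demo` are hypothetical names, and the two sequential awaits are simplified into a single countdown. It only illustrates that no thread blocks while waiting, and that whichever thread finishes last continues with computation #3.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the state machine behind @async_compute.
struct AsyncComputeState {
  std::atomic<int> pending{2};     // tasks #1 and #2 are still running
  std::atomic<int> tasks_done{0};  // bookkeeping for this demo only
  int seen_by_task3 = -1;          // tasks_done as observed by computation #3
  std::atomic<bool> ready{false};  // plays the role of the returned handle

  void computation_3() { seen_by_task3 = tasks_done.load(); }

  // Body of tasks #1 and #2, followed by the "await" bookkeeping.
  void run_task() {
    tasks_done.fetch_add(1);  // the task's own work
    if (pending.fetch_sub(1) == 1) {
      // This thread completed the last pending task, so it continues
      // with computation #3 and then marks the result ready.
      computation_3();
      ready.store(true);
    }
  }
};

// Launches both tasks on separate threads and reports whether
// computation #3 ran after both of them had finished.
bool run_async_compute_demo() {
  AsyncComputeState state;
  std::thread t1([&state] { state.run_task(); });
  std::thread t2([&state] { state.run_task(); });
  t1.join();
  t2.join();
  return state.ready.load() && state.seen_by_task3 == 2;
}
```

Calling `run_async_compute_demo()` should return true regardless of which of the two threads finishes last.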
Async Dialect
Async Runtime
The async runtime hides all the details of launching concurrent/async tasks, managing worker threads, etc. There can be multiple implementations that rely on different concurrency primitives (e.g. fixed-size vs. dynamically sized thread pools). An async function receives the runtime as an argument:
func @async_func(%runtime : !async.runtime) {
}
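As an illustration of why the runtime is kept opaque, the following C++ sketch puts two interchangeable implementations behind one interface. Everything here is an assumption made for illustration: the `Runtime` interface, `ThreadPerTaskRuntime`, `FixedPoolRuntime` and `run_tasks` are hypothetical names, not part of the proposal. The sketch also assumes the pool has at least one worker and that `launch` is called from a single thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical interface; the proposal only says the runtime is opaque.
struct Runtime {
  virtual ~Runtime() = default;
  virtual void launch(Task task) = 0;  // start a task asynchronously
  virtual void shutdown() = 0;         // finish all launched tasks
};

// Dynamically sized: one new thread per task.
class ThreadPerTaskRuntime final : public Runtime {
 public:
  ~ThreadPerTaskRuntime() override { shutdown(); }
  void launch(Task task) override { threads_.emplace_back(std::move(task)); }
  void shutdown() override {
    for (std::thread &t : threads_)
      if (t.joinable()) t.join();
  }

 private:
  std::vector<std::thread> threads_;
};

// Fixed size: a set of workers pulling tasks from a shared queue.
class FixedPoolRuntime final : public Runtime {
 public:
  explicit FixedPoolRuntime(unsigned workers) {
    for (unsigned i = 0; i < workers; ++i)
      workers_.emplace_back([this] { work(); });
  }
  ~FixedPoolRuntime() override { shutdown(); }
  void launch(Task task) override {
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push_back(std::move(task));
    }
    cv_.notify_one();
  }
  void shutdown() override {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    for (std::thread &t : workers_)
      if (t.joinable()) t.join();
  }

 private:
  void work() {
    for (;;) {
      Task task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
        if (queue_.empty()) return;  // stop requested and queue drained
        task = std::move(queue_.front());
        queue_.pop_front();
      }
      task();
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<Task> queue_;
  bool stop_ = false;
  std::vector<std::thread> workers_;
};

// Caller code is identical for both runtimes.
int run_tasks(Runtime &runtime, int count) {
  std::atomic<int> done{0};
  for (int i = 0; i < count; ++i)
    runtime.launch([&done] { done.fetch_add(1); });
  runtime.shutdown();
  return done.load();
}
```

`run_tasks` never learns which implementation it was given, which is the property the `!async.runtime` argument is meant to provide.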
Async Handle
An async handle is a handle to an asynchronously running task; it becomes ready when the task is completed. Inside an async function it is possible to await the completion of an asynchronous task using the async.await operation.
Async Call
Async call launches a function on a separate thread and returns a handle that signals its completion.
%handle = async.call %runtime : !async.runtime
@callee() : () -> !async.handle
Async Runtime API
Async operations are lowered to async runtime API calls. All runtime types are passed as opaque pointers, and runtime implementations are free to choose their concurrency primitives.
typedef struct MLIR_AsyncRuntime MLIR_AsyncRuntime;
typedef struct MLIR_AsyncHandle MLIR_AsyncHandle;
using TaskFunction = void (*)(); // asynchronous task function
using CoroHandle = void *; // coroutine handle
using CoroResume = void (*)(void *); // coroutine resume function
// Get the default runtime instance.
extern "C" MLIR_AsyncRuntime *MLIR_AsyncRT_DefaultRuntime();
// Create an asynchronous task and return a handle.
extern "C" MLIR_AsyncHandle *MLIR_AsyncRT_Call
(MLIR_AsyncRuntime *, TaskFunction);
// Asynchronously wait for the handle, then resume the suspended coroutine.
extern "C" void MLIR_AsyncRT_Await
(MLIR_AsyncHandle *, CoroHandle, CoroResume);
// Wait for the async handle, blocking the caller thread.
extern "C" void MLIR_AsyncRT_SyncAwait(MLIR_AsyncHandle *);
// Create a handle in the not-ready state.
extern "C" MLIR_AsyncHandle *MLIR_AsyncRT_CreateHandle(MLIR_AsyncRuntime *);
// Mark handle ready.
extern "C" void MLIR_AsyncRT_EmplaceHandle(MLIR_AsyncHandle *);
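To show that the API above can be implemented with ordinary primitives, here is a minimal thread-per-task sketch in C++. The function signatures are the ones declared above. The struct layouts, the leaked handles, and the `demo_task` helper are assumptions of this sketch, not part of the proposal, and a production runtime would likely use a thread pool and reference-counted handles instead.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

using TaskFunction = void (*)();
using CoroHandle = void *;
using CoroResume = void (*)(void *);

// Sketch-only definitions of the opaque types.
struct MLIR_AsyncHandle {
  std::mutex mu;
  std::condition_variable cv;
  bool ready = false;
  std::vector<std::function<void()>> waiters;  // suspended coroutines
};

struct MLIR_AsyncRuntime {
  std::mutex mu;
  std::vector<std::thread> threads;
  ~MLIR_AsyncRuntime() {  // join all task threads at process exit
    for (std::thread &t : threads) t.join();
  }
};

extern "C" MLIR_AsyncRuntime *MLIR_AsyncRT_DefaultRuntime() {
  static MLIR_AsyncRuntime runtime;
  return &runtime;
}

extern "C" MLIR_AsyncHandle *MLIR_AsyncRT_CreateHandle(MLIR_AsyncRuntime *) {
  return new MLIR_AsyncHandle;  // never freed in this sketch
}

extern "C" void MLIR_AsyncRT_EmplaceHandle(MLIR_AsyncHandle *handle) {
  std::vector<std::function<void()>> waiters;
  {
    std::lock_guard<std::mutex> lock(handle->mu);
    handle->ready = true;
    waiters.swap(handle->waiters);
  }
  handle->cv.notify_all();                // wake blocking waiters
  for (auto &resume : waiters) resume();  // resume on the completing thread
}

extern "C" MLIR_AsyncHandle *MLIR_AsyncRT_Call(MLIR_AsyncRuntime *runtime,
                                               TaskFunction fn) {
  MLIR_AsyncHandle *handle = MLIR_AsyncRT_CreateHandle(runtime);
  std::lock_guard<std::mutex> lock(runtime->mu);
  runtime->threads.emplace_back([fn, handle] {
    fn();
    MLIR_AsyncRT_EmplaceHandle(handle);
  });
  return handle;
}

extern "C" void MLIR_AsyncRT_Await(MLIR_AsyncHandle *handle, CoroHandle coro,
                                   CoroResume resume) {
  {
    std::lock_guard<std::mutex> lock(handle->mu);
    if (!handle->ready) {
      handle->waiters.push_back([coro, resume] { resume(coro); });
      return;
    }
  }
  resume(coro);  // handle was already ready: resume immediately
}

extern "C" void MLIR_AsyncRT_SyncAwait(MLIR_AsyncHandle *handle) {
  std::unique_lock<std::mutex> lock(handle->mu);
  handle->cv.wait(lock, [handle] { return handle->ready; });
}

// Demo-only task (hypothetical helper, not part of the API).
std::atomic<int> demo_task_runs{0};
void demo_task() { demo_task_runs.fetch_add(1); }
```

A caller outside of any coroutine would obtain the runtime with `MLIR_AsyncRT_DefaultRuntime()`, launch `demo_task` with `MLIR_AsyncRT_Call`, and block on the result with `MLIR_AsyncRT_SyncAwait`. In this sketch, the continuation passed to `MLIR_AsyncRT_Await` runs on the thread that marks the handle ready, which matches the behaviour described in the example section.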
Conversion to LLVM
To convert async functions into asynchronous state machines we will use the LLVM coroutine passes.
This function:
func @async_await(%arg0: !async.runtime) -> !async.handle {
%handle = async.call %arg0 : !async.runtime
@callee() : () -> !async.handle
async.await %handle : !async.handle
%ret_handle = async.ret_handle : !async.handle
return %ret_handle : !async.handle
}
Will be converted to roughly this LLVM IR with coro intrinsics:
func @async_await(%arg0: !llvm<"i8*">) -> !llvm<"i8*"> {
// Initialize coroutine id and frame.
%id = llvm.call @llvm.coro.id(...)
%size = llvm.call @llvm.coro.size.i64()
%alloc = llvm.call @malloc(%size)
%hdl = llvm.call @llvm.coro.begin(%id, %alloc)
// Prepare return handle and call function asynchronously.
%ret_hdl = llvm.call @MLIR_AsyncRT_CreateHandle(%arg0)
// %callee is the address of @callee (its materialization is elided here).
%call_hdl = llvm.call @MLIR_AsyncRT_Call(%arg0, %callee)
// Save coroutine state.
%coro_state = llvm.call @llvm.coro.save(%hdl)
// Pass the coroutine handle (and its resume function, omitted here) to the
// async runtime and suspend the coroutine.
// When `%call_hdl` becomes ready, the runtime resumes coroutine `%hdl`.
llvm.call @MLIR_AsyncRT_Await(%call_hdl, %hdl)
%suspend = llvm.call @llvm.coro.suspend(%coro_state)
switch %suspend ^suspend, ^resume, ^cleanup
// Resume coroutine after async call completed.
^resume:
// Emplace ret handle.
llvm.call @MLIR_AsyncRT_EmplaceHandle(%ret_hdl)
br ^cleanup
// Cleanup coroutine state.
^cleanup: // reached from ^resume and from the switch above
%mem = llvm.call @llvm.coro.free(%hdl)
llvm.call @free(%mem)
br ^suspend
// Return a not-ready handle from the coroutine ramp function.
^suspend:
%19 = llvm.call @llvm.coro.end(%hdl)
return %ret_hdl : !llvm<"i8*">
}
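The split that the coroutine passes perform on @async_await can be sketched by hand in C++ as a frame, a ramp function, and a resume function. This is only an approximation under stated assumptions: `Handle`, `Frame`, `async_await_ramp` and `async_await_resume` are hypothetical names, the handle is single-threaded, and the awaited handle is passed in as a parameter instead of being produced by an async call.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded stand-in for !async.handle.
struct Handle {
  bool ready = false;
  std::vector<std::function<void()>> waiters;

  // Counterpart of MLIR_AsyncRT_Await: run now if ready, else park.
  void await(std::function<void()> continuation) {
    if (ready)
      continuation();
    else
      waiters.push_back(std::move(continuation));
  }

  // Counterpart of MLIR_AsyncRT_EmplaceHandle: mark ready, resume waiters.
  void emplace() {
    ready = true;
    std::vector<std::function<void()>> pending;
    pending.swap(waiters);
    for (auto &resume : pending) resume();
  }
};

// Coroutine frame: the values that must survive the suspension point.
struct Frame {
  Handle *ret_hdl;
};

// Resume part: corresponds to the ^resume and ^cleanup blocks.
void async_await_resume(Frame *frame) {
  frame->ret_hdl->emplace();  // emplace the return handle
  delete frame;               // free the coroutine frame
}

// Ramp part: everything up to the first suspension point.
Handle *async_await_ramp(Handle *call_hdl) {
  Handle *ret_hdl = new Handle;         // create the not-ready return handle
  Frame *frame = new Frame{ret_hdl};    // allocate the coroutine frame
  call_hdl->await([frame] { async_await_resume(frame); });
  return ret_hdl;  // returned to the caller while the coroutine is suspended
}
```

Calling `async_await_ramp` with a not-ready handle returns a not-ready handle immediately. Once the awaited handle is emplaced, the resume function runs and the returned handle becomes ready, which mirrors the control flow of the IR above.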
Async vs cppcoro::task
C++20 added coroutine support, and one example of async tasks implemented on top of coroutines is cppcoro.
The main difference between async.handle and cppcoro::task is that an async handle refers to a running computation and will become ready eventually, while a cppcoro::task is a handle (a coroutine handle) to a suspended computation (a suspended coroutine), and the caller must explicitly resume it to get the result.
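The eager versus lazy distinction can be illustrated with the standard library alone. The analogy is loose and is an assumption of this sketch, not something cppcoro or the proposal states: `std::launch::async` stands in for a handle to a running computation, and `std::launch::deferred` stands in for a suspended computation that only runs when the caller asks for the result. The helper names `start_eager`, `make_lazy` and `is_deferred` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>

// Eager: the work is launched before the caller even receives the future,
// loosely like the async.handle described above.
std::future<int> start_eager(std::atomic<int> &runs) {
  return std::async(std::launch::async, [&runs] {
    runs.fetch_add(1);
    return 42;
  });
}

// Lazy: nothing runs until the caller requests the result,
// loosely like a suspended task that must be resumed.
std::future<int> make_lazy(std::atomic<int> &runs) {
  return std::async(std::launch::deferred, [&runs] {
    runs.fetch_add(1);
    return 42;
  });
}

// True if the future holds work that has not been started yet.
bool is_deferred(const std::future<int> &future) {
  return future.wait_for(std::chrono::seconds(0)) ==
         std::future_status::deferred;
}
```

With the lazy variant, the run counter stays at zero until `get()` is called on the future. With the eager variant, the work proceeds whether or not anyone ever asks for the result.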