I’m a developer on the Apache Arrow project: we provide a
language-independent columnar memory format.
Recently we’ve started discussion and work around providing a
consistent format for representing relational algebraic operations
against that data 1.
My own intuition is that we could learn a great deal from the MLIR
project since the space targeted by this project has comparable
complexity and heterogeneity, and specifically that we should
probably emulate the less constrained, lowerable/optimizable dialect
structure of MLIR. Most of the other voices in the conversation feel
this is counter to the problem statement and would prefer to declare
representation of lowered operations as out-of-scope.
I am far from an expert on either MLIR or relational algebra, so I’m
writing in hopes that someone here is interested enough to join the
conversation. Regardless of the analogy I’m interested in promoting,
I’m sure you could provide useful notes on our design.
The summary of my proposal (which is out of sync with the rest of the code in
that PR) is here: