I don’t think you’re going to be able to come up with a portable concept of “ABI type” that is both (1) significantly less complicated than the Clang AST and (2) captures everything from C that any possible platform might be interested in. Platform ABIs can and often do vary according to all sorts of minor details of what’s been written in the source. Please look at the actual record layout code in Clang for an idea of what has to be supported here. I am very concerned about this approach, because I don’t think there’s a viable path for Clang to adopt it, and I’m afraid that that means it is doomed to never be more than a buggy re-implementation.
I think this is a correctable problem. You’re imagining a single, portable library that takes in high-level information and a target specification and spits out ABI details. I would suggest instead breaking it down more like this:
First, the low-level lowering decisions are made by target-specific libraries. These libraries consume low-level information — basically, this is implementing the algorithms described in the target psABI, strictly in terms of the cases that the psABI distinguishes. For example:
- If the psABI says that e.g. `_Complex T` is treated exactly like `struct { T x, y; }`, then the API shouldn’t have a case for `_Complex T`. But if the psABI says that `_Complex T` is treated specially, it needs to be a case you can represent.
- The argument layout code shouldn’t get passed a high-level struct type; it should get whatever details of the aggregate layout that argument lowering cares about. If aggregates are always passed on the stack, this is probably just the size and alignment. If they can be broken up into registers, you might also need to take the result of the aggregate classification algorithm (which would be available as a separate function in the library).
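As a rough illustration of the second bullet, a target-specific library for an invented target might look something like the sketch below. Everything in it is made up for the example: the `toy_abi` namespace, the type and function names, and the "16 bytes or less goes in integer registers" rule are assumptions, not taken from any real psABI or from Clang. The point is only that classification is its own function and that nothing here ever sees a high-level struct type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical library for an invented target; no real psABI is modelled.
namespace toy_abi {

// The only facts about an aggregate that this toy target's lowering needs.
struct AggregateInfo {
  std::uint64_t size;   // in bytes
  std::uint64_t align;  // in bytes, must be a power of two
};

// The cases this toy psABI distinguishes for aggregates, and nothing more.
// There is deliberately no "complex" case: on this invented target a
// _Complex T is assumed to be handled exactly like a two-field struct.
enum class AggregateClass { Memory, IntegerRegs };

// Aggregate classification, exposed as a separate function.
// Toy rule: at most 16 bytes and at most 8-byte aligned goes in registers.
inline AggregateClass classifyAggregate(const AggregateInfo &a) {
  assert(a.align != 0 && (a.align & (a.align - 1)) == 0);
  return (a.size <= 16 && a.align <= 8) ? AggregateClass::IntegerRegs
                                        : AggregateClass::Memory;
}

// Argument lowering consumes the classification result, not a struct type.
// Returns how many 8-byte integer registers the aggregate occupies.
inline unsigned integerRegsFor(const AggregateInfo &a, AggregateClass c) {
  if (c == AggregateClass::Memory)
    return 0;
  return static_cast<unsigned>((a.size + 7) / 8);
}

} // namespace toy_abi
```

A frontend would compute `AggregateInfo` however it likes, call `classifyAggregate`, and hand the result to the lowering code.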
Don’t be afraid of writing these libraries in creative ways that only work because of the details of the target. Like, in the abstract, argument layout needs to get passed the complete argument list ahead of time because it might pass the `float` in `void (float, double)` differently from the `float` in `void (float, int)`. In practice, it’s an online algorithm on every single target I know of: you consider the return type, then each argument in order. And that means you can just have the argument layout algorithm be a class type that you call methods on to add specific kinds of argument. And that might make a lot of things easier and more performant around things like aggregate layout.
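To show what I mean by a class you call methods on, here is a minimal sketch for an invented target. The class name, the method names, the register counts (two integer and two floating-point argument registers), and the hidden-pointer return rule are all assumptions chosen to keep the example short; no real target is being described.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical online argument layout for an invented target.
namespace toy_abi {

struct ArgLocation {
  enum Kind { IntReg, FloatReg, Stack } kind;
  std::uint64_t where;  // register number, or byte offset for Stack
};

class ArgumentLayout {
public:
  // Toy rule: an indirectly returned value uses up the first integer
  // register for a hidden pointer. Must be called before any argument.
  void setIndirectReturn() {
    assert(next_int_ == 0 && next_float_ == 0 && stack_bytes_ == 0);
    ++next_int_;
  }

  ArgLocation addInteger() {
    if (next_int_ < kIntRegs)
      return {ArgLocation::IntReg, next_int_++};
    return addStack(8, 8);
  }

  ArgLocation addFloat() {
    if (next_float_ < kFloatRegs)
      return {ArgLocation::FloatReg, next_float_++};
    return addStack(8, 8);
  }

  // Aggregates classified as "memory" only need size and alignment.
  ArgLocation addMemoryAggregate(std::uint64_t size, std::uint64_t align) {
    return addStack(size, align);
  }

  std::uint64_t stackBytes() const { return stack_bytes_; }

private:
  ArgLocation addStack(std::uint64_t size, std::uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    stack_bytes_ = (stack_bytes_ + align - 1) & ~(align - 1);  // round up
    ArgLocation loc{ArgLocation::Stack, stack_bytes_};
    stack_bytes_ += size;
    return loc;
  }

  static constexpr std::uint64_t kIntRegs = 2;
  static constexpr std::uint64_t kFloatRegs = 2;
  std::uint64_t next_int_ = 0;
  std::uint64_t next_float_ = 0;
  std::uint64_t stack_bytes_ = 0;
};

} // namespace toy_abi
```

The caller never builds a complete argument list up front. It deals with the return type, then calls one `add...` method per argument in order and uses each returned location immediately.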
Once those target-specific libraries are written, you can build portable libraries on top of them. Each library would consume a specific kind of high-level input; for example:
- You could have a portable library that expects Clang ASTs.
- You could also have a portable library that expects some intentionally-simplified type system. Since you’re not trying to handle all of C in this, you can just leave out difficult cases, like bit-fields.
The latter would be enough to get simple cases working, which is probably enough for Rust and other frontends. Someone trying to match the C ABI for a really complex C use case should probably just be using Clang as a library, though.
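For a sense of how small the intentionally-simplified type system could be, here is a sketch. The `simple_types` namespace and every name in it are hypothetical. The layout rule is also an assumption: each field is placed at the next offset that satisfies its own alignment, and the total size is rounded up to the largest field alignment. A real target-specific library would supply the actual rule. Bit-fields, packing attributes, and unions are simply not representable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical simplified type system; difficult C features are left out.
namespace simple_types {

struct Layout {
  std::uint64_t size;   // in bytes
  std::uint64_t align;  // in bytes, power of two
};

struct Type {
  enum Kind { Scalar, Struct } kind;
  Layout scalar;             // meaningful only when kind == Scalar
  std::vector<Type> fields;  // meaningful only when kind == Struct
};

inline Type scalar(std::uint64_t size, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return Type{Type::Scalar, Layout{size, align}, {}};
}

inline Type record(std::vector<Type> fields) {
  return Type{Type::Struct, Layout{0, 1}, std::move(fields)};
}

inline std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Assumed rule: fields in order at naturally aligned offsets, then tail
// padding up to the struct's own alignment.
inline Layout layoutOf(const Type &t) {
  if (t.kind == Type::Scalar)
    return t.scalar;
  Layout out{0, 1};
  for (const Type &field : t.fields) {
    Layout fl = layoutOf(field);
    out.size = alignTo(out.size, fl.align) + fl.size;
    out.align = std::max(out.align, fl.align);
  }
  out.size = alignTo(out.size, out.align);
  return out;
}

} // namespace simple_types
```

The resulting size and alignment are the kind of low-level facts that would then be handed to the target-specific library, in place of the type itself.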