There are some type casts that should be replaced with intrinsics (this is Hexagon-specific). Those only occur in calls to Hexagon builtins, and right now this replacement happens as a part of clang's codegen.
However, this replacement is only triggered by types in the AST and the corresponding types in the LLVM IR, and does not depend on the details (semantics) of any particular builtin, so I'd like to make it orthogonal to the CGBuiltin code that deals with specific intrinsics. I'd like the replacement to be an AST->AST transformation, so the result can be easily composed with any code that does AST->LLVM IR.
Is there a precedent for something like this? I could do this right at the beginning of EmitHexagonBuiltinExpr, but I was wondering if there was a better way.
I'm not sure I follow what you mean. Do you mean that given, for example "float x; __builtin_foo(-x);", you call "llvm.foo(-x)", and given "double x; __builtin_foo((float)-x)", you call "llvm.bar(-x)"? I don't think there's any precedent for that, no; normally that's the sort of optimization you'd handle in instcombine.
Assume for brevity that
vector_int = __attribute__((vector_size(32 * sizeof(int)))) int
vector_bool = __attribute__((vector_size(128 * sizeof(_Bool)))) _Bool
Consider this AST. It contains bitcasts from vector_bool to vector_int and the other way around.
`-ImplicitCastExpr 0x807c212b0 <col:10, col:51> 'HVX_Vector':'vector_int'
  `-CallExpr 0x807c21220 <col:10, col:51> 'vector_bool'
    |-ImplicitCastExpr 0x807c21208 col:10 'vector_bool (*)(vector_bool, vector_bool)'
    | `-DeclRefExpr 0x807c21180 col:10 '' Function 0x807c21000 '__builtin_HEXAGON_V6_pred_and_128B' 'vector_bool (vector_bool, vector_bool)'
    |-ImplicitCastExpr 0x807c21268 col:45 'vector_bool'
    | `-ImplicitCastExpr 0x807c21250 col:45 'HVX_Vector':'vector_int'
    |   `-DeclRefExpr 0x807c211a0 col:45 'HVX_Vector':'vector_int' lvalue ParmVar 0x807a43d10 'a0' 'HVX_Vector':'vector_int'
    `-ImplicitCastExpr 0x807c21298 col:49 'vector_bool'
      `-ImplicitCastExpr 0x807c21280 col:49 'HVX_Vector':'vector_int'
        `-DeclRefExpr 0x807c211c0 col:49 'HVX_Vector':'vector_int' lvalue ParmVar 0x807a43d88 'a1' 'HVX_Vector':'vector_int'
- These casts are not no-ops: there are instructions that perform them (with corresponding builtins/intrinsics). In the actual hardware, the registers holding vector_int are 1024 bits long, while the registers holding vector_bool are 128 bits long.
- The “bool” vectors are not storable directly (i.e. no way to store the 128 bits to memory). Instead, they have to be “expanded” into vector_int (or contracted from vector_int). In other words, vector_int is the universal storage format for both types.
Long story short, in C code we want to make these two types look (more or less) the same (i.e. both are 128 bytes), but in LLVM IR they are different (since <32 x i32> is not bitcastable to <128 x i1>). Now, I want to replace these bitcasts with calls to proper builtins, i.e. transform clang’s Expr to another Expr. Right now we deal with it in CGBuiltin as a part of codegen, but it’s not elegant (not composable with other per-builtin handling that may need to happen in CGBuiltin).
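To make the size mismatch concrete, here is a rough software model of the "expand" and "contract" conversions. It is only a sketch under assumptions: the names (VectorBytes, Predicate, contract, expand) are made up for illustration and are not Hexagon or Clang APIs, and the per-lane rule (non-zero byte <-> set bit) is a simplification of what the real instructions do.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model, not an actual Hexagon or Clang interface.
constexpr std::size_t kLanes = 128;  // one predicate bit per byte lane

// Storage form ("vector_int"): 128 bytes = 1024 bits, viewed here as bytes.
using VectorBytes = std::array<std::uint8_t, kLanes>;
// Register form ("vector_bool"): 128 bits.
using Predicate = std::bitset<kLanes>;

static_assert(kLanes * 8 == 1024, "storage form is 1024 bits wide");

// Contract 1024 bits down to 128 bits.
// Assumed rule: a lane's bit is set when its byte is non-zero.
Predicate contract(const VectorBytes &v) {
  Predicate q;
  for (std::size_t i = 0; i < kLanes; ++i)
    q[i] = (v[i] != 0);
  return q;
}

// Expand 128 bits up to 1024 bits.
// Assumed rule: a set bit becomes an all-ones byte, a clear bit a zero byte.
VectorBytes expand(const Predicate &q) {
  VectorBytes v{};
  for (std::size_t i = 0; i < kLanes; ++i)
    v[i] = q[i] ? 0xFF : 0x00;
  return v;
}

// Contracting an expanded predicate gives the predicate back; the reverse
// direction is lossy, which is why neither conversion can be a plain bitcast.
bool roundTrips(const Predicate &q) { return contract(expand(q)) == q; }
```

In this model the two forms carry different amounts of data, so each conversion has to do real work per lane, which is the reason the casts in the AST above cannot be lowered as LLVM bitcasts.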
If I’m understanding correctly, in the C AST, the builtin takes an operand of type “vector_bool”, which is represented in memory like a “<128 x i8>”. But the corresponding LLVM builtin takes a “<128 x i1>”. So you need to emit a “trunc” or something like that to do the conversion.
From your description, it sounds like you can declare variables of type vector_bool. (If not, I’m not sure why you’re declaring your intrinsics to have arguments of type vector_bool.) Given that, the bitcasts seem like a distraction, so I’ll pretend they’re not there, and the question is just whether we should emit an AST node to represent the conversion from a “memory” vector_bool to a “register” vector_bool.
I think this is not something we should be exposing in the AST. There is no type conversion at the source level. The fact that we have to expand the intrinsic call to a sequence of multiple LLVM instructions is a detail of the LLVM backend, not a property of the source language. The source language is intentionally hiding the fact that a conversion is necessary to invoke the instruction in question. The AST should reflect that: it should not include a conversion operator that’s meaningless at the source level.
Note that clang CodeGen does distinguish between “memory” and “register” types for scalar booleans; see CodeGenTypes::ConvertType vs. CodeGenTypes::ConvertTypeForMem. Maybe that would be more helpful for solving your issue?
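As a rough illustration of that distinction: a scalar bool is an i1 as an LLVM value but occupies an i8 in memory, so a widening or narrowing step sits at every store and load. The sketch below models just that idea in plain C++. The names RegBool, MemBool, toMemory and fromMemory are hypothetical stand-ins, not clang functions, and the exact IR clang emits may differ between versions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-ins only; these are not clang types.
using RegBool = bool;          // plays the role of the i1 "register" form
using MemBool = std::uint8_t;  // plays the role of the i8 "memory" form

// Register -> memory: widen the single bit to a byte (like a zext i1 -> i8).
constexpr MemBool toMemory(RegBool b) { return b ? 1 : 0; }

// Memory -> register: keep only the low bit (like a trunc i8 -> i1).
constexpr RegBool fromMemory(MemBool m) { return (m & 1) != 0; }

// The source-level type is "bool" on both sides; the conversion exists only
// in the lowering, never as a cast in the AST.
static_assert(fromMemory(toMemory(true)), "true survives a store/load");
static_assert(!fromMemory(toMemory(false)), "false survives a store/load");
```

The analogy for vector_bool would be to keep one source-level type and let the type lowering pick a "memory" form and a "register" form, with the expand/contract step emitted at loads and stores rather than represented as an AST node.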