Hi Dave,
I highly appreciate your idea of integrating heterogeneous computing features directly into LLVM-IR. I believe this can be a direction worth pursuing, but I doubt now is the right moment for it. I don't share your opinion that it is easy to move LLVM-IR in this direction; rather, I believe this is an engineering project that will take several months of full-time work. The implementation alone may not take that long, but designing it, discussing it, implementing it, and ensuring that the new feature does not increase the run time and memory footprint or reduce the maintainability of LLVM will. Given the large number of changes that would be needed all over LLVM, I really think we should first get some experience in this area before we bake this feature into LLVM-IR.
The llvm.codegen intrinsic seems a perfect match for building up such experience. It requires no changes to LLVM-IR itself and only very local changes to the generic back end infrastructure. It may not be as generic as other solutions, but it is far from being an ugly hack. Quite the contrary: it is a close match for OpenCL-like runtimes and works well with the existing PTX back end.
Do you have definite plans to add heterogeneous computing capabilities to LLVM-IR within the next 3-4 months? Will these capabilities supersede the llvm.codegen intrinsic?
If no such plans exist, what do you think about adding the llvm.codegen() intrinsic for now? If mid-term plans exist for heterogeneous extensions to LLVM-IR, we can document them alongside the intrinsic.
Cheers
Tobi