Ok, I am developing an intrinsic instruction and I have the codegen
working (and tested). However, some of the more complex cases of the
intrinsic are reducible to LLVM + simpler cases of the intrinsic. How
would I go about conditionally reducing the intrinsic? I could deal
with the issue in the codegen, but that gets ugly quickly.
Andrew
I suppose you could do an LLVM->LLVM lowering pass which reduces the
complex cases to the simple cases. This would then allow you to use
the redundancy-elimination passes like LICM to clean up the resulting
code.
-Brian
Well, the complexity only occurs on x86; other archs are simpler. Since
this is not used much outside the C library, I can work around it in the
library and be satisfied with the simple case.
Oh, I suppose I should mention what I was working on. I made a syscall
intrinsic with codegen for linux/x86. It seemed a missing piece in
having a pure LLVM-compiled userland (mostly, being able to have a full
bytecode glibc).
On x86, syscalls with 5 or fewer args are passed in registers, but for
6, all args (except the syscall number) go in a memory block passed as
the first argument. I wasn't sure whether, at the point where I insert
instructions during code generation, I could safely manipulate the stack
pointer (that is, whether it was guaranteed to be correct). If so, then I
can add the >6 case directly to the codegen.
Anyway, attached is some sample assembly output from a simple test
(write). There are some corners to clean up (the codegen only accepts
ints for args to a syscall, which requires casting).
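For comparison, this is roughly what a write test looks like in C when it goes through libc's generic `syscall()` wrapper on Linux, with every argument cast to an integer in the same way the intrinsic currently requires. It is only a sketch: the helper name `write_via_syscall` is made up, and it is not the source of the attached file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue write(2) through syscall(), passing the
 * file descriptor, buffer address, and length all as plain integers. */
long write_via_syscall(int fd, const char *msg)
{
    assert(msg != NULL);
    long buf = (long)msg;          /* pointer cast to an integer argument */
    long len = (long)strlen(msg);  /* size cast to an integer argument */
    return syscall(SYS_write, (long)fd, buf, len);
}
```

Calling `write_via_syscall(1, "hello\n")` writes the string to standard output and returns the number of bytes written, or -1 on error.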
Andrew
a.out.ll (1.83 KB)
Hi,
> Oh, I suppose I should mention what I was working on. I made a syscall
> intrinsic with codegen for linux/x86. It seemed a missing piece in
> having a pure LLVM-compiled userland (mostly, being able to have a full
> bytecode glibc).
This sounds like a good and useful thing. Are you coordinating
your work with John Criswell?
> Well, the complexity only occurs on x86; other archs are simpler. Since
> this is not used much outside the C library, I can work around it in the
> library and be satisfied with the simple case.
Architecture-specific calling conventions are typically dealt with
in the various target support libraries (lib/Target/*).
I think what you should do for the X86 codegen is to use the
MachineFrameInfo to build your memory block and add the >6 case
"directly to the codegen", as you say. I recommend against inserting
MachineInstrs that explicitly manipulate the stack pointer; you are
likely to fall afoul of the PrologEpilogInserter and register
allocator.
-Brian
A good way to figure out how to do this is to look at the code that is
generated for the alloca instruction in the X86 backend.
-Chris