Reid Spencer wrote:
In order to get to the next stage with LLVM (like compiling a kernel) we
need to allow "pass through" of inline assembly so things like device
drivers, interrupt vectors, etc. can be written. While this feature
breaks the "pure" LLVM IR, I don't see any way around it.
Actually, there should be a way around it. I'm currently working on extensions to LLVM for operating system support. You wouldn't be able to take the stock i386 Linux kernel and compile it, but you could write an operating system that would be completely compilable by LLVM (once I finish, that is).
At the moment, I'm modifying the Linux kernel to use LLVM intrinsics instead of inline asm. For now, the intrinsics are simply library routines linked into the kernel, but someday (if all goes according to plan) they will become proper LLVM intrinsics.
The difficult part of an OS is not all the funky hardware stuff. The intrinsics for that are very straightforward and easy to implement. I/O, for example, is really just volatile loads and stores with MEMBARs. Registering interrupt handlers takes some equally simple intrinsics. The I/O intrinsics are already implemented for LLVM in the x86 code generator (minus the FENCE/MEMBAR instructions).
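To make the I/O point concrete, here is a minimal C sketch of a device register access written as a volatile load or store bracketed by memory fences. The names io_write32 and io_read32 are hypothetical and are not the intrinsics described above; the C11 atomic_thread_fence call only stands in for a MEMBAR, and real device I/O may need stronger, platform-specific barriers.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical helper: store to a device register.
   The fences stand in for MEMBARs (an assumption, not a real LLVM intrinsic). */
static inline void io_write32(volatile uint32_t *reg, uint32_t value)
{
    atomic_thread_fence(memory_order_seq_cst); /* order earlier accesses before the store */
    *reg = value;                              /* volatile store: compiler may not drop or merge it */
    atomic_thread_fence(memory_order_seq_cst); /* order the store before later accesses */
}

/* Hypothetical helper: load from a device register. */
static inline uint32_t io_read32(volatile uint32_t *reg)
{
    uint32_t value = *reg;                     /* volatile load: always performed */
    atomic_thread_fence(memory_order_seq_cst); /* order the load before later accesses */
    return value;
}
```

Nothing here is target-specific, which is the attraction of this approach over inline asm: the same source could be compiled for any back end that knows how to emit a fence.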
The difficult part is the OS code that changes native hardware state: the kernel's code for changing the program counter to execute a signal handler, or the code in fork() that sets up the new process to return zero when it begins running for the first time. These are the hard parts because native i386 state is not visible in LLVM programs (more accurately, for our research, we don't want it visible).
So, I thought I'd bring it up here so we can discuss potential
implementations. I think we should take the "shoot yourself in the foot
approach". That is, we add an instruction type to LLVM that simply
encapsulates an assembly language statement. This instruction type is
just simply ignored (but retained) by all the optimization passes. When
code generation happens, the inline assembly is just blindly put out and
if the programmer has shot himself in the foot, so be it.
Question: Do you want inline asm to be able to compile programs out of the box? Or do you want it so that we can use native hardware features that we can't use now?
For the former, we need inline i386/sparc/whatever support. For the latter, LLVM intrinsics should do the trick, and do it rather portably.
The approach you suggest might work, although the code generator will need to know not to tromp on your registers, I guess.
The bigger problem is GCC. GCC provides extended inline asm stuff that will probably be painful to pass from GCC to LLVM (and Linux, BTW, uses this feature a lot).
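For readers who have not run into it, here is a small sketch of what GCC extended inline asm looks like. The function add_ints is made up for illustration. The asm statement carries an output operand, an input operand, and a clobber list, and all three would have to survive the trip from GCC into LLVM so the code generator knows which registers it may and may not touch. A plain C fallback is included only so the example compiles on non-x86 targets.

```c
/* Illustrative only: adds b to a using GCC extended inline asm on x86. */
static int add_ints(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl %1, %0"
             : "+r" (a)   /* operand 0: read and written, compiler picks a register */
             : "r" (b)    /* operand 1: read only, compiler picks a register */
             : "cc");     /* clobber: the instruction changes the condition codes */
    return a;
#else
    return a + b;         /* fallback so the sketch builds on other targets */
#endif
}
```

A "blindly put out" asm string covers the template, but not the constraint strings or the clobber list, which is where most of the pain would come from.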
My impression is that inline assembly bites us a lot not because it's used a lot, but because llvm-gcc enables the #defines for the i386 platform, and those select inline assembly code paths that we don't support.
I think a lot of code has the following:
    #ifdef _i386
      fast inline assembly code
    #else
      slow C code
    #endif
The LLVM GCC compiler still defines _i386 (or its equivalent), so configure and llvm-gcc end up trying to compile inline assembly code when they don't really need to.
I have to admit that this is an impression and not something I know for sure, but it seems reasonable that many application programs use i386 assembly because i386 is the most common platform, and speedups on it are good.
Changing llvm-gcc to disable the _i386-like macros might make compilation of userspace programs easier.
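As a concrete (and hypothetical) instance of that pattern, the sketch below byte-swaps a 32-bit value. When the compiler defines an i386-style macro, the asm path is chosen; otherwise the portable C path is used. The function name swap32 and the exact macro spellings tested are assumptions for illustration, not taken from any particular application.

```c
#include <stdint.h>

/* Illustrative only: reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Fast path, selected purely by the platform macro the compiler defines. */
    __asm__ ("bswap %0" : "+r" (x));
    return x;
#else
    /* Portable path: what the code falls back to when the macro is absent. */
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);
#endif
}
```

Both paths compute the same result, so a compiler that simply did not define the i386 macros would build this code without ever seeing the asm.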
o If you just want access to native hardware, the intrinsics I'm developing will be much cleaner than inline asm support (and portable too).
o If you want inline asm to compile programs out of the box, it'll be more painful than what you've described.
o Changing llvm-gcc so that it doesn't look like an i386 compiler might make it easier to compile applications with optional inline asm.
Sorry if this is a bit rantish; my thoughts on the matter are not well organized.
One other thing we can do that *might* be useful. If a function contains
only inline assembly instructions, we could circumvent the usual calling
conventions for that function.
LLVM Developers mailing list
-- John T.