Inline Assembly

In order to get to the next stage with LLVM (like compiling a kernel) we
need to allow "pass through" of inline assembly so things like device
drivers, interrupt vectors, etc. can be written. While this feature
breaks the "pure" LLVM IR, I don't see any way around it.

So, I thought I'd bring it up here so we can discuss potential
implementations. I think we should take the "shoot yourself in the foot"
approach. That is, we add an instruction type to LLVM that simply
encapsulates an assembly language statement. This instruction type is
simply ignored (but retained) by all the optimization passes. When
code generation happens, the inline assembly is just blindly put out and
if the programmer has shot himself in the foot, so be it.

One other thing we can do that *might* be useful. If a function contains
only inline assembly instructions, we could circumvent the usual calling
conventions for that function.

Thoughts?

Reid.

Reid Spencer wrote:

In order to get to the next stage with LLVM (like compiling a kernel) we
need to allow "pass through" of inline assembly so things like device
drivers, interrupt vectors, etc. can be written. While this feature
breaks the "pure" LLVM IR, I don't see any way around it.

<shameless plug>
Actually, there should be a way around it. I'm currently working on extensions to LLVM for operating system support. You wouldn't be able to take the stock i386 Linux kernel and compile it, but you could write an operating system that would be completely compilable by LLVM (once I finish, that is).

Currently, I'm modifying the Linux kernel to use LLVM intrinsics instead of inline asm. For now, the intrinsics are simply library routines linked into the kernel, but someday (if all goes according to plan) they will become LLVM intrinsics.
</shameless plug>

<technical aside>
The difficult part of an OS is not actually all the funky hardware stuff. The intrinsics for those are actually very straightforward and easy to implement. I/O, for example, is really volatile loads and stores with MEMBARs. Registering interrupt handlers takes some very straightforward intrinsics. The I/O intrinsics are already implemented for LLVM in the x86 code generator (minus the FENCE/MEMBAR instructions).

The difficult part is the code of the OS that changes native hardware state. The kernel's code for changing the program counter to execute a signal handler, or the code in fork() that sets up the new process to return zero when it begins running for the first time: these are the hard parts, because native i386 state is visible in LLVM programs (more accurately: for our research, we don't want it visible).
</technical aside>

So, I thought I'd bring it up here so we can discuss potential
implementations. I think we should take the "shoot yourself in the foot"
approach. That is, we add an instruction type to LLVM that simply
encapsulates an assembly language statement. This instruction type is
simply ignored (but retained) by all the optimization passes. When
code generation happens, the inline assembly is just blindly put out and
if the programmer has shot himself in the foot, so be it.

Question: Do you want inline asm support so that we can compile programs out of the box? Or do you want it so that we can use native hardware features that we can't use now?

For the former, we need inline i386/sparc/whatever support. For the latter, LLVM intrinsics should do the trick, and do it rather portably.

The approach you suggest might work, although the code generator will need to know not to tromp on your registers, I guess.

The bigger problem is GCC. GCC provides extended inline asm stuff that will probably be painful to pass from GCC to LLVM (and Linux, BTW, uses this feature a lot).

Another thought:

My impression is that inline assembly bites us a lot not because it's used a lot but because the LLVM compiler enables #defines for the i386 platform that we don't support.

I think a lot of code has the following:

#ifdef _i386
inline asm
#else
slow C code
#endif

The LLVM GCC compiler still defines _i386 (or its equivalent), so configure and llvm-gcc end up trying to compile inline assembly code when they don't really need to.

I have to admit that this is an impression and not something I know for sure, but it seems reasonable that many application programs use i386 assembly because i386 is the most common platform, and speedups on it are good.

Changing llvm-gcc to disable the _i386-like macros might make compilation of userspace programs easier.

So, summary:

o If you just want access to native hardware, the intrinsics I'm developing will be much cleaner than inline asm support (and portable too).

o If you want inline asm to compile programs out of the box, it'll be more painful than what you've described.

o Changing llvm-gcc so that it doesn't look like an i386 compiler might make it easier to compile applications with optional inline asm.

Sorry if this is a bit rantish; my thoughts on the matter are not well organized.

One other thing we can do that *might* be useful. If a function contains
only inline assembly instructions, we could circumvent the usual calling
conventions for that function.

Thoughts?

Reid.

------------------------------------------------------------------------

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev

-- John T.

It's worse than just knowing what registers are used by inlined
assembler. You want the inline assembler to be able to reference local
and global variables and function arguments. Plus, you have to be able
to handle transfers of control inside the inlined assembler, such as a
return, a branch to a label defined outside of the inlined assembler, or
even calls to other functions (to properly handle inter-procedural
optimization). It can get quite messy. It will be a lot of work to do
it as well as gcc or Microsoft's compiler.

Actually, there should be a way around it. I'm currently working on
extensions to LLVM for operating system support. You wouldn't be able
to take the stock i386 Linux kernel and compile it, but you could write
an operating system that would be completely compilable by LLVM (once I
finish, that is).

Being able to use intrinsics is definitely good, but it's not sufficient.
There will always be things we don't cover, and inline asm will be
required. In any case, compiling programs off the shelf certainly does
require inline asm support, so we do need it regardless of what intrinsics
we have.

The difficult part is the code of the OS that changes native hardware
state. The kernel's code for changing the program counter to execute a
signal handler, or the code in fork() that sets up the new process to
return zero when it begins running for the first time: these are the
hard parts, because native i386 state is visible in LLVM programs (more
accurately: for our research, we don't want it visible).

Some things really do want to be written in inline asm, and those things
are obviously non-portable. This is not a problem, the goal of LLVM isn't
to turn every non-portable program into a portable one :)

The bigger problem is GCC. GCC provides extended inline asm stuff that
will probably be painful to pass from GCC to LLVM (and Linux, BTW, uses
this feature a lot).

Actually, the inline asm support provided by GCC is quite well thought out
and makes a lot of sense (inline asms are required to define their side
effects in target-independent terms). The big complaint that I have is
its incredibly baroque syntax. Eventually we should also support other
forms of inline asm by translating them into the LLVM inline asm format,
but keeping the inline asm format semantically equivalent to the GCC
format is basically what we want.

My impression is that inline assembly bites us a lot not because it's
used a lot but because the LLVM compiler enables #defines for the i386
platform that we don't support.

We should aspire to be as compatible with GCC as reasonable, and including
inline asm support is a big piece of that.

In terms of implementation, adding inline asm support is just a "small
matter of implementation": it shouldn't cause any fundamental problems
with the llvm design. In particular, LLVM should get an "asm"
Instruction, which takes a blob of text and some arguments. The big
missing feature in LLVM is multiple return value support, which is
required by asms that define multiple registers. My notes on multiple ret
values are here if anyone is interested:
http://nondot.org/sabre/LLVMNotes/MultipleReturnValues.txt

-Chris

When I was working on porting glibc (currently being held up by a C99
support bug) the most straightforward approach was to define a new
architecture string and implement a new target in glibc based on that
machine string.

So I propose that llvm-gcc not consider itself any type of x86-linux (or
whatever platform it was compiled on), but rather create a new
architecture, say llvm (or perhaps two, one each for big- and
little-endian). Thus llvm-gcc -dumpmachine would return llvm-os.

This would make system library (and OS kernel!) ports easier to maintain
since arch llvm would be supported by adding stuff rather than changing
stuff, and all the inline asm for known archs would go away and the C
version would be used. In most cases the config scripts should consider
compiling with llvm on a host as a cross-compile from host arch to arch
llvm.

Andrew

So I propose that llvm-gcc not consider itself any type of x86-linux (or
whatever platform it was compiled on), but rather create a new
architecture, say llvm (or perhaps two, one each for big- and
little-endian). Thus llvm-gcc -dumpmachine would return llvm-os.

Hrm, I would much rather just have LLVM be a drop in replacement for a C
compiler. As such, it should expose identical #defines to GCC.

This would make system library (and OS kernel!) ports easier to maintain
since arch llvm would be supported by adding stuff rather than changing
stuff, and all the inline asm for known archs would go away and the C
version would be used. In most cases the config scripts should consider
compiling with llvm on a host as a cross compile from host arch to arch
llvm.

We *will* eventually support inline assembly, it just has not been
implemented yet. Patches accepted :)

-Chris

A drop in replacement for "a C compiler" is rather a different
requirement than a drop in replacement for GCC. If the goal is pure GCC
compatibility then sure, identical defines are fine. My main point is
software tends to expect certain compilers (or a small number of
compilers) on certain platforms and tends to have work arounds (or
exploits) for their unique "features". Which do you want to fix? Make
config scripts and headers think they are compiling to an arch "llvm" or
make llvm work for every assumption a piece of software makes about how gcc
would behave on that arch/os? I guess you are thinking the latter. I
just thought that gcc has enough problems with things breaking when new
versions of gcc come out because they depended too closely on how gcc
behaved on a certain piece of code, that perhaps we don't want to try to
go down that path too. (For example, although the Linux kernel supports
several versions of gcc, there have been a slew of patches recently to
make things work on gcc 3.5.)

An argument for a separate arch string is also that it makes llvm
bytecode (especially with a fully ported C library) very close to being
identical for similar platforms (mostly those with the same pointer size
and endianness). Unless I am missing something here. Obviously OS
specific interfaces cannot be general, but for most software, those are
wrapped by the c library.

Why don't we have platform independence as an optional bytecode feature
for well behaved programs? A couple intrinsics to do htonl and friends
would let a piece of bytecode be endian-agnostic.

Andrew

> Hrm, I would much rather just have LLVM be a drop in replacement for a C
> compiler. As such, it should expose identical #defines to GCC.

A drop in replacement for "a C compiler" is rather a different
requirement than a drop in replacement for GCC. If the goal is pure GCC
compatibility then sure, identical defines are fine. My main point is

My goal is to get very close to being a drop-in replacement for GCC. As
time progresses, we will converge on that goal. An important part of that
though is that we don't have to be more compatible with GCC than it is
with itself: if something breaks between versions of GCC, we shouldn't
really have to worry about it either.

An argument for a separate arch string is also that it makes llvm

A separate arch string would imply a LOT of complexity and many other
problems: it's not a panacea...

bytecode (especially with a fully ported C library) very close to being
identical for similar platforms (mostly those with the same pointer size
and endianness). Unless I am missing something here. Obviously OS
specific interfaces cannot be general, but for most software, those are
wrapped by the c library.

Not true at all. In particular, various data structures have different
sizes based on their implementation (e.g. FILE), and many APIs may be
different. C is not a language that is designed to be portable.

*If* we controlled all of the header files, *and* restricted the C
compiler to reject "nonportable" features, then we could provide
portability for the C subset that is left. I actually think that this
would be a very interesting project, but it's not something I'm even
considering doing myself.

Why don't we have platform independence as an optional bytecode feature
for well behaved programs? A couple intrinsics to do htonl and friends
would let a piece of bytecode be endian-agnostic.

LLVM bytecode files produced from portable languages (e.g. Java,
verifiable MSIL, or many others) should be portable, assuming the
front-end isn't doing something silly. The problem with C is C, not LLVM.

-Chris

Actually, I should clarify something there. You're right that *many* C
programs might be portable. For example, this is a portable C program:

int main() { return 42; }

The problem is that you want to be able to guarantee that a program, if
accepted by the compiler, is portable. This is not something we have
today.

Another problem is that most C programs use more of the standard library
than this program. The problem with the C standard library is that the
headers often include a ton of implementation details as inline functions
or macros. This is why you need to control the headers as well as the
language dialect being compiled.

I'm afraid that I'm sounding too negative about this idea, but I don't
mean to. If this was implemented, it would be a huge boon to the free
software and open source communities: suddenly programs could be
distributed in binary form instead of source form, easing distribution.
Getting to this point though will take a lot of work. :)

-Chris