LLVM and libc

John Criswell wrote:

So, let me see if I understand this right:

First, it sounds like you're programming on the bare processor, so your
I/O instructions are either special processor instructions or volatile
loads/stores to special memory locations.

Yes. In more detail, instruction words directly control the
data transports inside the processor, and I/O is handled
by transporting data into a special function unit.

In that case, you would not
use a "system call" intrinsic;

Correct.

you would use an ioread/iowrite intrinsic
(these are similar to load/store and are briefly documented in the
LLVA-OS paper).

Which I should probably read, it seems.

If you're doing memory mapped I/O, you could probably
use LLVM volatile load/store instructions and not have to add any
intrinsics.

We could do memory mapped I/O, but I don't think it is being
considered.

Second, you could implement these "intrinsics" as either external
functions or as LLVM intrinsic functions specially handled by the code
generator. I believe you're saying that the encoding of the I/O
instructions changes.

It is up to the processor designer to decide where the
I/O units are located in the processor, so yes.

If that is the case, then you are right: adding an
intrinsic and having the LLVM code generator generate the right
instructions would probably be easiest.

Do I understand things correctly?

It would seem so. Thanks for your insights!

Why do you need this?

-Chris

Why? As an end user, I'd be very unhappy if I got different code from -emit-llvm + llc than from normal llvm-gcc.

-Chris

Chris Lattner wrote:

BTW: Would be nice if the frontend defined a manifest constant if it is
generating byte code vice generating native. But that's a refinement for
another day...

Why? As an end user, I'd be very unhappy if I got different code from
-emit-llvm + llc than from normal llvm-gcc.

For example, I've been messing around with various high performance code
in which the codebase has three versions of code in certain functions:
one that uses AltiVec intrinsics, another that uses SSE[23] intrinsics,
and plain ol' C code.

Yes, llvm handles the intrinsics just fine but what about the case when
llvm can't/won't/doesn't?

Just a thought... I'm inclined to agree with you about bytecode
versions, since one ought not be messing around with bytecode-specific
anything. But knowing whether bytecode is being emitted could be useful.

-scooter

For example, I've been messing around with various high performance code
in which the codebase has three versions of code in certain functions:
one that uses AltiVec intrinsics, another that uses SSE[23] intrinsics,
and plain ol' C code.

ok

Yes, llvm handles the intrinsics just fine but what about the case when
llvm can't/won't/doesn't?

If llvm can't handle them with -emit-llvm, it can't handle them without it. -emit-llvm is orthogonal to language/feature support.

Just a thought... I'm inclined to agree with you about bytecode
versions, since one ought not be messing around with bytecode-specific
anything. But knowing whether bytecode is being emitted could be useful.

This is a detail about *how* something is being compiled, not about any code-visible feature.

The best analogy is 'gcc -S x.c; as x.s' vs 'gcc -c x.c'. There is no #define available to test for -S'ness.

-Chris
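
To illustrate the point about feature support being independent of how the output is emitted, here is a minimal sketch of the kind of three-way code the thread describes (AltiVec, SSE2, plain C). The function name add_floats is hypothetical and not from the thread. __ALTIVEC__ and __SSE2__ are macros GCC predefines when the corresponding target feature is enabled. The selection depends only on the target features, so the same path would be chosen whether the compiler emits bytecode, assembly, or an object file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical example function: out[i] = a[i] + b[i] for i in [0, n). */
void add_floats(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    assert(out != NULL && a != NULL && b != NULL);

#if defined(__ALTIVEC__)
    /* AltiVec path: four floats per iteration. memcpy sidesteps the
       16-byte alignment that vec_ld/vec_st would require. */
    for (; i + 4 <= n; i += 4) {
        __vector float va, vb, vc;
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        vc = vec_add(va, vb);
        memcpy(out + i, &vc, sizeof vc);
    }
#elif defined(__SSE2__)
    /* SSE path: four floats per iteration, unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Plain C: handles the remainder, or the whole array when neither
       vector extension is available. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Calling add_floats(out, a, b, 5) on five-element arrays gives the same element-wise sums on every path; only the instructions used differ.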

Pertti Kellomäki wrote:

John Criswell wrote:
  

So, let me see if I understand this right:

First, it sounds like you're programming on the bare processor, so your
I/O instructions are either special processor instructions or volatile
loads/stores to special memory locations.
    
Yes. In more detail, instruction words directly control the
data transports inside the processor, and I/O is handled
by transporting data into a special function unit.
  

So a particular instruction actually specifies where in the
microarchitecture a particular piece of data should go. Directing it to
a specific functional unit makes it do I/O. Right?

In that case, you would not
use a "system call" intrinsic;
    
Correct.

you would use an ioread/iowrite intrinsic
(these are similar to load/store and are briefly documented in the
LLVA-OS paper).
    
Which I should probably read, it seems.
  

The LLVA-OS paper contains descriptions of intrinsics that we would add
to LLVM to support an operating system. In our implementation, we wrote
them all as external library functions. While I suspect that most of
our paper is not relevant in solving the problems you're working on,
some bits of information (like ioread/iowrite and llva_syscall) may be
of interest to you.

If you read about something in the LLVA-OS paper and have questions on
it, please feel free to ask. The paper is rather light on details
because we had an incredibly short page limit.

So, llva_ioread() and llva_iowrite() read and write values to I/O
locations. The locations can be anything: I/O port numbers (e.g. x86),
the memory address of a memory mapped device register, or the identifier
of a functional unit. The important part is that llva_ioread() and
llva_iowrite() are code generated into the correct assembly code
sequences for I/O and, at the LLVM level, they are considered volatile,
so they are not moved around or eliminated by LLVM optimization passes.
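
As a rough illustration of the "considered volatile" property for the memory-mapped case, here is a minimal C sketch. The names io_write32, io_read32, fake_device_reg, and poke_twice are hypothetical and are not the actual llva_ioread()/llva_iowrite() signatures, which this thread does not give. An ordinary variable stands in for a device register so the sketch can run on a host machine; on real hardware the location would be a fixed device address.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped device register (assumption: a real
   target would use a fixed hardware address instead). */
static volatile uint32_t fake_device_reg;

/* Hypothetical I/O wrappers. Because the access goes through a
   pointer to volatile, the compiler must perform every access and
   may not reorder it relative to other volatile accesses. */
static inline void io_write32(volatile uint32_t *loc, uint32_t value)
{
    *loc = value;
}

static inline uint32_t io_read32(volatile uint32_t *loc)
{
    return *loc;
}

/* Two back-to-back stores to the same location. Without volatile the
   first store would be dead and could be deleted; with volatile both
   are kept, which is what a device expecting two writes needs. */
uint32_t poke_twice(void)
{
    io_write32(&fake_device_reg, 1u);
    io_write32(&fake_device_reg, 2u);
    return io_read32(&fake_device_reg);
}
```

For I/O ports or functional-unit identifiers, plain volatile accesses are not enough, which is where an intrinsic lowered by the code generator comes in.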

If you're doing memory mapped I/O, you could probably
use LLVM volatile load/store instructions and not have to add any
intrinsics.
    
We could do memory mapped I/O, but I don't think it is being
considered.

Second, you could implement these "intrinsics" as either external
functions or as LLVM intrinsic functions specially handled by the code
generator. I believe you're saying that the encoding of the I/O
instructions changes.
    
It is up to the processor designer to decide where the
I/O units are located in the processor, so yes.
  

If you end up designing a new intrinsic for handling I/O due to any
limitations in ioread/iowrite, please let us know. We wanted our LLVA
intrinsics to be adaptable to novel hardware designs; we'd appreciate
any feedback we can get.

  

If that is the case, then you are right: adding an
intrinsic and having the LLVM code generator generate the right
instructions would probably be easiest.

Do I understand things correctly?
    
It would seem so. Thanks for your insights!
  

You're welcome.

-- John T.

John Criswell wrote:

So a particular instruction actually specifies where in the
microarchitecture a particular piece of data should go. Directing it to
a specific functional unit makes it do I/O. Right?

Yes. For each transport bus inside the processor, there is a slice
in each instruction that specifies how data is transported on the bus.
Computation occurs as a side-effect of data transports. This includes
loads and stores to data memory, and I/O as well.
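
The idea of computation as a side effect of data transports can be sketched with a toy simulator. Everything here is an assumption made for illustration (two buses, the port names, an adder with a trigger port, an I/O unit that logs what it receives); it does not describe the actual processor discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ports that a bus can read from or write to. */
enum port { PORT_NONE, ADD_OPERAND, ADD_TRIGGER, ADD_RESULT, IO_OUT, PORT_COUNT };

#define NUM_BUSES   2
#define IO_LOG_SIZE 16

/* One slice of an instruction: what a single bus transports. */
struct move { enum port src; enum port dst; int has_imm; int32_t imm; };

/* An instruction is one move per bus. */
struct instruction { struct move slot[NUM_BUSES]; };

struct machine {
    int32_t port[PORT_COUNT];
    int32_t io_log[IO_LOG_SIZE];  /* values "sent" by the I/O unit */
    size_t  io_count;
};

static void step(struct machine *m, const struct instruction *in)
{
    int32_t value[NUM_BUSES];
    int b;

    /* 1. Every bus reads its source from the state before this instruction. */
    for (b = 0; b < NUM_BUSES; b++) {
        const struct move *mv = &in->slot[b];
        value[b] = mv->has_imm ? mv->imm : m->port[mv->src];
    }
    /* 2. Every bus delivers its value to the destination port. */
    for (b = 0; b < NUM_BUSES; b++)
        if (in->slot[b].dst != PORT_NONE)
            m->port[in->slot[b].dst] = value[b];
    /* 3. Side effects: writing a trigger port starts the operation. */
    for (b = 0; b < NUM_BUSES; b++) {
        if (in->slot[b].dst == ADD_TRIGGER)
            m->port[ADD_RESULT] = m->port[ADD_OPERAND] + m->port[ADD_TRIGGER];
        else if (in->slot[b].dst == IO_OUT && m->io_count < IO_LOG_SIZE)
            m->io_log[m->io_count++] = m->port[IO_OUT];
    }
}

/* Adds 2 and 3 in the adder, then moves the result to the I/O unit. */
int32_t tta_demo(void)
{
    const struct instruction program[] = {
        { { { PORT_NONE,  ADD_OPERAND, 1, 2 }, { PORT_NONE, ADD_TRIGGER, 1, 3 } } },
        { { { ADD_RESULT, IO_OUT,      0, 0 }, { PORT_NONE, PORT_NONE,   0, 0 } } },
    };
    struct machine m = {0};
    size_t i;

    for (i = 0; i < sizeof program / sizeof program[0]; i++)
        step(&m, &program[i]);
    return m.io_count == 1 ? m.io_log[0] : -1;
}
```

In this sketch there is no "add" or "output" opcode: the addition happens because data was moved to the adder's trigger port, and the I/O happens because data was moved to the I/O unit's port. Relocating the I/O unit would change only the destination encoding, which is why a code-generator intrinsic fits this case well.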

If you read about something in the LLVA-OS paper and have questions on
it, please feel free to ask.

Will do!

Just to keep people informed, we had an internal discussion
about the need for libc I/O functions. At least for now, we
decided to skip them, the rationale being that in our intended
applications I/O will be application specific anyway.