LLVM and Cell processor

Hi all,

I've been following LLVM for a couple of years, and though I've not found an occasion to use it (yet), I'm always curious about its potential uses.

I was discussing the Cell processor with friends, and some of the arguments were that it is difficult to program and difficult to optimize. One of the concerns was that the PPE (the central processor) is "in order", which implies that the compiler has to be able to optimize the code in a much more advanced way.

I don't know if LLVM will support the Cell, but I was wondering if there are features that make it possible to produce code optimized for "in order" processors.

Thanks :-)

  -- Sébastien

I've been following LLVM for a couple of years, and though I've not
found an occasion to use it (yet), I'm always curious about its
potential uses.

cool. :-)

I was discussing the Cell processor with friends, and some of the
arguments were that it is difficult to program and difficult to
optimize. One of the concerns was that the PPE (the central
processor) is "in order", which implies that the compiler has to be
able to optimize the code in a much more advanced way.

Right.

I don't know if LLVM will support the Cell, but I was wondering if
there are features that make it possible to produce code optimized
for "in order" processors.

Sure. LLVM has strong support for vector operations, as well as support for low-level scheduling with hazard recognition and pipeline modeling. I'm not aware of anyone working on a Cell port of LLVM, but it would be a welcome addition if anyone were interested in tackling the project.
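
For a concrete flavor, here is a minimal sketch of a vector operation written directly in LLVM IR (the function name is made up, and this is generic IR rather than anything Cell-specific); the code generator can then map such operations onto a target's SIMD unit, e.g. AltiVec on the PPE:

```llvm
; Minimal sketch: a 4-wide single-precision vector add expressed
; as a first-class vector operation in LLVM IR.
define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) {
entry:
  %sum = fadd <4 x float> %a, %b
  ret <4 x float> %sum
}
```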

-Chris

I was going through the documentation and source lately, and I have
worked out how to make LLVM bytecode more compatible with C++:
1) A thiscall instruction should be introduced, which would take N
arguments, and the first argument would always be the C++ "this"
argument. This would abstract away compiler-dependent C++ code
emission in LLVM.

2) The ret instruction should be able to return structs (as Chris has
already written on his page).

3) EH (exception handling) could be done at code emission, which would
leave the bytecode portable. Each backend would emit code compatible
(where possible) with the compiler LLVM itself was built with; if LLVM
is compiled with VC8, it would emit code compatible with VC8.

I will probably begin development soon. Please let me know if there is
anything more I need to consider or know, or if there is something
wrong with the design.

Regards,
Žiga

I was going through the documentation and source lately, and I have
worked out how to make LLVM bytecode more compatible with C++:
1) A thiscall instruction should be introduced, which would take N
arguments, and the first argument would always be the C++ "this"
argument. This would abstract away compiler-dependent C++ code
emission in LLVM.

While that could be done, it would be redundant with existing
facilities. One of the key design criteria for LLVM is to keep its IR
"low level" and "not redundant". That is, when we introduce a facility
into the IR it's because there is no other way to do it. A "thiscall"
instruction is completely redundant with a regular call that includes
the "this" argument.

This design is one of the reasons optimization passes are relatively
simple to write in LLVM. There aren't hundreds of special cases to
consider.

2) The ret instruction should be able to return structs (as Chris has
already written on his page).

Again, it can be implemented using a pointer argument. I'm not going to
lament this issue any more. While it would be convenient to return
structs from a front-end point of view, it's not convenient for LLVM or
the backends.
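
A minimal sketch of the pointer-argument approach (all names here are invented; the syntax is the later typed-pointer form of the IR): the caller allocates the struct and passes its address, and the callee fills it in rather than returning it:

```llvm
; Sketch: "returning" %struct.Pair through a caller-allocated
; pointer instead of a first-class struct return value.
%struct.Pair = type { i32, i32 }

define void @make_pair(%struct.Pair* %out, i32 %a, i32 %b) {
entry:
  %f0 = getelementptr %struct.Pair, %struct.Pair* %out, i32 0, i32 0
  store i32 %a, i32* %f0
  %f1 = getelementptr %struct.Pair, %struct.Pair* %out, i32 0, i32 1
  store i32 %b, i32* %f1
  ret void
}
```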

3) EH (exception handling) could be done at code emission, which would
leave the bytecode portable. Each backend would emit code compatible
(where possible) with the compiler LLVM itself was built with; if LLVM
is compiled with VC8, it would emit code compatible with VC8.

That needs to be done anyway (emitting code that meets the platform ABI
for exceptions), but I think you're still going to need to transmit some
kind of exception information through the LLVM IR. This is what the
invoke/unwind instructions were intended for. However, there are other
details that make exception handling specific to the language, front-end
compiler, and ABI.
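
For reference, a minimal sketch of invoke/unwind as the IR defined them at the time (the function names are made up; later LLVM replaced the unwind instruction with landingpad/resume and an explicit personality function):

```llvm
; Sketch: a potentially-throwing call routed through invoke, so
; control transfers to %handler if @may_throw raises an exception.
declare void @may_throw()

define void @caller() {
entry:
  invoke void @may_throw() to label %ok unwind label %handler
ok:
  ret void
handler:
  ; Historical "unwind" instruction: continue propagating the
  ; in-flight exception to the caller's handler.
  unwind
}
```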

I will probably begin development soon. Please let me know if there is
anything more I need to consider or know, or if there is something
wrong with the design.

I don't know what the design is, so it's hard to say. If you've been
discussing this with Chris, he's out on vacation until Monday and will
probably respond then.

Regards,
Žiga

Reid.

Sébastien Pierre wrote:

Hi all,

I've been following LLVM for a couple of years, and though I've not
found an occasion to use it (yet), I'm always curious about its
potential uses.

I was discussing the Cell processor with friends, and some of the
arguments were that it is difficult to program and difficult to
optimize. One of the concerns was that the PPE (the central
processor) is "in order", which implies that the compiler has to be
able to optimize the code in a much more advanced way.

I don't know if LLVM will support the Cell, but I was wondering if
there are features that make it possible to produce code optimized
for "in order" processors.

For a real mind-bender, you should have heard Marc Tremblay at
Supercomputing talk about the new Scout feature on Niagara --
out-of-order execution with out-of-order retirement. I'm still working
on getting all of the slides from the General-Purpose GPU Computing:
Practice and Experience workshop (Burton Smith's slides were also
pretty interesting).

I'm not sure whether in-order execution really matters, per se. What
does matter is how to partition tasks and work between the PPE and the
SPE(s), which is no trivial task. For that matter, the same kind of
problem crops up if you want to generate ATI CTM assembly (or NVIDIA
CUDA code) and partition work between your CPU and GPU.

I've been hacking away at a Cell SPE module and making some progress. I
got diverted recently into gcc4 frontend issues as I was attempting to
compile vim and ncurses into bytecode...

-scooter