Since LLVM is available on several different platforms, such as Intel x86, PowerPC, and SPARC, how do you handle the differences between these platforms in your bytecode?
For example, can bytecode generated under Mac OS run with the LLVM version under Linux on x86? How do you handle things like little endian vs. big endian? How do you handle the difference in pointer sizes, such as SPARC's 64-bit pointers versus x86's 32-bit pointers?
There are two ways to do this. A front-end (e.g. our Java front-end, which is in development) that generates portable LLVM code (code that does not depend on the target) can just do so, and things will magically work. LLVM bytecode like this is portable.
For front-ends that compile non-type-safe languages (e.g. our C/C++ front-ends), we explicitly encode the target and some additional information (including pointer size and endianness) in the bytecode file. These files MAY be portable, or they may not be; there is no guarantee. In the case of C/C++, basically anything that includes a standard header will not be portable, at least not across systems with different implementations of libc.
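As a minimal sketch of why C code is tied to its target (this snippet is illustrative, not from the LLVM sources): the front-end must commit to concrete answers for pointer/integer sizes and byte order when it emits bytecode, so one target's answers get baked in.

    /* Illustrative only: why compiled C is target-dependent. */
    #include <stdio.h>

    int main(void) {
        /* 4 on 32-bit x86, 8 on 64-bit SPARC; the front-end folds
           sizeof(long) to a constant in the generated code. */
        printf("sizeof(long) = %lu\n", (unsigned long) sizeof(long));

        /* Reading the first byte of a multi-byte integer exposes the
           target's byte order: prints 1 on little-endian x86, 0 on
           big-endian SPARC. */
        unsigned int probe = 1;
        printf("first byte = %d\n", (int) *(unsigned char *) &probe);
        return 0;
    }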
Both and either. If you include a standard system header, it will pull in system-specific #defines and inline functions, which won't work if you move to another system that doesn't match them.
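For a concrete example of the #define side of this: glibc's <endian.h> header defines __BYTE_ORDER for the machine the headers were installed on, so the preprocessor discards the "wrong" branch before any bytecode is generated at all.

    /* Sketch of how a system header bakes in the build machine's
     * byte order; __BYTE_ORDER and __LITTLE_ENDIAN come from the
     * glibc header <endian.h>. */
    #include <endian.h>
    #include <stdio.h>

    int main(void) {
    #if __BYTE_ORDER == __LITTLE_ENDIAN
        /* On an x86 build machine the preprocessor selects this
           branch; the other branch never reaches the bytecode, so
           moving the result to a big-endian host silently gives the
           wrong answer. */
        puts("headers say: little endian");
    #else
        puts("headers say: big endian");
    #endif
        return 0;
    }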
Hmm, do you know of any glibc headers that pull in CPU-arch-specific code? My cunning plan of using LLVM to distribute Linux CPU-arch-independent binaries may have a slight hole if so...
Yes, I think it will probably work, at least for a bunch of programs. Another thing you could do is build a modified distribution of the glibc headers that has the problematic features pulled out of line. LLVM already disables inline assembly in the headers, so that's about 75% of the battle...
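A hedged sketch of what "pulled out of line" could mean for such a modified header set (the function and its body are invented for illustration, not actual glibc code): replace an extern inline body, whose implementation details get compiled into the user's bytecode, with a plain declaration, so the call becomes an ordinary external reference resolved by whatever libc exists on the machine that finally runs the code.

    /* Hypothetical before/after excerpt from a modified header. */

    /* Before: a GNU C extern inline body in the header compiles
       libc internals directly into the user's bytecode. */
    extern __inline int my_isdigit(int c) {
        return c >= '0' && c <= '9';
    }

    /* After ("pulled out of line"): declaration only; the call is
       left unresolved in the bytecode and bound to the target
       machine's libc at link/run time. */
    extern int my_isdigit(int c);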