I am new to LLVM and I am interested in using LLVM for
an application that I plan to develop on Solaris. I
want to execute the generated code on different
machines (both little endian and big endian). In my
application endianness is important. How does LLVM
deal with endian issues? Does LLVM take care of byte
swapping?
Thanks
If you compile a program from a type-safe language (like Java) to LLVM, the endianness won't matter: you can move the LLVM bytecode file around just like you can a JVM bytecode file without a problem.
For C/C++, matters are a little bit more difficult. There, the endianness CAN make a difference (due to "unsafe" casting and such). To handle this, LLVM allows the front-end to record the endianness of the host the program was compiled on, and the C/C++ front-end does so.
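To make the "unsafe" casting point concrete, here is a minimal sketch in C of code whose result depends on the byte order of the machine it runs on. The function names are made up for illustration and are not part of LLVM or its front-end; the sketch only assumes a host with plain big-endian or little-endian byte order.

```c
#include <stdint.h>

/* Hypothetical helper: returns the byte stored at the lowest address
 * of a 32-bit value. The pointer cast is the kind of "unsafe" access
 * that exposes the host byte order to the program. */
static unsigned char first_byte_in_memory(uint32_t value)
{
    const unsigned char *bytes = (const unsigned char *)&value;
    return bytes[0];
}

/* Hypothetical helper: returns 1 on a little-endian host and 0 on a
 * big-endian one, using the probe above. */
static int host_is_little_endian(void)
{
    /* Little-endian hosts store the least significant byte (0x04)
     * first; big-endian hosts store the most significant (0x01). */
    return first_byte_in_memory(0x01020304u) == 0x04;
}
```

The same source gives 0x04 from first_byte_in_memory(0x01020304u) on a little-endian host and 0x01 on a big-endian one, which is why the front-end has to record which kind of host the program was compiled on.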
One problem, however, is that the code generators do not yet insert the byteswap instructions they need. At one point we had this (which allowed us to run simple Sparc programs on X86), and it worked great. The problem was that non-trivial programs use system header files, and the system header files are not compatible across systems. As a simple example, on Solaris, stdin is #defined to "__iob[0]" or something: if you try to run a program that uses stdin/stdout/stderr on a Linux box (even with byte swapping), you will get an "__iob symbol not defined" error.
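For reference, the byte swap itself is a small operation. The following is only an illustrative sketch of a 32-bit swap written by hand in C; byteswap32 is a hypothetical name, and this is not the code the LLVM code generators emitted.

```c
#include <stdint.h>

/* Hypothetical helper: reverses the order of the four bytes in a
 * 32-bit value, converting between big-endian and little-endian
 * representations of the same number. */
static uint32_t byteswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |  /* lowest byte moves to the top */
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);   /* highest byte moves to the bottom */
}
```

A code generator targeting a host of the opposite byte order would, in principle, apply an operation like this to multi-byte loads and stores. Applying it twice gives back the original value.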
The only way that I'm aware of to work around this is to build a target-independent C library, or (better yet) port another libc to work on Solaris and with LLVM. This would be an incredibly useful project. Once that is done, adding the byte swapping routines back is really simple.
-Chris