Endian Emulation Help

I've been working on a little home project to port llvm/clang to the Stratus VOS operating system. So far, it's been pretty straightforward, but I have hit an area where I could use some advice.

The biggest issue with a VOS port is that the compilers are expected to always present a big endian view of memory to the programmer. There are a number of reasons we do that, but the overriding one is that we are a high availability company and have customers who run 24/7. Even when upgrading to new hardware, the customers expect to be down for only a few minutes. That doesn't leave time to convert large databases from big endian to little endian. Also, it turns out that endian emulation costs only about a 10% performance hit when optimized appropriately. That is often acceptable just to avoid a lot of software maintenance.

I am doing the endian emulation by writing a function pass to insert bswap intrinsics for scalars and shuffles for vectors (with the appropriate casting). The LLVM IR is an amazingly good fit for this, BTW.
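To make the scalar case concrete, here is a minimal sketch in plain C++ (not the pass itself) of what the inserted swaps amount to around each load and store. It assumes a little endian host; the helper names bswap32, store_be32 and load_be32 are made up for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverse the byte order of a 32-bit value, i.e. what a bswap on i32 computes.
constexpr std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical helper: store v so that memory holds it in big endian order.
// Assumes a little endian host, so a swap is needed before every store.
inline void store_be32(unsigned char *p, std::uint32_t v) {
    assert(p != nullptr);
    const std::uint32_t swapped = bswap32(v);
    std::memcpy(p, &swapped, sizeof swapped);
}

// Hypothetical helper: load a big endian 32-bit value from memory.
// Same assumption: the host is little endian, so swap after every load.
inline std::uint32_t load_be32(const unsigned char *p) {
    assert(p != nullptr);
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);
    return bswap32(raw);
}
```

The program only ever sees the value returned by load_be32, so the swap pair is invisible to it; the pass does the equivalent on the IR by wrapping loads and stores.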

I thought I was just about done -- all I had left was dealing with the initialization (and some function pointer issues, but that's another story). That's where things get interesting. Integers aren't a problem, but I have to be able to emit ELF for byte swapped relocatable pointers (VOS extensions) in GlobalVariable initialization. It also seemed unwise to mess with byte swapping floating point constants until they actually get emitted.
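As a rough illustration of what I mean by deferring the floating point swap until emission, the sketch below writes the bytes of a float constant in big endian order at the point where the bytes are produced. It is plain C++ rather than anything in the LLVM emitter, emit_float_be is a hypothetical name, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical emitter helper: write the four bytes of f into out, most
// significant byte first. The value stays an ordinary float until this
// point, so nothing upstream ever sees a byte-swapped floating point constant.
inline void emit_float_be(float f, unsigned char *out) {
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    assert(out != nullptr);
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // reinterpret the bits, no conversion
    out[0] = static_cast<unsigned char>(bits >> 24);
    out[1] = static_cast<unsigned char>(bits >> 16);
    out[2] = static_cast<unsigned char>(bits >> 8);
    out[3] = static_cast<unsigned char>(bits);
}
```

Because the bytes are extracted with shifts, this particular sketch gives the same output on either kind of host.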

It looks to me like the best way to handle this is to represent the bswap as a constant expression. Am I on the right track there?

If I am going to do that, I see four approaches:

A. Cast everything to a vector of i8 and use the shuffle constant expression operator. The catch is that this operator doesn't appear to be supported everywhere (in particular ExecutionEngine.cpp). It will also make constant propagation harder, because some of the shuffles should be turned into bswaps when constant expressions are transformed into instructions.

B. Add a new constant expression subclass that will bswap any first class type. This might well be too disruptive. Also, it has the same problem with constant propagation.

C. Use shifts and masks. These also aren't implemented everywhere, and they have the same problem with constant propagation.

D. Implement the BSWAP intrinsic in constant expressions (with the call instruction limited to just intrinsics when used in constant expressions). I like this, because it parallels the instruction IR and this looks like it mostly just works in the constant folding code.
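For anyone who wants to see that A and C compute the same thing as a bswap, here is a small C++ sketch for the 32-bit case. It is only an analogy for the IR: shuffle_reverse plays the role of a shuffle on a 4-element i8 vector with a reversing mask, and all of the names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

// Option A in miniature: permute the bytes with the reversing mask 3,2,1,0.
inline Bytes4 shuffle_reverse(const Bytes4 &in) {
    const std::size_t mask[4] = {3, 2, 1, 0};
    Bytes4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        assert(mask[i] < in.size());
        out[i] = in[mask[i]];
    }
    return out;
}

// Option C in miniature: the same swap written with shifts and masks.
inline std::uint32_t swap_shift_mask(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Split an integer into bytes, least significant first, and reassemble it.
// Written with shifts so the comparison does not depend on the host.
inline Bytes4 to_bytes(std::uint32_t v) {
    return {{static_cast<std::uint8_t>(v), static_cast<std::uint8_t>(v >> 8),
             static_cast<std::uint8_t>(v >> 16),
             static_cast<std::uint8_t>(v >> 24)}};
}

inline std::uint32_t from_bytes(const Bytes4 &b) {
    return static_cast<std::uint32_t>(b[0]) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           (static_cast<std::uint32_t>(b[2]) << 16) |
           (static_cast<std::uint32_t>(b[3]) << 24);
}
```

Both forms give the same result, so the choice between them comes down to which one the constant folder and the other consumers of constant expressions can recognize and turn back into a single bswap.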

Any advice here?

TIA
Herbie Robinson