we're trying to come up with a static compiler based on llvm for a 4-way vliw architecture with full support for predicated execution. a distinguishing feature is the omission of flag registers. instead, conditions can be paired with a particular instruction within the same bundle.
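to make the bundle model concrete, here is a minimal standalone sketch of how a 4-way bundle with per-instruction conditions might be represented. everything in it (`Condition`, `Slot`, `Bundle`, the condition codes) is a hypothetical name made up for illustration, not an existing llvm class, and the exact way a condition is paired with an instruction is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical condition codes; the real set depends on the target.
enum class CondCode { EQ, NE, LT, GE };

// Assumption: since there is no flag register, a condition names the
// registers it compares directly.
struct Condition {
  CondCode code;
  unsigned lhsReg;
  unsigned rhsReg;
};

// One operation in a bundle slot, optionally guarded by a condition.
struct Slot {
  std::string opcode;              // illustrative mnemonic, e.g. "add"
  std::optional<Condition> guard;  // empty means "always execute"
};

// A 4-way bundle: at most four operations issue together.
class Bundle {
public:
  static constexpr std::size_t Width = 4;

  // Tries to place an operation; returns false when the bundle is full.
  bool add(Slot s) {
    if (used_ == Width) return false;
    slots_[used_++] = std::move(s);
    return true;
  }

  std::size_t size() const { return used_; }
  std::size_t freeSlots() const { return Width - used_; }
  const Slot &operator[](std::size_t i) const { return slots_[i]; }

  // Number of operations in this bundle that carry a condition.
  std::size_t predicatedCount() const {
    std::size_t n = 0;
    for (std::size_t i = 0; i < used_; ++i)
      if (slots_[i].guard) ++n;
    return n;
  }

private:
  std::array<Slot, Width> slots_{};
  std::size_t used_ = 0;
};
```

a bundle filled with one plain `add` and one `sub` guarded by `Condition{CondCode::LT, 1, 2}` would then report two used slots, two free slots and one predicated operation.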
a performance-critical issue will be the proper use of predicated execution. if-conversion can either be performed early in the code generation process (exposing larger basic blocks to the optimizers) or deferred until code generation is almost complete.
for the time being, i'm planning to go with the second approach and have a late optimization pass over the selected machine instructions that
a] preliminarily schedules and bundles the selected instructions
b] speculatively executes instructions of a block B in its predecessor block if there are unused resources
c] converts a block B into an appropriately predicated version (eliminating branches) if that is profitable for the particular architecture.
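the steps above can be sketched with a toy cost model. this is only an illustration under loud assumptions: operations are treated as independent (so the bundle count is just ops divided by issue width), the branch costs one slot plus a fixed penalty when taken, and step b] is not modelled separately but folded into the merged bundle count. all names (`Op`, `Block`, `profitableToPredicate`, `ifConvert`) are hypothetical and not part of llvm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Op {
  std::string opcode;
  std::string guard;  // empty = unconditional, else a hypothetical predicate name
};

struct Block {
  std::vector<Op> ops;  // excludes the terminating branch
};

constexpr std::size_t IssueWidth = 4;

// Step a], greatly simplified: number of bundles needed if all ops
// were independent of each other.
std::size_t bundleCount(std::size_t numOps) {
  return (numOps + IssueWidth - 1) / IssueWidth;
}

// Step c], decision part: compare the expected cost of keeping the branch
// around B with the cost of one merged, predicated block.
// skipProb is the assumed probability that the branch skips B.
bool profitableToPredicate(const Block &pred, const Block &b,
                           double skipProb, std::size_t branchPenalty) {
  // Branchy version: predecessor plus its branch, then B only when not skipped,
  // plus the penalty whenever the branch is taken.
  double branchy = static_cast<double>(bundleCount(pred.ops.size() + 1)) +
                   (1.0 - skipProb) * static_cast<double>(bundleCount(b.ops.size())) +
                   skipProb * static_cast<double>(branchPenalty);
  // Predicated version: B's ops always occupy slots, but the branch is gone.
  double predicated =
      static_cast<double>(bundleCount(pred.ops.size() + b.ops.size()));
  return predicated <= branchy;
}

// Step c], transformation part: merge B into its predecessor and guard
// every op that came from B. Assumes B's ops were unguarded before.
Block ifConvert(const Block &pred, const Block &b, const std::string &guard) {
  Block merged = pred;
  for (Op op : b.ops) {
    op.guard = guard;
    merged.ops.push_back(op);
  }
  return merged;
}
```

with a two-op predecessor and a two-op block B, the merged code fits in a single bundle, so the model favours predication; with a 16-op block that is skipped 90% of the time, it keeps the branch. a real pass would of course have to use the actual schedule and dependences rather than raw op counts.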
this approach will require only some minor extensions to the existing class hierarchy to represent instruction bundles and predicates. additionally, it seems more reasonable to decide what to execute conditionally late in the compilation process, when we can easily account for architectural constraints, than to perform if-conversion early and undo it if necessary. also, previous work indicates that this approach retains almost all of the opportunities for if-conversion present in the input program.
from what i've seen so far (i'm not yet fully familiar with llvm), there's almost no existing support for predicated execution. however, architectures such as arm and ia64 should see a significant speedup from it. are people already working on such enhancements, or are there any short-term plans? also, does anybody have strong objections to the approach outlined above? any comments are very welcome!
cheers and thanks in advance,