Alternate instruction sequences

I was wondering, is there any way to express in the IR that an instruction/instruction sequence/basic block/region/function/module/whatever is an alternate version of another? e.g. let's keep things simple and say that I have an instruction I. An optimization pass reads it and could be able to produce one or more functionally-equivalent instructions I1, ..., In without being really able to ascertain which one is "best" w.r.t. the optimization goal. While this could be true even for a single, isolated, pass it's all the more true for chains of optimizations (i.e. 'which one of the alternate instructions will be the "best" after all other optimizations have run?'). In this case, wouldn't it be interesting to be able to insert into the IR all (or some of) the alternate versions, mark them as such and defer the decision about which one will be used after all optimizations have run?
B.r.,
Carlo Alberto

There is not a way to represent --- instruction I1 is an alternative for instruction I2 --- in LLVM IR.

Could there be any interest in this functionality? Do you think bending the meaning of existing constructs like
select i1 undef, <ty> <val0>, <ty> <val1>
(for instructions) or
switch i1 undef, label <bb0>, i1 1, label <bb1>
(for basic blocks) could be feasible/acceptable?

I forgot, the above instructions should be (optionally) marked with metadata so that alternates-aware passes can recognize them, e.g.
select i1 undef, <ty> <val0>, <ty> <val1>, !alternates
switch i1 undef, label <bb0>, i1 1, label <bb1>, !alternates

Hi Carlo, the general policy is to canonicalize IR to some particular
choice. For example, if I1, I2, etc are all equivalent to each other,
then usually one of them will have been chosen as the canonical form
(say I1) and all the others will be turned into I1. If this is not
the best choice for some target, then the code generators for that
target need to transform it into something better for the target, but
this is not at the IR level.

Ciao, Duncan.

If you're interested in this kind of thing, you might want to have a look at
these papers:

Ross Tate, Michael Stepp, Zachary Tatlock, and Sorin Lerner. 2009. Equality
saturation: a new approach to optimization. In Proceedings of the 36th
annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
(POPL '09). ACM, New York, NY, USA, 264-276. DOI=10.1145/1480881.1480915
http://doi.acm.org/10.1145/1480881.1480915

Michael Stepp, Ross Tate, and Sorin Lerner. 2011. Equality-based translation
validator for LLVM. In Proceedings of the 23rd international conference on
Computer aided verification (CAV'11), Ganesh Gopalakrishnan and Shaz Qadeer
(Eds.). Springer-Verlag, Berlin, Heidelberg, 737-742.
Doi: 10.1007/978-3-642-22110-1_59
http://dx.doi.org/10.1007/978-3-642-22110-1_59

The first one talks about such optimizations for Java bytecode, while the
second one is about translation validation for LLVM. However, they use the
same framework, so I'm guessing optimizations for LLVM should also be within
reach. Their framework builds a graph representing a large number of
alternate versions of the same program, and optimization is essentially the
task of selecting a "best" alternative.

This might work for single instructions or simple instruction sequences but doesn't apply to, let's say, regions. As an example, take a loop that could be rewritten (unrolled, tiled, etc.) so that different versions of the same loop have wildly different memory access patterns. In this case, in order to choose the "best" version, the IR-level transform would have to know the details of the cache/memory subsytems of the current arch.

> Hi Carlo, the general policy is to canonicalize IR to some particular
> choice. For example, if I1, I2, etc are all equivalent to each
> other,
> then usually one of them will have been chosen as the canonical form
> (say I1) and all the others will be turned into I1. If this is not
> the best choice for some target, then the code generators for that
> target need to transform it into something better for the target, but
> this is not at the IR level.

This might work for single instructions or simple instruction sequences
but doesn't apply to, let's say, regions. As an example, take a loop
that could be rewritten (unrolled, tiled, etc.) so that different
versions of the same loop have wildly different memory access patterns.
In this case, in order to choose the "best" version, the IR-level
transform would have to know the details of the cache/memory subsytems
of the current arch.

There has been some good work in this general area by Jason Ansel and
collaborators at MIT. They have developed a project called PetaBricks
(which has been released under the MIT license) --
http://projects.csail.mit.edu/petabricks/ -- and I think that this is
going to be a very important technique, at least for numerical
algorithms, in the near future. As there is apparently interest, I think
that we should strongly consider added an "alternatives" feature (on the
block and/or function level) to the LLVM IR in order to facilitate
implementing this kind of approach. I think that it is often true that
only the backend code generator has enough information to choose between
alternative implementations, and so I think this would be an appropriate
addition. If we specify that each alternative must have identical side
effects, then I think that the default (backward-compatible)
implementation is easy: ignore the alternatives.

-Hal

Probably adding an explicit mechanism will be met with considerable resistence because it involves modifying the IR. That's why I was proposing to sidestep the issue by leveraging what the IR already provides.
Unfortunately I haven't heard any suggestion/feedback about the proposed syntax. Should I interpret this lack of comments as an indication of it being a good solution or what? Can I go ahead and make a patch for the docs to formalize the idea?
B.r.
Carlo Alberto