From: "Andrew Trick" < atrick@apple.com >
To: "Hal Finkel" < hfinkel@anl.gov >
Cc: " llvmdev@cs.uiuc.edu List" < llvmdev@cs.uiuc.edu >, "Michele
Scandale" < michele.scandale@gmail.com >
Sent: Friday, February 8, 2013 1:52:55 PM
Subject: llvm.meta (was Rotated loop identification)
As long as this is brainstorming time, I actually like the idea of
an
llvm.invariant intrinsic that the optimizers know to ignore. I
like
it for other purposes, but would happen to work for you as a
temporary workaround. It could take one or two IR values (as
metadata operands) and a metadata language describing the
invariant,
such as a relational operator and optional constant. In your case,
you want to know that the loop counter's starting value is less
than
its limit, so you could conveniently plop one of those in the loop
preheader. The invariant would only go away if no one else used
the
value, which in your case would make sense (e.g. if the loop test
were rewritten in terms of %b, you probably wouldn't need the
invariant any more).
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121210/158601.html
If you're in favor of this approach, can I pop out of the woodwork
and say that it appears that we have a reasonably-large group of
contributors in favor of an invariant intrinsic and we should move
forward on that basis?
A general llvm.invariant does seem like a convenient thing. I'm
not
pushing very hard because I don't need it for anything I'm working
on yet, and wasn't aware that anyone else liked this particular
idea. But I haven't heard any argument against it. I am aware that
a
lot of people want to attach new semantics to the IR that are not
necessarily easy to preserve, so I don't want to oversell a
solution.
I don't have an opinion on whether each type of invariant should
get
its own intrinsic ID or we should have one and use the flexibility
of metadata operands. The important thing is that the optimizer
have
a uniform approach to handling them, and that they truly cannot
interfere with optimization other than as intended. For the sake
of
discussion, and to avoid confusion with the current, badly named
llvm.invariant, I'll just call these llvm.meta intrinsics.
I would only endorse a new llvm.meta approach if we're sure we can
unify or replace most of the current meta-style intrinsics. But I
first need to better understand the motivation for those
approaches,
which I have not personally worked with much (they're somewhat
marginalized).
Thanks for writing this! As you point out there are already a
number of "pseudo-users" in LLVM, and we don't have a good general
scheme for handling them. Currently, special handling for
llvm.dbg, expect, etc. are coded into several (many) different
places to make them appear free, and adding more in a similar way
might become unmanageable. Even worse, these intrinsics break the
"hasOneUser" check, and this interferes with optimization. This
may not be important for debug intrinsics, but certainly is for
features intended to help optimization.
Based on this, it seems that we need to differentiate two classes
of users: real users and, as Chandler called them, ephemeral
users. We could update hasOneUser() to be hasOneUser(bool
countEphemerals = false), or something like that, and the trick
will be to support this without requiring additional iteration
over all users. Thoughts?
Redefining hasOneUser() would be the most efficient approach, and
definitely possible. But it would add complexity to a fundamental
property of the IR and potentially cause bugs when developers forget
to check the meta users.
I was proposing to use meta-data operands in the intrinsic instead.
They add no visible users. They can be left dangling and cleaned up
lazily. WeakVH are expensive, but I'm not worried about cost yet.
The added complexity in this case happens at SSA update, which does
need to visit meta users explicitly. I believe this is currently
broken for dbg.value and works by luck for simple cases. So we need
to fix this one way or another.
I'd still like to be able to chart a way forward on this. Regarding
defining invariants, I don't understand how using metadata
intrinsics really helps because, while a metadata operand could be
used to tie the invariant to a value being constrained, the
invariant expression itself would add additional uses to the values
used in its subexpressions. When I had discussed this with Chandler
(many months ago now), we had developed a working assumption that if
the invariant is worth having, then it is worth carrying as a
dependency of the values it constrains. This may not be true, but I
think is a viewpoint worth evaluating.
Metadata is a better approach because it doesn’t interfere with
optimizations that are free to ignore it.
Yes, if the invariant expression requires IR values and real uses,
then that defeats the purpose. I prefer a meta-data language to the
alternative of ephemeral expressions. I could be convinced either
way. I just think it will be confusing to mix ephemeral expressions
with program logic and impractical to make it transparent to
optimizations. Not the end of the world though, and I’ll listen to
any strong arguments to the contrary.
When you say "metadata language", what do you have in mind? I can think of two general schemes:
1. A language in which each expression in the language is a metadata node; maybe something like:
!invariant1 = !{ !"eq", !{ !"urem", i32 %value, i32 48 }, i32 0 }
2. A language encoded in a string with some syntax for substitution values; maybe something like:
!invariant1 = !{ !"(eq (urem #1, 48), 0)", i32 %value}
On the other hand, when I started down this road I was motivated by a
desire to implement support for __builtin_assume_aligned. I would
still like to do this, or something like this to get equivalent
functionality. The idea of metadata references seems like it might
work well for alignment assumptions, as a restricted special case.
Maybe something like this:
tail call void @llvm.assume.aligned(metadata !{i32* %ptr}, i32 16)
I think this concept is sound except for the problem of enforcing SSA
properties. Do meta-data operands need to dominate? If so, we need
to fix SSA updater to deal with this. Preserving the meta-data after
code duplication would effectively require a meta-phi. We don’t want
a real phi because we don’t want a real use. Admittedly this would
be annoying, but we could always drop the metadata users instead.
You could get by initially without solving this problem, as we’ve
done with dbg.value.
I’m a bit less enthusiastic about meta-intrinsics after realizing
that the only current example, dbg.value, is a poor use of the
concept. It should have been meta-data attached directly to the
value with source line info instead.
What if we say that they don't need to dominate? The dependency graph can certainly be well defined without the vector of IR instructions being a topological ordering of that graph. I understand that we like it this way for a number of reasons, but it is not clear to me that any of those reasons apply in this case. If the metadata intrinsic does not return a value, we can't introduce unwanted cycles. If they do return a value and could introduce a cycle, then we do decide how we would handle that.
If we could demonstrate the advantage of converting lifetime
intrinsics to meta-instrinsics (not real uses), then I’d be all for
it.
but then the problem becomes making sure that these things stick
around long enough to help the interesting inlining cases. Thoughts?
Ensuring meta-data sticks around seems like the dual of the problem
of ensuring ephemeral values and uses don’t interfere with
optimization. I’d rather have the first problem? Is it really
harder?
So, you have two problems:
1) Expressing the invariants. A meta-data language could be invented
for this initially. Or simply constant operands to an alignment
intrinsic as in your example.
2) Associating the invariants with IR values. Options are, in order
from the easiest to implement and sell to the public to the most
complicated but robust:
2a) Attach meta-data to the value. This only works when constraint
dominates all uses. Inlining would need to check before propagating.
2b) Add a real use to a special intrinsic: tail call void
@llvm.assume.aligned(i32* %ptr, i32 16)
2c) Add a meta-use to an intrinsic (pure meta-intrinsic): tail call
void @llvm.assume.aligned(metadata !{i32* %ptr}, i32 16)
I think it’s a question of whether the simpler approaches work well
enough for you.
Is it obvious which of these will be easier to deal with in InstCombine? SROA? -- I think that adding a real use will kill a lot of optimizations that InstCombine might do. On the other hand, updating InstCombine to propagate the invariants might also be quite complicated (this is the benefit of adding a real use... we get the "updating" of the invariants for free, and optimizations are blocked only when they'd kill the invariant (or, at least that's the theory)).
Thanks again,
Hal