The LLVM Module has an optional target triple and target datalayout.
Without them, an llvm::DataLayout can't be constructed with meaningful
data. The benefit of making them optional is that it permits optimizations
that are valid across all possible DataLayouts, and then lets us commit to a
particular one at a later point in time, so that more optimization can be
performed in advance.
This feature is not being used. Instead, every user of LLVM IR in a
portability system defines one or more standardized datalayouts for their
platform, plus shims for making calls to the outside world. The primary
reason for this is that independence from DataLayout is not sufficient to
achieve portability, because DataLayout independence alone says nothing
about ABI lowering constraints. If you have a system that attempts to use LLVM IR in a
portable fashion and does it without standardizing on a datalayout, please
share your experience.
Nick, I don't have a current system in place, but I do want to put
forward an alternate perspective.
We've been looking at doing late insertion of safepoints for garbage
collection. One of the properties that we end up needing to preserve
through all the optimizations which precede our custom rewriting phase is
that the optimizer has not chosen to "hide" pointers from us by using
ptrtoint and integer math tricks. Currently, we're simply running a
verification pass before our rewrite, but I'm very interested long term in
constructing ways to ensure a "gc safe" set of optimization passes.
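To make the "verification pass" idea concrete, here is a minimal sketch in Python that scans textual IR for the two cast opcodes. It is an illustration only, not the pass described above: a real verifier would walk instructions and constant expressions through the LLVM C++ API. The function name and the sample IR are hypothetical, the regex match is crude (a global that happens to be named ptrtoint would be a false positive), and it does not look for pointer/integer punning done through bitcasts.

```python
import re

# Matches the ptrtoint/inttoptr opcodes as whole words. This also catches
# constant-expression uses such as "ptrtoint (i8* @g to i64)".
_CAST = re.compile(r"\b(ptrtoint|inttoptr)\b")


def find_pointer_int_casts(ir_text):
    """Return (line_number, opcode, code) for every cast found in textual IR.

    Hypothetical helper for illustration; assumes ';' only starts comments.
    """
    hits = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        code = line.split(";", 1)[0]  # drop trailing IR comments
        for match in _CAST.finditer(code):
            hits.append((lineno, match.group(1), code.strip()))
    return hits


# Made-up input: @hide converts a pointer to an integer, @clean does not.
SAMPLE_IR = """\
define i64 @hide(i8* %p) {
  %i = ptrtoint i8* %p to i64   ; pointer hidden as an integer
  %j = add i64 %i, 8
  ret i64 %j
}
define i8* @clean(i8* %p) {
  %q = getelementptr i8, i8* %p, i64 8
  ret i8* %q
}
"""

if __name__ == "__main__":
    for lineno, opcode, code in find_pointer_int_casts(SAMPLE_IR):
        print(f"line {lineno}: {opcode}: {code}")
```

Run before the rewrite phase, a non-empty result would be treated as a violation of the "gc safe" invariant.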
As a general rule passes need to support the whole of what the IR can
support. Trying to operate on a subset of IR seems like a losing battle,
unless you can show a mapping from one to the other (i.e., using code
duplication to remove all unnatural loops from IR, or collapsing a function
to having a single exit node).
What language were you planning to do this for? Does the language permit
the user to convert pointers to integers and vice versa? If so, what do you
do if the user program writes a pointer out to a file, reads it back in
later, and uses it?
Java - which does not permit arbitrary pointer manipulation. (Well,
without resorting to mechanisms like JNI and sun.misc.Unsafe. Doing so
would be explicitly undefined behavior though.) We also use raw pointer
manipulations in our implementation (which is eventually inlined), but this
happens after the safepoint insertion rewrite.
We strictly control the input IR. As a result, I can ensure that the
initial IR meets our subset requirements. In practice, all of the opto
passes appear to preserve these invariants (i.e. not introducing inttoptr),
but we'd like to justify that a bit more.
One of the ways I've been thinking about - but haven't actually
implemented yet - is to deny the optimization passes information about
pointer sizing.
Right, pointer size (address space size) will become known to all parts
of the compiler. It's not even going to be just the optimizations,
ConstantExpr::get is going to grow smarter because of this, as
lib/Analysis/ConstantFolding.cpp merges into lib/IR/ConstantFold.cpp. That
is one of the major benefits that's driving this. (All parts of the
compiler will also know endianness, which means we can constant fold
loads, too.)
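For readers less familiar with what a mandatory datalayout would expose, the sketch below pulls the two facts mentioned here (endianness and pointer size) out of a datalayout string. It is a rough Python illustration, not LLVM's parser: the function name is hypothetical, and it only understands the `e`/`E` and basic `p[n]:<size>:<abi>` specifications, ignoring everything else in the string.

```python
def parse_datalayout(layout):
    """Extract endianness and per-address-space pointer sizes (in bits).

    Hypothetical helper: handles only 'e', 'E' and 'p[n]:<size>:...' specs.
    """
    info = {"endian": None, "pointer_bits": {}}
    for spec in layout.split("-"):
        if spec == "e":
            info["endian"] = "little"
        elif spec == "E":
            info["endian"] = "big"
        elif spec.startswith("p"):
            # Form: p[n]:<size>:<abi>[:<pref>]; address space n defaults to 0.
            fields = spec.split(":")
            addrspace = int(fields[0][1:] or 0)
            info["pointer_bits"][addrspace] = int(fields[1])
    return info


if __name__ == "__main__":
    print(parse_datalayout("e-p:64:64:64-i64:64-n8:16:32:64-S128"))
```

With this information always available, any part of the compiler can ask how wide a pointer is in a given address space, which is exactly the knowledge the GC use case was hoping to withhold.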
I would argue that all of the pieces you mentioned are performing
optimizations. However, the exact semantics are unimportant for the
overall discussion.
Under the assumption that an opto pass can't insert a ptrtoint cast
without knowing a safe integer size to use, this seems like it would outlaw
a class of optimizations we'd be broken by.
Optimization passes generally prefer converting ptrtoint and inttoptr to
GEPs whenever possible.
This is good to hear and helps us.
I expect that we'll end up with *fewer* ptr<->int conversions with this
change, because we'll know enough about the target to convert them into
GEPs.
Er, I'm confused by this. Why would not knowing the size of a pointer
cause a GEP to be converted to a ptr <-> int conversion?
Having target data means we can convert inttoptr/ptrtoint into GEPs,
particularly in constant expression folding.
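A toy model may help show why target data matters for this kind of fold. The Python sketch below uses a made-up mini-IR (none of these names are LLVM classes) and rewrites inttoptr(add(ptrtoint(P), C)) into a GEP only when the integer type is known to be exactly pointer-sized. It is an assumption-laden illustration of the principle, not a description of what LLVM's constant folder actually does, and it ignores aliasing and provenance questions entirely.

```python
from collections import namedtuple

# Hypothetical mini-IR for illustration only; these are not LLVM classes.
Ptr = namedtuple("Ptr", "name")                # a pointer value
PtrToInt = namedtuple("PtrToInt", "ptr bits")  # ptrtoint ptr to i<bits>
Add = namedtuple("Add", "lhs const")           # add lhs, <const>
IntToPtr = namedtuple("IntToPtr", "value")     # inttoptr value to i8*
Gep = namedtuple("Gep", "base offset")         # byte-offset GEP from base


def fold_to_gep(expr, pointer_bits):
    """Rewrite inttoptr(add(ptrtoint(P), C)) as Gep(P, C) when safe.

    pointer_bits is the pointer width from the datalayout, or None when no
    datalayout is available.
    """
    if not isinstance(expr, IntToPtr):
        return expr
    add = expr.value
    if not (isinstance(add, Add) and isinstance(add.lhs, PtrToInt)):
        return expr
    cast = add.lhs
    # Without a datalayout we cannot tell whether the integer type preserved
    # every bit of the pointer, so the conversion must stay as written.
    if pointer_bits is None or cast.bits != pointer_bits:
        return expr
    return Gep(cast.ptr, add.const)


if __name__ == "__main__":
    p = Ptr("p")
    expr = IntToPtr(Add(PtrToInt(p, 64), 8))
    print(fold_to_gep(expr, 64))    # folded into a Gep
    print(fold_to_gep(expr, None))  # left alone: pointer size unknown
```

In this model, knowing the pointer size removes a ptr<->int round trip; not knowing it forces the round trip to remain in the IR.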
Or do you mean that after the change conversions in the original input IR
are more likely to be recognized?
My understanding is that the only current way to do this would be to not
specify a DataLayout. (And hack a few places with built-in assumptions.
Let's ignore that for the moment.) With your proposed change, would there
be a clean way to express something like this?
I think your GC placement algorithm needs to handle inttoptr and
ptrtoint, whichever way this discussion goes. Sorry. I'd be happy to hear
others chime in -- I know I'm not an expert in this area or about GCs --
but I don't find this rationale compelling.
The key assumption I didn't initially explain is that the initial IR
couldn't contain conversions. With that added, do you still see concerns?
I'm fairly sure I don't need to handle general ptr <-> int conversions. If
I'm wrong, I'd really like to know it.
So we met at the social and talked about this at length. I'll repeat most
of the conversation so that it's on the mailing list, and also I've had
some additional thoughts since then.
You're using the llvm type system to detect when something is a pointer,
and then you rely on knowing what's a pointer to deduce garbage collection
roots. We're supposed to have the llvm.gcroot intrinsic for this purpose,
but you note that it prevents gc roots from being in registers (they must
be in memory somewhere, usually on the stack), and that fixing it is more
work than is reasonable.
Your IR won't do any shifty pointer-int conversion shenanigans, and you
want some assurance that an optimization won't introduce them, or that if
one does then you can call it out as a bug and get it fixed. I think that's
reasonable, but I also think it's something we need to put forth before
llvm-dev.
Note that pointer-to-int conversions aren't necessarily just the
ptrtoint/inttoptr instructions (and constant expressions); there's also
casting between { i64 }* and { i8* }* and such. Are there legitimate
reasons an optimization would introduce a cast? I think that anywhere in the
mid-optimizer, conflating integers and pointers is only going to be bad for
both the integer optimizations and the pointer optimizations.
It may make sense as part of lowering -- suppose we find two allocas, one
i64 and one i8*, and find that their lifetimes are distinct, and i64 and i8*
are the same size, so we merge them. Because of how this would interfere, I
don't think this belongs anywhere in the mid-optimizer; it would have to
happen late, after lowering. That suggests that there's a point in the pass
pipeline where the IR is "canonical enough" that this will actually work.
Is that reasonable? Can we actually guarantee that any pass which
would break this goes after a common gc-root insertion spot? Do we need
(want?) to push back and say "no, sorry, make GC roots better instead"?
Nick