Essentially, I think target-independent optimizations are still
attractive, but we might want to just force them to go through an actual
target-implemented API to interpret the scopes rather than making the
interpretation work from first principles. I just worry that the targets
are going to be too different and we may fail to accurately predict future
If we have a target-implemented API, then just opaque numbers should also
be sufficient, right? For the API, all we care about is queries that
interesting optimizations will want answered from the target. This could be
at the instruction level: "is it okay to remove this atomic store with
scope n1 that is immediately followed by an atomic store with scope n2?" Or
it could be at the scope level: "does scope n2 include scope n1?"
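
To make the two kinds of query concrete, here is a rough C++ sketch of what such a target-implemented API might look like. Everything in it is hypothetical: `TargetScopeInfo`, `scopeIncludes`, `canRemoveOverwrittenStore` and `ToyTargetScopeInfo` are names invented for illustration and are not existing LLVM interfaces. The default removal policy is also only an assumption; a real target may need a stricter rule, and the sketch ignores memory orderings and aliasing entirely.

```cpp
// Hypothetical sketch only; none of these names exist in LLVM.
using ScopeID = unsigned; // opaque to the target-independent optimizer

class TargetScopeInfo {
public:
  virtual ~TargetScopeInfo() = default;

  // Scope-level query: "does scope Outer include scope Inner?"
  virtual bool scopeIncludes(ScopeID Outer, ScopeID Inner) const = 0;

  // Instruction-level query: may an atomic store with scope First be removed
  // when it is immediately followed by an atomic store with scope Second?
  // Assumed policy for illustration: only if Second includes First.
  virtual bool canRemoveOverwrittenStore(ScopeID First, ScopeID Second) const {
    return scopeIncludes(Second, First);
  }
};

// A made-up target whose scopes happen to be linearly ordered:
// 0 = single thread, 1 = work-group, 2 = system.
class ToyTargetScopeInfo : public TargetScopeInfo {
public:
  bool scopeIncludes(ScopeID Outer, ScopeID Inner) const override {
    return Outer >= Inner;
  }
};
```

The optimizer only ever calls the two queries and never compares the numbers itself, so a target with a non-linear arrangement of scopes could override `scopeIncludes` without any target-independent code changing.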
I think it is significantly more friendly (and easier to debug mistakes) if
the textual IR uses human readable names. We already have a hard time due
to the totally opaque nature of address spaces -- there are magical address
spaces for segment stuff in x86.
The strings are only opaque to the target-independent optimizer. While
integers and strings are equally friendly to the code in the target,
strings are significantly more friendly to humans reading the IR.
The other advantage is that it makes it much harder to accidentally write
code that relies on the particular values for the integers. =]
I think the "strings" can be made relatively clean.
What I'm imagining is something very much like the target-specific
attributes which are just strings and left to the target to interpret, but
are cleanly factored so that the strings are wrapped up in a nice opaque
attribute that is used as the sigil everywhere in the IR. We could do this
with metadata, and technically this fits the model of metadata if we make
the interpretation of the absence of metadata be "system". However, I'm
quite hesitant to rely on metadata here as it hasn't always ended up
working so well for us. ;]
Metadata was the first thing to be considered internally at AMD. But it
was quickly shot down because the Research guys were unwilling to accept
the possibility of scope being lost and replaced by a default "system"
scope. Current models are useful only when all atomic accesses for a given
location use the same scope throughout the application, i.e., all threads
running on all agents. So it is not okay for the compiler to "promote" the
scope in just one kernel unless it has access to the entire application;
the result is undefined. This is true for OpenCL source as well as HSAIL
target. This may change in the near future:
HRF-Relaxed: Adapting HRF to the complexities of industrial heterogeneous
But even then, it will be difficult to say if the same models can be
applied to heterogeneous systems that don't resemble OpenCL or HSAIL.
Yea, I'm not really surprised by this.
I'd be interested in your thoughts and others' thoughts on how we might
encode an opaque string-based scope effectively. If we can find a
reasonably clean way of doing it, it seems like the best approach at this
- It ensures we have no bitcode stability problems.
- It makes it easy to define a small number of IR-specified values like
system/crossthread/allthreads/whatever and singlethread, and doing so isn't
ever awkward due to any kind of baked-in ordering.
- In practice in the real world, every target is probably going to just
take this and map it to an enum that clearly spells out the rank for their
target, so I suspect it won't actually increase the complexity of things
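
As an illustration of the last bullet, a target might map the opaque strings onto an enum roughly like this. It is a minimal sketch assuming C++17; the enum `ToyScope`, the functions `parseToyScope` and `toyScopeIncludes`, and every scope name other than "system" and "singlethread" are made up for the example.

```cpp
#include <optional>
#include <string_view>

// Hypothetical target-side enum; the enumerator order spells out the rank,
// from narrowest to widest scope.
enum class ToyScope : unsigned {
  SingleThread = 0,
  Wavefront,
  Workgroup,
  Agent,
  System
};

// Map the string found in the IR to the target's enum.
// Returns std::nullopt for a name this target does not know.
inline std::optional<ToyScope> parseToyScope(std::string_view Name) {
  if (Name == "singlethread") return ToyScope::SingleThread;
  if (Name == "wavefront")    return ToyScope::Wavefront;
  if (Name == "workgroup")    return ToyScope::Workgroup;
  if (Name == "agent")        return ToyScope::Agent;
  if (Name == "system")       return ToyScope::System;
  return std::nullopt;
}

// With a linear rank, inclusion is a plain comparison inside the target.
inline bool toyScopeIncludes(ToyScope Outer, ToyScope Inner) {
  return static_cast<unsigned>(Outer) >= static_cast<unsigned>(Inner);
}
```

The ordering lives only in the target's enum, so nothing about it is baked into the IR or the bitcode.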
I seem to be missing something here about the need for strings. If they
are opaque anyway, and they are represented by sigils, then the sigils
themselves are all that matter, right? Then the encoding is just a number...
See above for why I'd prefer not to use a raw number in the IR.
But while the topic is wide open, here's another possibly whacky
approach: we let the scopes be integers, and add a "scope layout" string
similar to data-layout. The string encodes the ordering of the integers. If
it is empty, then simple numerical comparisons are sufficient. Else the
string spells out the exact ordering to be used. Any known current target
will be happy with the first option. If some target inserts an intermediate
scope in the future, then that version switches from empty to a fully
specified string. The best part is that we don't even need to do this right
now, and only come up with a "scope layout" spec when we really hit the
problem for some future target.
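
A sketch of how such a "scope layout" string could be interpreted, purely for illustration. No format has been proposed in this thread, so the one used here (a comma-separated list of scope integers from narrowest to widest, with no whitespace) is an assumption, as is the class name `ScopeLayout`.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "scope layout": an empty spec means plain numerical ordering;
// otherwise the spec lists the scope integers from narrowest to widest,
// e.g. "0,1,3,2" if scope 3 was later inserted between scopes 1 and 2.
class ScopeLayout {
  std::vector<unsigned> Order; // empty means "compare the integers directly"

public:
  explicit ScopeLayout(const std::string &Spec) {
    std::istringstream In(Spec);
    std::string Tok;
    while (std::getline(In, Tok, ','))
      Order.push_back(static_cast<unsigned>(std::stoul(Tok)));
  }

  // Does scope Outer include scope Inner? Scopes missing from a non-empty
  // layout are treated conservatively: the answer is false.
  bool includes(unsigned Outer, unsigned Inner) const {
    if (Order.empty())
      return Outer >= Inner;
    auto O = std::find(Order.begin(), Order.end(), Outer);
    auto I = std::find(Order.begin(), Order.end(), Inner);
    if (O == Order.end() || I == Order.end())
      return false;
    return O >= I; // a later position in the list means a wider scope
  }
};
```

With an empty spec, `ScopeLayout("").includes(2, 1)` reduces to the numerical comparison; with `"0,1,3,2"`, scope 2 includes scope 3 even though 2 < 3.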
This isn't a bad approach, but it seems even more complex. I think I'd
rather go with the fairly boring one where the IR just encodes enough data
for the target to answer queries about the relationship between scopes.
I am not really championing scope layout strings over a target-implemented
API, but it seems less work to me rather than more. The relationship
between scopes is just a strict weak ordering (SWO), and it can be represented as a graph. A
practical target will have a very small number of scopes, say not more than
16. It should be possible to encode this into a graphviz-style string. Then
instead of having every target implement an API, they just have to specify
the relationship as a string.
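
For what it is worth, here is one way the graph idea could be sketched. The string format is invented for the example (semicolon-separated edges of the form `inner->outer`, no whitespace), and `ScopeGraph` and the scope names are hypothetical; inclusion is answered by a simple reachability search, which is cheap for a graph of at most a dozen or so scopes.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical graphviz-style scope description: "a->b;b->c" means scope a
// is included in scope b, and b in c. Inclusion is reflexive and transitive.
class ScopeGraph {
  std::map<std::string, std::vector<std::string>> Wider; // inner -> outers

public:
  explicit ScopeGraph(const std::string &Spec) {
    std::size_t Pos = 0;
    while (Pos < Spec.size()) {
      std::size_t End = Spec.find(';', Pos);
      if (End == std::string::npos)
        End = Spec.size();
      std::string Edge = Spec.substr(Pos, End - Pos);
      std::size_t Arrow = Edge.find("->");
      if (Arrow != std::string::npos) // silently skip malformed edges
        Wider[Edge.substr(0, Arrow)].push_back(Edge.substr(Arrow + 2));
      Pos = End + 1;
    }
  }

  // Does scope Outer include scope Inner? Walk the edges upward from Inner.
  bool includes(const std::string &Outer, const std::string &Inner) const {
    std::set<std::string> Seen;
    std::vector<std::string> Work{Inner};
    while (!Work.empty()) {
      std::string Cur = Work.back();
      Work.pop_back();
      if (Cur == Outer)
        return true;
      if (!Seen.insert(Cur).second)
        continue; // already visited
      auto It = Wider.find(Cur);
      if (It != Wider.end())
        Work.insert(Work.end(), It->second.begin(), It->second.end());
    }
    return false;
  }
};
```

For example, given `"wavefront->workgroup;workgroup->agent;agent->system"`, the query `includes("system", "wavefront")` follows three edges and succeeds, while the reverse query fails.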
I see where you're going here, and it sounds feasible, but it honestly
seems much *more* work and certainly more complex for the IR. We can always
add such a representation to communicate the relationships if it becomes
important, but I'd rather communicate via a boring target API to start with