Hi all,
A while ago there was a discussion on changing the current “set of trees” representation of TBAA metadata to be more expressive, prompted by the need to support C structs. Dan Gohman also talked about the issue here: http://llvm.org/devmtg/2012-11/Gohman-AliasAnalysis.pdf. It was suggested that the trees be replaced by a type DAG then. While working on this compiler, I ended up using an undirected graph to represent aliasing instead, I believe it might be suitable for TBAA’s purposes as well, for the following reasons.
- Does the graph need to be acyclic? Consider these struct types:
struct a { type1 x; type2 y }
struct b { type2 y; type3 z }
struct c { type3 z; type1 x }
They form the following alias graph (sorry about the crappy ascii art):
a → b → c -+
^______________|
Which won’t be representable if we force our tbaa graph to be acyclic.
- Does it need to be directed? If we use a directed graph, then I suppose we’d be relying on “reachability” in both directions to find out if something aliases something else:
±-> char <–+
float int
^ ^
___ MyClass __|
To answer “does myclass alias char?”, we’d need to check (myclass is reachable from char || char is reachable from myclass), similar to how two things alias if they are ascendant/descendant of each other in the current tree representation. Since we’re going both ways, I think it makes sense to just reify the reachability edges like this (note the lack of direction, now to check if two things alias, we simply ask if there is an edge connecting them):
±-- char —+
float | int
___ MyClass __|
This can be represented with a (pretty dense) adjacency matrix and queried in constant time, so I thought it might be faster.
From what I can gather without delving deep into the codebase, it’s easiest right now to change from the “set of trees” representation to a directed graph, but I don’t think changing to an undirected graph would be much harder (please correct me if I’m wrong). I’d like to try and tackle the implementation as well, if that’s ok.
Thoughts?
Thanks!