ConditionalOperator::setCond is gone

Hi @clang

for in-place AST transformations of my source-to-source transformation tool I use the setXXX member functions of the various expression classes (e.g setLHS, setRHS etc.) quiet heavily. Now I had to notice that ConditionalOperator::setCond was removed without any appropriate replacement. Is there any reason behind that decision? For the moment I have re-added it locally but if there is a general tendency to remove the setXXX member functions then I better start to refactor my code early (however I would argue against a removal beforehand).

Best regards
Olaf Krzikalla

Yes, we have been removing setXXX member functions, because mutating an AST in-place is extremely dangerous. AST mutations look local, but it's very easy for them to have non-local effects that break the invariants of the AST.

Plus, immutable ASTs are far better when dealing with serialized representations of ASTs, because coping with in-place modifications to ASTs when serializing again is *extremely* complicated. See, for example, the chained PCH work in the AST reader/writer.

  - Doug

Hi,

Yes, we have been removing setXXX member functions, because mutating an AST in-place is extremely dangerous.

Hmm that sounds really nasty. If all setXXX member functions are gone, then there are no means to efficiently manipulate an AST, aren't there?

AST mutations look local, but it's very easy for them to have non-local effects that break the invariants of the AST.

Plus, immutable ASTs are far better when dealing with serialized representations of ASTs, because coping with in-place modifications to ASTs when serializing again is *extremely* complicated. See, for example, the chained PCH work in the AST reader/writer.

Until now I hadn't any problems, but I didn't serialize manipulated ASTs but rather rewrite them only. The question is whether it is worth to sacrifce options for safety. If you take into account that I am a C++ - programmer, then you know the answer :slight_smile:

Best regards
Olaf Krzikalla

It's bad library design to write a library that can't maintain its own invariants, but that's exactly the problem with ConditionalOperator::setCond() and the many other mutators in the AST. I'm fine with having some way of mutating the AST, if it preserves invariants.

  - Doug

OK, agreed. So please can you tell me which invariants in particular are violated by inplace mutations. Maybe we can find a satisfying solution.

Best Olaf

Several things come to mind immediately:

  1) There's *no* checking that the types of the replacement expressions actually make sense within their node. If the type changes
  2) If any temporaries are contained in the new subexpression (or were in the old subexpression that was replaced), those temporaries need to be bound and noted in an outer expression
  3) A number of properties, including type- and value-dependent, whether there is a parameter pack in the subexpression, etc., are incrementally computed. If replacing a subexpression changes any of those, you need to propagate that change upward.

  - Doug

Given the ongoing discussion in "AST transformations" there is the need for a transformation capability. After thinking about the points made there, the points made here, the experience collected during 1.5 years transforming ASTs and also one of your points that lead to the removal of the AST XML printer I came to the conclusion, that the clang AST in its current form is not an abstract syntax tree (with emphasis on syntax!) anymore. Not only the nodes are augmented with semantic stuff, but some nodes by its own are there for semantic reasons only (ImplicitCastExpr springs to mind and indeed that node causes some trouble from time to time).
So what we would need is a pure syntax tree with the taxonomy of the current AST and the source locations only. Nothing more (e.g. not even return types for expressions). Such a tree could also be subject of XML streaming, which then only represents pure syntax (omitting e.g. references to types, which are clang-AST specific) but nothing more.
However, I don't consider the introduction of yet another layer containing a bunch of new classes mirroring the AST taxonomy a good solution.
A solution I could like:

class MutableASTNode
{
   // the clang AST node which is represented by this node:
   Stmt* origin;

   // children of this node, interpretation of the index
   // depends on the actual type of the origin
   // (which can be computed via RTTI)
   SmallVector<MutableASTNode*, 4> children;

   // this courtesy could be introduced
   MutableASTNode* parent;

public:
   // as minimal as possible:
   // node type, source range, children manipulation,
   // rewrite functionality (maybe including XML)
};

Unfortunately such a solution has its own drawbacks. E.g. it is not possible to use the current rewriter for MutableASTNode, as the rewriter traverses the original AST. Same is true for all the static analyzer functionality.
So maybe there is another idea?

Best Olaf