Limits on AST depth and stack usage?

This program creates an AST of depth 10000:

template <class T, T…V> struct seq {

constexpr bool zero() { return (true && … && (V == 0)); };

constexpr unsigned N = 10000;
auto x = __make_integer_seq<seq, int, N>{};

static_assert(!, “”);

On my machine, clang-10 parses this successfully (slowly!) but crashes if I increase to 100000.
I suppose -ftemplate-depth doesn’t apply as there’s no actual recursion here (just fold-expressions), but I feel like some implementation limit should apply: clang relies heavily on being able to traverse the AST recursively, and will inevitably crash if the tree is too tall.

Should there be a limit here? It seems tempting to say one of:
-ftemplate-depth should constrain either the number of arguments a pack can represent
-ftemplate-depth should constrain the size of pack that can be unfolded
-some limit should constrain the overall AST height

(doh, obviously that first “constexpr” shouldn’t be there)

RecursiveASTVisitor::TraverseStmt has a DataRecursionQueue *Queue in its signature to prevent stack overflow by default (as Richard once helpfully explained to me). ASTMatchers uses the DataRecursionQueue, so that searching for matches in Stmt children should never overflow the stack, I believe (though it does confuse users who expect a more orderly traversal…but that’s another matter).

But memoizedMatchesAncestorOfRecursively does not use data recursion, as you’ve noted. So in other words it is using data recursion when traversing descendants of deeply nested Exprs, but not when traversing the ancestors of those at the bottom.

So, I think it could just be a limited problem with a simple solution: there should be a non-recursive, loop-based version of memoizedMatchesAncestorOfRecursively specific to traversing the ancestors of Stmts/Exprs, mirroring what RecursiveASTVisitor::TraverseStmt does.

Whether there are more general issues with e.g. evaluating extremely deep ASTs, is beyond my own depth, though…


Since you asked about more general issues:

In general terms, it would be nice if we could accept arbitrarily-deep ASTs. We make some effort to do so, with three major efforts (in addition to expecting 8MB of stack space in each thread that might run Clang):

  1. Data recursion. RecursiveASTVisitor does this by default, and the constant evaluator does it when evaluating integer-valued expressions. But this is limited: you only get it where you apply it, and RAV’s automatic data recursion turns itself off when it thinks the difference might be observable in subclasses.
  2. Minimizing stack space during recursion. For example, during “regular” function template instantiation at the end of the translation unit, we try to keep the size of the recursive frame as small as possible.
  3. The “runWithSufficientStackSpace” mechanism that tries to check that “enough” stack space is available and pops out a diagnostic (and as a slow-but-correct fallback, allocates a new stack) if we’re getting close to running out.

Applying these techniques (largely in the above order) should be encouraged, and if we can make memoizedMatchesAncestorOfRecursively do so, that seems like a good directed approach for the local problem. But it’s unlikely that we’ll ever cover enough ground here to consistently prevent ever running out of stack in all cases – not unless we fully move away from recursion as an implementation technique, which seems implausible, or add “runWithSufficientStackSpace” calls to every call graph SCC, which seems expensive. So this is currently done pragmatically: we fix stack usage issues as we see them become a problem. And in the remaining cases, we crash :frowning: Not the end of the world for a compiler, but not great for more or less any other user of Clang as a library.

For standards-conformance, our backstop here are so-called “implementation limits” – we’re permitted to bail out if things get too complex, and we largely get to define what that means. C has some minimal requirements here – for example, that we can support at least one program with 63 levels of nested parentheses; C++ has only suggestions and no hard requirements. But in either case, if our limits are violated, there are no constraints on our behavior; we’re allowed to crash or whatever. So that puts us in quality-of-implementation land.

Now, we do have some defined implementation limits (and there has been some work done towards improving and formalizing them: In the specific case of expanding a fold-expression, the relevant implementation limit is LangOptions::BracketDepth, which limits the total number of parentheses, braces, or square brackets that can be open at once, but that limit is currently only applied when parsing, not during template instantiation. So in this specific case, as a bare minimum, I think we’re missing a check for that language option during fold-expression expansion (each level of a fold-expression introduces a set of parentheses, so even a pedantic interpretation of the rules has us covered reusing the same limit here).

As for limiting the size of packs in general: I think it might make sense to impose a limit – not to avoid excessive recursion; we should have other mechanisms to deal with that – but to avoid huge memory usage during compilation due to forming overly-large packs. Eg, passing a very large integer to the __make_integer_seq builtin can easily cause Clang to fall into a hole from which it will never escape, assuming you don’t run out of address space and just crash.

Setting an implementation limit on the overall depth of the AST could be interesting, if there’s a way we can practically do it. This seems like a hard problem – there are just too many different places and ways in which AST nodes are created, many of them aren’t equipped to cope with the operation failing in some kind of gracefully-recoverable way, and just keeping track of the AST depth without paying a code complexity and memory usage cost seems like a challenge – and we’d need to be sure the cost of solving it is worth the benefit.

Thanks very much both, the data recursion was the biggest thing I was missing, but all the extra details are very helpful.
(Richard: I might try paraphrase the “3 techniques” section of your email into some documentation if that’s OK)

It sounds like the low-hanging fruit here is:
A) make BracketDepth limit foldexprs (to 256 by default)
B) have ASTMatchFinder use explicit iteration rather than recursion when walking back up the tree (similar to data recursion)

C) heap-allocate TemplateArgumentLocInfo::Template (introducing a discriminator enum in the low bits)

Haojian and I will take these on, I think we understand well enough what needs to be done.

Any one of these would fix our clang-tidy problems. Side benefits: A should make tools safer by limiting ways to create huge ASTs, and C will mean DynTypedNote is 40 bytes rather than 56 now and 32 before, reduce overall memory usage, and let us add a few member-of-union asserts while there.

As for limiting the size of packs in general: I think it might make sense to impose a limit – not to avoid excessive recursion; we should have other mechanisms to deal with that – but to avoid huge memory usage during compilation due to forming overly-large packs.

This makes sense, I might not pick this up though. I don’t have much idea sense of the programs likely to be affected (large packs that are neither recursively expanded nor unfolded…)