Recursion in Clang FE

Hello

I build Clang FE as a shared library and link with other application. That application should never crash with any sources given for compilation.

I have a test based on this one https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/gcc.c-torture/compile/limits-pointer.c . But PTR4 is changed to PTR5 at the last line. I run the test with the following command line:

clang.exe -c C:\work\tmp\limits-pointer.c

Then clang crashed with stack overflow exception caused by very deep recursion during preprocessing. And that is not the only one case.

I have noticed that there are many recursive calls in Clang’s source code. And depth of recursion in many cases depends on the source code which is being compiled.

Is it possible to eliminate all such recursions in Clang?

If not, how can I workaround (if it is possible) stack overflow ?

Best regards.

Alexey Sotkin.

It is (typically) complicated to simply replace recursion in compilers with non-recursive variants, as that means extra code to store and restore the state, which is further complicated by the fact that there are so many different calls that can follow any particular call - in other words, the possible things that may follow a particular point in the code are many. It’s relatively easy to remove recursion in a “search the best path in a maze” or “find the right number to solve the Sudoku puzzle with” because, essentially, the same steps for each recursion level, except for “which of the possible paths” or “which of the possible numbers”. In a compiler, you can have MANY different paths from each particular location, and the information you need to save is quite different when you are in the middle of a struct compared to in the middle of a switch statement - but someone can declared a struct inside a switch that is inside a class declaration, and you can have a class declaration inside a struct. So using the call-stack makes life a lot easier to figure out “now what do we do” when you get back from the little excursion of parsing a particular construct [e.g. the struct inside the case of a switch].

However, it is probably sane to check the nesting level of calls in some way, and aborting the parsing if the nesting level is too high. Having one million levels of pointer indirection is quite unlikely to be a sensible program of any sort, so it’s perfectly OK to say “Sorry , I can’t compile this”.

I’m not sure how or where this should be applied, but C++ does allow this to be “almost automatic” by using a static member variable that is counted up on entry and down on leaving a function, and adding a test that confirms that "we can cope with X recursive calls, which should be enough for any sane implementation]

There’s similar limit for template recursion depth.