Endless operator-> chain causing infinite loop

Hi all,

clang version 3.4 (192772)

This is with respect to the following gcc testsuite TC:

template< int n >
struct a {
a< n+1 > operator->()
{
return a< n+1 >();
}
};

int main() {
a<0>()->x;
}

This TC goes into an infinite loop when compiled. Ideally it should throw the error
recursive template instantiation exceeded maximum depth of 256.

On initial investigation I found that when the constructor
Sema::InstantiatingTemplate::
InstantiatingTemplate(Sema &SemaRef, SourceLocation PointOfInstantiation,
Decl *Entity,
SourceRange InstantiationRange);

is run on each recursive object creation,
the first thing the constructor does is check whether the recursive instantiation depth is reached or not by calling the function

Invalid = CheckInstantiationDepth(PointOfInstantiation,
InstantiationRange);

The above function checks whether the size of SemaRef.ActiveTemplateInstantiations(a container which stacks all the template instantiations originating from a particular PointOfInstantiation) is within the limit as specified by templateinstantiationdepth(256 by default).

So far, so good.

Now when CheckInstantiationDepth returns false, the constructor pushes the current Inst using the following statement:
SemaRef.ActiveTemplateInstantiations.push_back(Inst);

Also the push_back function correctly increments the EndX value.

So ideally the size of SemaRef.ActiveTemplateInstantiations should increase from 1 to 2 to 3 to …256 and than the error should get printed.

But, the EndX value which was incremented in the push_back function call is no longer reflected

in the size computation done as part of the function call CheckInstantiationDepth(PointOfInstantiation,
InstantiationRange);.

i.e SemaRef.ActiveTemplateInstantiations.size() always returns zero in the function CheckInstantiationDepth.

My question is where is the EndX value getting reset after it was rightly incremented in the push_back function call?

Am I missing something in my analysis above? Any help on the same would be appreciated.

Thanks,

Rahul

Hi all,

clang version 3.4 (192772)

This is with respect to the following gcc testsuite TC:

template< int n >
struct a {
    a< n+1 > operator->()
        {
        return a< n+1 >();
        }
};

int main() {
    a<0>()->x;
}

This TC goes into an infinite loop when compiled. Ideally it should throw
the error
recursive template instantiation exceeded maximum depth of 256.

On initial investigation I found that when the constructor
Sema::InstantiatingTemplate::
InstantiatingTemplate(Sema &SemaRef, SourceLocation PointOfInstantiation,
                      Decl *Entity,
                      SourceRange InstantiationRange);

is run on each recursive object creation,
the first thing the constructor does is check whether the recursive
instantiation depth is reached or not by calling the function

Invalid = CheckInstantiationDepth(PointOfInstantiation,
                                    InstantiationRange);

The above function checks whether the size of
SemaRef.ActiveTemplateInstantiations(a container which stacks all the
template instantiations originating from a particular PointOfInstantiation)
is within the limit as specified by templateinstantiationdepth(256 by
default).

So far, so good.

Now when CheckInstantiationDepth returns false, the constructor pushes the
current Inst using the following statement:
SemaRef.ActiveTemplateInstantiations.push_back(Inst);

Also the push_back function correctly increments the EndX value.

So ideally the size of SemaRef.ActiveTemplateInstantiations should
increase from 1 to 2 to 3 to .....256 and than the error should get printed.

But, the EndX value which was incremented in the push_back function call
is no longer reflected
in the size computation done as part of the function call
CheckInstantiationDepth(PointOfInstantiation,
                                    InstantiationRange);.

i.e SemaRef.ActiveTemplateInstantiations.size() always returns zero in the
function CheckInstantiationDepth.

My question is where is the EndX value getting reset after it was rightly
incremented in the push_back function call?

Am I missing something in my analysis above? Any help on the same would be
appreciated.

Assuming there's no memory corruption, etc, presumably there's a symmetric
"pop_back" call to echo the push_back (I'd hope that the symmetry is in the
same object - some sort of RAII or otherwise scoped situation) you'd want
to find. If it's not discoverable via code inspection, you could set a data
watchpoint in your debugger to try to find it.

- David

Thanks David, will check out and get back with further analysis.

Thanks,
Rahul

Hi all,

clang version 3.4 (192772)

This is with respect to the following gcc testsuite TC:

template< int n >
struct a {
    a< n+1 > operator->()
        {
        return a< n+1 >();
        }
};

int main() {
    a<0>()->x;
}

This TC goes into an infinite loop when compiled. Ideally it should throw
the error
recursive template instantiation exceeded maximum depth of 256.

That's not the right behavior; there's no recursive template instantiation
here. Each operator-> is instantiated from within the context of 'main',
not from within some other instantiation.

If we want to limit this, we should put a limit on the number of times we
go around the loop looking for an overloaded operator->
in Sema::ActOnStartCXXMemberReference. However, I've seen people use this
in practice in template metaprogramming to get around the recursive
template instantiation depth, so this might break existing code.

On initial investigation I found that when the constructor

Hi Richard,

Just checked, and g++-4.7.2 doesn't give a 'recusive template instantiation' error.
The error is:

error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum) instantiating ‘a<(n + 1)> a<n>::operator->() [with int n = 900]’

So, g++ agrees with you. They are just limiting the depth of template instantiation, without
relying on the standard definition of recursive template instantiation. If they did not emit
this error, then the compiler would eventually run out of resources and crash. No compiler
can compile that code as written.

In the standard at [temp.inst]

<14.7.1 note d>
There is an implementation-defined quantity that specifies the limit on the total depth
of recursive instantiations, which could involve more than one template. The result of an
infinite recursion in instantiation is undefined.

[Example:
      template<class T> class X {
          X<T>* p; // OK
          X<T*> a; // implicit generation of X<T> requires
                        // the implicit instantiation of X<T*> which requires
                        // the implicit instantiation of X<T**> which ...
      };
— end example ]
</14.7.1 note d>

enjoy,
Karen

>
> >
> > Hi all,
> >
> > clang version 3.4 (192772)
> >
> > This is with respect to the following gcc testsuite TC:
> >
> > template< int n >
> > struct a {
> > a< n+1 > operator->()
> > {
> > return a< n+1 >();
> > }
> > };
> >
> > int main() {
> > a<0>()->x;
> > }
> >
> >
> > This TC goes into an infinite loop when compiled. Ideally it should
throw
> > the error
> > recursive template instantiation exceeded maximum depth of 256.
> >
>
> That's not the right behavior; there's no recursive template
instantiation
> here. Each operator-> is instantiated from within the context of 'main',
> not from within some other instantiation.
>
> If we want to limit this, we should put a limit on the number of times we
> go around the loop looking for an overloaded operator->
> in Sema::ActOnStartCXXMemberReference. However, I've seen people use this
> in practice in template metaprogramming to get around the recursive
> template instantiation depth, so this might break existing code.

Hi Richard,

Just checked, and g++-4.7.2 doesn't give a 'recusive template
instantiation' error.
The error is:

error: template instantiation depth exceeds maximum of 900 (use
-ftemplate-depth= to increase the maximum) instantiating ‘a<(n + 1)>
a<n>::operator->() [with int n = 900]’

This is their recursive template instantiation error. I'm not sure why they
chose to diagnose this problem in that way; there is no recursive
instantiation here. The loop is in [over.ref](13.5.6)p1:

"An expression x->m is interpreted as (x.operator->())->m for a class
object x of type T if T::operator->() exists and if the operator is
selected as the best match function by the overload resolution mechanism."

This rule gets applied repeatedly, outside of any template instantiation.

In the standard at [temp.inst]

<14.7.1 note d>
There is an implementation-defined quantity that specifies the limit on
the total depth
of recursive instantiations, which could involve more than one template.
The result of an
infinite recursion in instantiation is undefined.

This is not what's happening in this case. Templates are mainly relevant to
this problem because they make it much more likely to happen (although,
since we already catch operator-> cycles, templates are the only way we can
have a genuinely infinite sequence of operator-> calls). If we had
generated code that looked like this:

struct T { int n; };
struct A0 { T *operator->(); };
struct A1 { A0 operator->(); };
struct A2 { A1 operator->(); };
// ...

A1000 a;
int k = a->n;

... we'd hit into exactly the same resource limitations as with the
templated code, and the same limit should presumably apply here, whatever
it is. (GCC strangely accepts this code while rejecting the corresponding
templated code.)

Hi Richard,
I understand it is not what is happening. I just included the <14.7.2 note 15>
for clarity. I think the g++ error message is crystal clear. And the practicality
of it is obvious.

Thanks for your comments.

enjoy,
Karen

Hi Richard / Karen,

Thanks for making things clearer.

Now how do we go about handling this sort of behavior?

Do we throw an error message just like g++ does? Wont it be better to print an error message instead of an infinite loop?

Also if you could please elaborate a bit on how we can “limit the number of times we go around the loop looking for an overloaded operator-> in Sema::ActOnStartCXXMemberReference” ?

Thanks,

Rahul

Hi Richard / Karen,

Thanks for making things clearer.
Now how do we go about handling this sort of behavior?

Do we throw an error message just like g++ does? Wont it be better to
print an error message instead of an infinite loop?

We don't know the loop is infinite, but we should probably give up after
some number of iterations.

Also if you could please elaborate a bit on how we can "limit the number
of times we go around the loop looking for an overloaded operator->
in Sema::
ActOnStartCXXMemberReference" ?

Add a counter to the "while (BaseType->isRecordType())" loop. Bail out with
a diagnostic if it hits the limit.

Hi Richard / Karen,

Thanks for making things clearer.
Now how do we go about handling this sort of behavior?

Do we throw an error message just like g++ does? Wont it be better to
print an error message instead of an infinite loop?

We don't know the loop is infinite, but we should probably give up after
some number of iterations.

Also if you could please elaborate a bit on how we can "limit the number
of times we go around the loop looking for an overloaded operator->
in Sema::
ActOnStartCXXMemberReference" ?

Add a counter to the "while (BaseType->isRecordType())" loop. Bail out
with a diagnostic if it hits the limit.

Pedantry, just for fun - should this be added to the implementation limits
in the C++ standard?

- David

Hi Rahul,

Just a couple comments. I think consideration should be given to implementing a
command line option to override the limit. If some group knows what they are doing
and have the resources to support a higher limit, then the compiler ought not to
preclude them from doing so.

I did give all this some thought last evening. Interesting subject. One thing is the
gcc implementation is limiting a resource. And so there is no chance of unintended
side effects. In limiting the idiom, there could potentially be unintended side effects.
I'd make the limit as high as reasonably possible. And the commandline override seems
prudent.

enjoyed all the comments. food for thought.

thanks,
Karen

Hi Richard / Karen,

Thanks for making things clearer.
Now how do we go about handling this sort of behavior?

Do we throw an error message just like g++ does? Wont it be better to
print an error message instead of an infinite loop?

We don't know the loop is infinite, but we should probably give up after
some number of iterations.

Also if you could please elaborate a bit on how we can "limit the number
of times we go around the loop looking for an overloaded operator->
in Sema::
ActOnStartCXXMemberReference" ?

Add a counter to the "while (BaseType->isRecordType())" loop. Bail out
with a diagnostic if it hits the limit.

Pedantry, just for fun - should this be added to the implementation limits
in the C++ standard?

The list of limits is not normative; an implementation can use whatever
limits it likes, so long as it documents them. And in some sense this is
already covered by "Nesting levels of parenthesized expressions within a
full-expression [256]." because the rewrite rule introduces nested
parentheses =)