I have found a case where an optimization pass is barfing on invalid code in an unreachable
basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found inside a cycle of
two unreachable basic blocks)
The invalid code and the unreachable basic blocks were introduced by the function inliner.
I am wondering what is valid for these cases ?
(I did not find anything in LangRef, but I might as well have missed it)
- clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
- I assume that the self reference in dead code is not valid ?
- should inlining do an extra effort of cleaning up such dead code blocks ?
I have found a case where an optimization pass is barfing on invalid code in an unreachable
basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found inside a cycle of
two unreachable basic blocks)
The invalid code and the unreachable basic blocks were introduced by the function inliner.
I am wondering what is valid for these cases ?
(I did not find anything in LangRef, but I might as well have missed it)
- clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
Yep.
- I assume that the self reference in dead code is not valid ?
Unreachable code is permitted to take various weird shapes and forms.
So no, that is valid.
- should inlining do an extra effort of cleaning up such dead code blocks ?
Which pass?
Generally, it (or the users of that code) should either use DominatoTree
to avoid handling unreachable blocks, or it should be hardened to deal
with such situations gracefully.
> Hi all,
>
> I have found a case where an optimization pass is barfing on invalid code in
an unreachable
> basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found
inside a cycle of
> two unreachable basic blocks)
>
> The invalid code and the unreachable basic blocks were introduced by the
function inliner.
>
> I am wondering what is valid for these cases ?
> (I did not find anything in LangRef, but I might as well have missed it)
>
> - clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
Yep.
> - I assume that the self reference in dead code is not valid ?
Unreachable code is permitted to take various weird shapes and forms.
So no, that is valid.
> - should inlining do an extra effort of cleaning up such dead code blocks ?
>
Which pass?
As far as I see, it is in SROA (and a combination of our custom changes, including the full restrict changes).
I am not sure yet if it is a generic SROA problem or not, but if it is, I should be able
to create a bugreport.
Generally, it (or the users of that code) should either use DominatoTree
to avoid handling unreachable blocks, or it should be hardened to deal
with such situations gracefully.
Yes, self-references are valid in unreachable code, we have test cases for this, it’ll pass the verifier. Passes should be resilient against this (even though a pass could always do a quick reachability check and delete all dead blocks as a pre-step).
>
> - the problematic code construct consists of a cycle of 'dead' basic blocks
that conditionally jumps into live blocks.
>
> That jump introduce a connection of a dead self-referring 'getelementptr'
to live code through a PHI node.
>
> - during SROA, the scoped noalias analysis (full restrict version) does an
unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
>
> - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
in the self-referring 'getelementpr'.
>
>
>
> Does this mean that 'getUnderlyingObject' should also be prepared to be
handling 'invalid instructions' in 'dead code' ?
I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
to assert that no cycles are encountered instead of endlessly looping.
But i *think* the problem is in your SROA changes.
Can you link me to the particular patch in question?
This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
(llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.
Apparently, your extra check also triggers cases without my patches ? A self-reference is
likely a mild version of an 'invalid instruction' but what other constructs can we encounter
in dead blocks ?
I am wondering if we should allow (analysis) passes to follow code into dead blocks at all ?
Can you isolate the failure to just a run of MemorySSA? If so, it sounds like it would be a problem in your linked patch. I guess it would be good to know exactly where an infinite loop is triggered (and why).
[..]
> >
> > - the problematic code construct consists of a cycle of 'dead' basic blocks
> that conditionally jumps into live blocks.
> >
> > That jump introduce a connection of a dead self-referring 'getelementptr'
> to live code through a PHI node.
> >
> > - during SROA, the scoped noalias analysis (full restrict version) does an
> unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
> >
> > - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
> in the self-referring 'getelementpr'.
> >
> >
> >
> > Does this mean that 'getUnderlyingObject' should also be prepared to be
> handling 'invalid instructions' in 'dead code' ?
> I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
> to assert that no cycles are encountered instead of endlessly looping.
>
> But i *think* the problem is in your SROA changes.
> Can you link me to the particular patch in question?
This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
(llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.
Apparently, your extra check also triggers cases without my patches ? A self-reference is
likely a mild version of an 'invalid instruction' but what other constructs can we encounter
in dead blocks ?
Yeah, i fooled myself there.
I am wondering if we should allow (analysis) passes to follow code into dead blocks at all ?
I wouldn't think so. It's a waste of time i would say.
The infinite loop happens because getUnderlyingObject() is called with MaxLookup=0. getUnderlyingObject() relies on a finite depth limit to prevent infinite loops. The problem is specific to the full restrict patch for this reason – normally we don’t perform unlimited getUnderlyingObject() calls.
The correct fix here is imho to remove the ability to call getUnderlyingObject() with MaxLookup=0, which sounds like a bad idea for pathological cases anyway.
- the problematic code construct consists of a cycle of 'dead' basic blocks
that conditionally jumps into live blocks.
That jump introduce a connection of a dead self-referring 'getelementptr'
to live code through a PHI node.
- during SROA, the scoped noalias analysis (full restrict version) does an
unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
- 'getUnderlyingObject' also follows the path to the dead blocks and ends up
in the self-referring 'getelementpr'.
Does this mean that 'getUnderlyingObject' should also be prepared to be
handling 'invalid instructions' in 'dead code' ?
I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
to assert that no cycles are encountered instead of endlessly looping.
But i *think* the problem is in your SROA changes.
Can you link me to the particular patch in question?
This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
(llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.
Can you isolate the failure to just a run of MemorySSA? If so, it sounds like it would be a
problem in your linked patch. I guess it would be good to know exactly where an infinite
loop is triggered (and why).
The infinite loop happens because getUnderlyingObject() is called with MaxLookup=0. getUnderlyingObject() relies on a finite depth limit to prevent >infinite loops. The problem is specific to the full restrict patch for this reason -- normally we don't perform unlimited getUnderlyingObject() >calls.
The correct fix here is imho to remove the ability to call getUnderlyingObject() with MaxLookup=0, which sounds like a bad idea for pathological cases anyway.
I agree. IIRC this lookup originates from the original 'local restrict support'. I think we should be able to cope with a limited depth lookup.
At worst case, it should modify the result of a scoped noalias query to return 'MayAlias' over 'NoAlias'.