LTO "bug" and Clang warnings

After looking at the Livermore for a while, we found the issue that was causing LTO to produce a different result.

Consider the code below [1]. setup() doesn’t touch bar/baz, main() doesn’t reference foo. LTO finds, correctly, that it can remove the setup(), but the result is different.

The code is clearly wrong, but the compiler has no right to fix user’s stupidity, even at that level. The only think I can think that would mitigate the problem would be to have a warning or out-of-bounds access when it’s that obvious.

Clang folks,

Correct me it I’m wrong, but I humbly think this should be done in the front-end, so it’s easy to print out line information without requiring debug symbols to be present. How easy would be to do that in Clang’s AST?

I assume it’s only a matter of checking the stride against the bounds of the array, if all is knowns at compile time. That would also help against segmentation fault.

It wouldn’t, however, work with variable length arrays, but VLA tend to segfault more often than silently overwrite other global variables.

cheers,
–renato

[1]

#include <stdio.h>

struct {
int foo[20];
int bar[20];
int baz[20];
} S;

void setup() {
int i;
// OVERWRITES ALL THREE ARRAYS
for (i=0; i<60; i++)
S.foo[i] = i; // ONLY REFERENCES FOO
}

// DOESN’T USE FOO
int main() {
int i;
setup();
printf("Bar: ");

for (i=0; i<20; i++)
printf("%d “, S.bar[i]);
printf(”\nBaz: “);
for (i=0; i<20; i++)
printf(”%d “, S.baz[i]);
puts(”");

return 0;
}

Hi Renato,

After looking at the Livermore for a while, we found the issue that was causing
LTO to produce a different result.

Consider the code below [1]. setup() doesn't touch bar/baz, main() doesn't
reference foo. LTO finds, correctly,

I don't think this is correct. At the LLVM IR level it is valid to write into
bar and baz by accessing off the end of foo. GlobalOpt is wrong to think that
an inbounds GEP into foo cannot be referencing bar/baz; the inbounds rules only
say it can't access memory outside of S. So I think this is an optimizer bug.

Ciao, Duncan.

  that it can remove the setup(), but the

It's not fixing user's stupidity. If you don't follow language rules, this is what you can get. A warning would definitely be very helpful, but suppressing optimizations due to this is not the right way to go.

For the cases when the user code cannot be fixed, maybe implementing an option like "-fpessimal-aliasing" would help (i.e. assuming that everything is aliased with everything).

-Krzysztof

Hi Duncan,

Ok, I found that even if main() does reference foo, setup() still gets chopped off and the results is the unexpected:

Foo: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Bar: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Baz: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

So, while there is the issue in LTO, I still think Clang could give a warning. This is a source of buffer overflow attacks…

While trying to define which pass was doing this, I commented out pass by pass in LTO and found that commenting some of them would produce a seg-fault in LLVM (haven’t checked further, though), but I know that is one of:

PM.add(createGlobalOptimizerPass());

PM.add(createConstantMergePass());

PM.add(createDeadArgEliminationPass());

PM.add(createInstructionCombiningPass());

// Inline small functions
if (RunInliner)
PM.add(createFunctionInliningPass());

PM.add(createPruneEHPass()); // Remove dead EH info.
if (RunInliner)

PM.add(createGlobalOptimizerPass());
PM.add(createGlobalDCEPass()); // Remove dead functions.

Then I looped over all LTO passes in the populateLTOPassManager() but adding them manually (incrementally, until I passed them all) didn’t allow me to reproduce the error.

I’ll create a bugzilla for this, and the Clang warning. If Clang folks think it’s worth, they can start from there.

–renato

I'd have to find the actual clauses in the standard, but I'm pretty sure that neither C nor C++ allow this. If the LLVM IR does not prohibit it, then it's likely that we lose some information in the translation process (unless it's indicated by metadata).

-Krzysztof

The bugs:

http://llvm.org/bugs/show_bug.cgi?id=14851

http://llvm.org/bugs/show_bug.cgi?id=14852

cheers,
–renato

After looking at the Livermore for a while, we found the issue that was
causing LTO to produce a different result.

Consider the code below [1]. setup() doesn't touch bar/baz, main() doesn't
reference foo. LTO finds, correctly, that it can remove the setup(), but the
result is different.

The code is clearly wrong,

Pretty sure this is UB. I believe if foo/bar/baz were instead an array
of arrays it might be well defined.

but the compiler has no right to fix user's
stupidity, even at that level.

I'm not sure what you mean by "fix user's stupidity" here - could you clarify?

The only think I can think that would
mitigate the problem would be to have a warning or out-of-bounds access when
it's that obvious.

Possible, though I don't think we have anything that goes quite that
far yet (computing the dynamic range of 'i' & checking that it's
beyond the bounds of the array, etc). This seems more like the purview
of the Clang Static Analyzer. Have you tried it to see if it catches
this case?

Clang folks,

Correct me it I'm wrong, but I humbly think this should be done in the
front-end, so it's easy to print out line information without requiring
debug symbols to be present.

Regardless of the technical issues, we generally don't emit any
diagnostics from places other than the front-end for reliability
reasons. You can see this in GCC's optimizer-based warnings: they're
quite unreliable. Subtle changes in code lead to certain
optimizations/analyses not firing & warnings appear/disappear with
relatively little logic/sense to the user.

In short: that option is not something that's ever really considered
for Clang diagnostics.

How easy would be to do that in Clang's AST?

It's more the proof necessary to compute the bounds of 'i' and access
to the arrays that's the interesting part. Possible, & it wouldn't
surprise me if the SA already does this work, but not something I
think we'd end up doing in clang proper.

Buffer overrun on foo[20] and relying on it for bar[20].

It might not even be an error to access foo[50] even though foo only has 20
elements (via pointer indirection rules), but it's user error to do so, and
if the standard allows that (I'm yet to find the paragraph), then the
compiler has no right to "fix" it. If it's undefined, than LTO is
completely right and nothing should be done.

The "stupidity" part is to rely on undefined behaviour. Mind you, the
stupidity in this case was mine. I removed functions from Livermore that I
though were harmless, and added a few arrays to be initialized by others
and haven't checked that the ranges were dynamic.

cheers,
--renato

I do believe it’s undefined.

§5.2.1 Subscripting [expr.sub]

1/ A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions
shall have the type “pointer to T” and the other shall have unscoped enumeration or integral type.
The result is an lvalue of type “T.” The type “T” shall be a completely-defined object type.62 The expression
E1[E2] is identical (by definition) to *((E1)+(E2))

§5.7 Additive operators [expr.add]

5/ When an expression that has integral type is added to or subtracted from a pointer, the result has the type
of the pointer operand. If the pointer operand points to an element of an array object, **and the array is
large enough**, the result points to an element offset from the original element such that the difference of
the subscripts of the resulting and original array elements equals the integral expression. […]

There is later (in §8.3.4 Arrays) a special case access out of bounds within a multi-dimensional array; however that is not our concern here.

Obviously, a warning, if possible, could be nice; but in general I am afraid this is more the domain of static analysis as it requires “guessing” the bounds of the loop. It might have been caught with ubsan though (I think there is an out-of-bounds checker).

– Matthieu

I'm not sure what you mean by "fix user's stupidity" here - could you
clarify?

Buffer overrun on foo[20] and relying on it for bar[20].

It might not even be an error to access foo[50] even though foo only has 20
elements (via pointer indirection rules), but it's user error to do so, and
if the standard allows that (I'm yet to find the paragraph), then the
compiler has no right to "fix" it.

Hmm, OK - just in case this is a useful observation to make:

What the compiler does with undefined behavior (let's, for the sake of
discussion, assume that this is undefined and that the compiler knows
this (the former seems likely, though Duncan seems to indicate that
the latter might not actually bet the case)) is not "fix" it. The
compiler simply assumes that UB can never happen. It's not
deliberately trying to fix the code - it's just optimizing the code
based on the semantic model/guarantees it has. One of those guarantees
(again, hypothetically) is that writes to 'foo' cannot write to 'bar'
- so if 'foo' is never accessed, simply optimize away writes to 'foo'
to save time. This isn't intended to "fix" anything - simply to make
faster code.

I agree, and take back the wrong wording. I didn't mean it as an
intentional fix.

cheers,
--renato

I do believe it's undefined.

§5.2.1 Subscripting [expr.sub]
...
§5.7 Additive operators [expr.add]
...

Still, doesn't explicitly say it's undefined. I agree this gives the
freedom of implementers to extend naturally, but it's at least arguable. I
have the 2011 draft and couldn't find anything, nor in the current open
issues.

Obviously, a warning, if possible, could be nice; but in general I am

afraid this is more the domain of static analysis as it requires "guessing"
the bounds of the loop. It might have been caught with ubsan though (I
think there is an out-of-bounds checker).

Is the static analyser in clang-extra-tools?

cheers,
--renato

I do believe it's undefined.

§5.2.1 Subscripting [expr.sub]
...
§5.7 Additive operators [expr.add]
...

Still, doesn't explicitly say it's undefined.

That's the fun of it - all the things that aren't defined are
undefined. (but, yes, some things are more explicitly undefined than
others)

I agree this gives the freedom
of implementers to extend naturally, but it's at least arguable. I have the
2011 draft and couldn't find anything, nor in the current open issues.

Obviously, a warning, if possible, could be nice; but in general I am
afraid this is more the domain of static analysis as it requires "guessing"
the bounds of the loop. It might have been caught with ubsan though (I think
there is an out-of-bounds checker).

Is the static analyser in clang-extra-tools?

No, it's built into clang itself, actually. clang -analysis or
somesuch, I've not run it myself, actually.
http://clang-analyzer.llvm.org/ should have some details.

That's the fun of it - all the things that aren't defined are
undefined. (but, yes, some things are more explicitly undefined than
others)

This is Java is much better, there's no standard, just Oracle's
implementation! :smiley:

No, it's built into clang itself, actually. clang -analysis or

somesuch, I've not run it myself, actually.
http://clang-analyzer.llvm.org/ should have some details.

Not much comes out of it when I pass it over lto-test.c:

<plist version="1.0">
<dict>
<key>clang_version</key>
<string>clang version 3.3 </string>
<key>files</key>
<array>
</array>
<key>diagnostics</key>
<array>
</array>
</dict>

cheers,
--renato

It does, actually:

C++ 2003 [5.7.5]

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i–n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

-Krzysztof

Just found on the C standard, too: 6.5.6-8.

I need to get used reading those standards again...

cheers,
--renato