[+Richard, Chandler]
I am not a language lawyer, but I'll atttempt to answer anyway.
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:
typedef struct S {
char padding[4088];
struct S *p1;
struct S *p2;
} S;
struct S* f1(struct S *s, int x)
{
S *r;
if (x)
r = s->p1;
else
r = s->p2;
return r;
}
TO
struct S* f1(struct S *s, int x)
{
return (x) ? s->p1 : s->p2;
}
The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.
I also believe that this is correct.
I think that Chandler summarized some things to be careful about in this
regard here:
http://lists.llvm.org/pipermail/llvm-commits/Week-
of-Mon-20170807/477944.html
Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.
This, however, is clearly an extension to the standard memory model, and I
see no reason to support this by default. Speculatively loading cache lines
under contention from other cores might not be a good thing to do for
performance reasons, but that's not a correctness issue.
I agree that this optimization is legal under the C and C++ specs.
Sorry, I'd meant to add a proviso to this. It's legal under the C and C++
specs to the same extent that any load speculation is.
In general, load speculation is theoretically problematic in C/C++ because
it is undefined behavior to have a race even if the race is spurious. For
example,
a program containing a load that races with a store has undefined behavior
even
if the result of the load is never used. Therefore, strictly speaking,
speculating a
load can introduce a race that didn't otherwise exist, meaning that it can
introduce
undefined behavior to a program,
This is where the argument falls down. If there's no race in the source
code, the behavior *is* defined. It doesn't matter that the hardware might
have a racy read in the implementation, so long as it doesn't affect the
behavior of the abstract machine.
meaning that it's not a legal transformation.
There might be other reasons (maybe practical reasons) why this isn't
permitted, but it's not because it introduces UB (it doesn't).
Now,
AFAIK LLVM doesn't have any multi-threaded analyses that would actually
miscompile such a speculation, but it can still matter because e.g. TSan
emits code
that dynamically enforces the memory model, and it will dutifully report
the race here.
(Or, to quote from C11 5.1.2.4p28:
Transformations that introduce a speculative read of a potentially
shared memory
location may not preserve the semantics of the program as defined in
this standard,
since they potentially introduce a data race.
I think this is somewhat muddy, and essentially contradicted by the text
immediately following it:
However, they are typically valid in the
context of an optimizing compiler that targets a specific machine with
well-defined
semantics for data races. They would be invalid for a hypothetical
machine that is
not tolerant of races or provides hardware race detection.
In this sense, TSan makes an arbitrary machine race-intolerant.)
Right, this transformation is invalid according to TSan (and a compiler
targeting TSan should not apply it) . Apart from such debug
implementations, I'm not aware of an extant platform on which it's
problematic.
-- James