I think noalias and dereferenceable are interacting in some subtle ways, but I don’t know how exactly, and the LangRef does not seem to discuss this.
Concretely: Is the following code UB?
(For ease of writing I will write this as C code, but pretend I can just declare arguments noalias and/or dereferenceable. This is discussing LLVM IR semantics and just using C syntax.)
void test(noalias int *ptr1, dereferenceable(4) int *ptr2) {
    *ptr1 = 0;
}

void main() {
    int x;
    test(&x, &x);
}
Clearly, if I remove dereferenceable this code is fine (while the two pointers do alias, they are not accessed in aliasing ways, and that is all that matters).
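As a rough sanity check of the noalias-only variant, here is a sketch in real C. It assumes `restrict` as a stand-in for noalias (Clang lowers `restrict` parameters to noalias, but C's `restrict` rules and LLVM's noalias are not specified identically, so this is only an approximation). The names `test_noalias_only` and `run_noalias_only` are made up for illustration, and `x` is initialized to 1 so the effect of the store is observable.

```c
/* restrict on ptr1 stands in for noalias (an assumption, see above).
   ptr2 carries no attribute, so this models only the variant
   without dereferenceable. */
void test_noalias_only(int *restrict ptr1, int *ptr2) {
    (void)ptr2;   /* ptr2 is passed in but never accessed */
    *ptr1 = 0;    /* the only memory access in the function */
}

/* Hypothetical helper: passes the same address for both parameters
   and returns the final value of x. */
int run_noalias_only(void) {
    int x = 1;
    test_noalias_only(&x, &x);
    return x;
}
```

Since the only access to `x` inside the function goes through `ptr1`, passing the same address twice does not violate the `restrict` rules here, and `run_noalias_only()` returns 0.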
However, dereferenceable is supposed to allow LLVM to add spurious reads. And if we change test as follows, this clearly becomes UB:
void test(noalias int *ptr1, dereferenceable(4) int *ptr2) {
    *ptr1 = 0;
    int val = *ptr2;
}
So if the statement “dereferenceable allows LLVM to add spurious reads” is correct, then the original program must already have been UB?
One way this could make sense is that the presence of dereferenceable(N) is considered to act like a read of N bytes for the purpose of noalias. Then the original function would already be writing through ptr1 and reading from ptr2, which is UB. But I am not sure if that is truly the intention of dereferenceable?
But if that is not how dereferenceable and noalias interact, then how come dereferenceable-based optimizations like the following do not introduce UB?
Before:
void test(noalias int *ptr1, dereferenceable(4) int *ptr2, int count) {
    *ptr1 = 0;
    for (int i = 0; i < count; ++i) { int val = *ptr2; }
}
After:
void test(noalias int *ptr1, dereferenceable(4) int *ptr2, int count) {
    *ptr1 = 0;
    int val_prefetch = *ptr2; // aliases with the previous line!
    for (int i = 0; i < count; ++i) { int val = val_prefetch; }
}