Where and how to report an optimisation issue that doesn't cause a crash

Could be… But the wierd thing here is if I change the array to be of size 256 and the index to be ‘unsigned char’, seems like there is no way to access the ‘size’ field throught “y->ptr[index]” (in your example) but clang still performs the re-read: https://godbolt.org/z/ahlGHY

It looks like clang sees an alias between “int ptr[256]” and “int size”… maybe because they are both “int”? Because changing the type of size to char/short/long eliminates the re-read: https://godbolt.org/z/4kr1oo

Interesting and confusing at the same time. Can it be related to the standard definition of (non)aliasing as this answer suggests: https://stackoverflow.com/a/58550055/5218277 ?

On the other hand, placing the “int size” (as you mentioned) before “int ptr[256]” also eliminates the re-reads: https://godbolt.org/z/kpB74m

Baseline: No compiler produces optimal code in all cases - it’s too expensive. They’ll all have some tradeoffs, some more reasonable than others, some more intentional than others. I don’t know where this particular situation lies (whether it’s a reasonable tradeoff of compiler complexity, or an accidental one that might still be a reasonable tradeoff of compiler engineer time (other things more important than this particular optimization, etc)).

‪On Fri, Oct 25, 2019 at 1:02 AM ‫אלכס לופ’‬‎ <alex_lop@walla.com> wrote:‬

Could be… But the wierd thing here is if I change the array to be of size 256 and the index to be ‘unsigned char’, seems like there is no way to access the ‘size’ field throught “y->ptr[index]” (in your example) but clang still performs the re-read: https://godbolt.org/z/ahlGHY

Yup - I guess LLVM doesn’t track/use the upper bound of the index here, even if it uses the lower bound. (as demonstrated by the reordering enabling the optimization)

It looks like clang sees an alias between “int ptr[256]” and “int size”… maybe because they are both “int”? Because changing the type of size to char/short/long eliminates the re-read: https://godbolt.org/z/4kr1oo

That’s somewhat interesting - hmm, I’d assumed we didn’t use strict aliasing by default, but perhaps we do. You can try compiling with -fno-strict-aliasing to disable type based alias analysis & see that the type change doesn’t fix the issue then.

Interesting and confusing at the same time. Can it be related to the standard definition of (non)aliasing as this answer suggests: https://stackoverflow.com/a/58550055/5218277 ?

Yeah, guess so - once you change the type, type based alias analysis proves the pointer produced by the index arithmetic can’t alias the non-int member. (with -fno-strict-aliasing changing the type does not remove the extra load even when the type is different)

Using the shorter test case:

struct S {
int a[3];
int b;
};

int f(struct S *p, int i) {
if (p->b)
return 42;

p->a[i] = 3;
return p->b;
}

one can see that the the TBAA metadata loses information about the array member:

!4 = !{!“S”, !5, i64 0, !7, i64 12}
!5 = !{!“omnipotent char”, !6, i64 0}

The “new struct path TBAA” looks better, it seems to say “there are 12 bytes of
ints at offset 0 in struct S”

(Command line was ./bin/clang -target armv7m-eabi -O2 -S y.c -emit-llvm -Xclang
-new-struct-path-tbaa)

!3 = !{!4, !7, i64 12, i64 4}
!4 = !{!5, i64 16, !“S”, !7, i64 0, i64 12, !7, i64 12, i64 4}
!5 = !{!6, i64 1, !“omnipotent char”}
!6 = !{!“Simple C/C++ TBAA”}
!7 = !{!5, i64 4, !“int”}
!8 = !{!7, !7, i64 0, i64 4}

but then, the access tag for the store to the array

%arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i32 0, i32 0, i32 %i
store i32 3, i32* %arrayidx, align 4, !tbaa !8

says just “it’s in int” and there it still a redundant load:

f:
ldr r2, [r0, #12]
cmp r2, #0
itt ne
movne r0, #42
bxne lr
movs r2, #3
str.w r2, [r0, r1, lsl #2]
ldr r0, [r0, #12]
bx lr

So, I manually hacked the metadata too look like:

!8 = !{!4, !7, i64 0, i64 4}

i.e. as if we access the first element of the array.

Running that through opt -O2 and llc yields:

f:
ldr r2, [r0, #12]
cmp r2, #0
iteee ne
movne r0, #42
moveq r2, #3
streq.w r2, [r0, r1, lsl #2]
moveq r0, #0
bx lr

That seems like something that Clang can do by itself for access tags for index
expressions with member arrays: state that they access the offset in the struct
that corresponds to the first array element, so unknown indices would still
conservatively alias between each other, but not with other struct members.

Thoughts? Pitfalls? I may give it a shot.

~chill

can’t say I know much about TBAA, so I’ll leave it up to you/others :slight_smile:

Hi Momchil,

That seems like something that Clang can do by itself for access
tags for index expressions with member arrays: state that they
access the offset in the struct that corresponds to the first
array element, so unknown indices would still conservatively
alias between each other, but not with other struct members.

Then all by-known-index array accesses would need to be encoded as if there were accessing the first element, wouldn’t they? The idea behind the new representation was to address existing limitations by giving the TBAA accurate information about accesses. If memory servers me, in this specific case of an unknown index, the tag shall refer to the whole member array, which is supposed to mean that all and any of its elements can actually be accessed.