Structure type IR load/store semantics

Hi,

In C (as far as I’m aware), structure semantics are such that once you have dereferenced a pointer to a structure, it is safe to speculate a later load from a different field of the same structure, because the whole object must have been allocated.

For example, for the following code:

struct a {
    int a;
    int b;
    int c;
};

struct a *A;
struct a *B;

void  foo(int n)
{
    for (int i = 0; i < n; ++i)
    {
        if (A[i].b > B[i].b)
            A[i].c = A[i].a;
        else
            A[i].c = B[i].a;
    }
}

… it would be safe to speculate the accesses to A[i].a and B[i].a at the C language level and transform into something to the effect of:

%val1 = load i32, i32* %A_i_b     ; Load from A
%val2 = load i32, i32* %B_i_b     ; Load from B

%cmp = icmp sgt i32 %val1, %val2  ; A[i].b > B[i].b
%load1 = load i32, i32* %A_i_a    ; Can speculate due to previous load from A
%load2 = load i32, i32* %B_i_a    ; Can speculate due to previous load from B
%select = select i1 %cmp, i32 %load1, i32 %load2
store i32 %select, i32* %A_i_c

My question is, do these same C semantics translate through into LLVM IR? I suspect they don’t and this transform at the IR level would not be valid, in which case the question becomes how could these semantics be represented in IR such that this transformation can be done? There is !dereferenceable metadata that can be attached to pointers which I think would be sufficient, however it currently doesn’t look like anything on the clang side emits this. Is it tractable to start having clang emit this metadata in cases like this (i.e. on a pointer loading any field of a struct where said struct has already been loaded from/stored to, even if it was a different field)?

No, those guarantees don’t carry over to LLVM IR. Pointee types are meaningless in LLVM IR, and they are in fact going away entirely (see “Opaque Pointers” in the LLVM 15.0.0git documentation).

The only way would be to use the dereferenceable metadata as you mention. But you would need to show it provides a meaningful improvement in practice.

IIRC, !dereferenceable on a load means that the loaded value (the result of the load) is dereferenceable, so it applies, more or less, only to loads of pointers. That said, I’m not sure whether the loads of the globals A and B can be annotated, under C rules, as !dereferenceable(sizeof(struct a)). Before you go there, check whether the transformation happens when you pass both as values (or as pointers with the dereferenceable attribute) to the function instead.

The particular case I’m looking at is actually around vectorization. As is, a workload similar to the example I provided does not get vectorized, because the vectorizer gets hung up on the select of the base pointer of the loads. If the loads on both sides of the branch are speculated (which GCC does via its tree if-conversion pass), then not only can we vectorize this loop, but the loads can also be done as structured loads, loading both parts of the structure at once. Both of these things hugely improve performance for the workload I’m looking at.
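To make the target shape concrete, here is a minimal C sketch of the loop before and after speculating the loads by hand. It is only an illustration of the if-converted form, not what any compiler pass emits; the function names foo_branchy and foo_speculated are hypothetical, and the arrays are passed as parameters (rather than read from globals) so that the sketch is self-contained.

```c
#include <stddef.h>

struct a {
    int a;
    int b;
    int c;
};

/* Branchy form, matching the original example: only one of the two
   .a fields is loaded on each iteration. */
static void foo_branchy(struct a *A, const struct a *B, int n)
{
    for (int i = 0; i < n; ++i) {
        if (A[i].b > B[i].b)
            A[i].c = A[i].a;
        else
            A[i].c = B[i].a;
    }
}

/* Hand if-converted form (hypothetical helper): both .a fields are
   loaded unconditionally and the result is chosen with a select. */
static void foo_speculated(struct a *A, const struct a *B, int n)
{
    for (int i = 0; i < n; ++i) {
        int from_a = A[i].a;   /* speculated load from A[i] */
        int from_b = B[i].a;   /* speculated load from B[i] */
        A[i].c = (A[i].b > B[i].b) ? from_a : from_b;
    }
}
```

Both functions compute the same result; the second form has no control flow in the loop body, which is the shape the vectorizer needs.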

Ah, I misunderstood the semantics of that metadata, and yes, I don’t think that quite fits what I need.

This does actually work (with some modification of the instcombine that does this, so that it handles GEPs). Also, if I use assume operand bundles to represent the dereferenceability (the same attribute as on the function argument) at the point of the second load from the structure, I can get the transform to trigger there as well.

Does having clang emit these assume calls seem like a reasonable approach?
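For illustration, this is roughly what the assumption would look like if it were written by hand at the source level. It is a sketch under assumptions: I believe recent Clang releases provide a __builtin_assume_dereferenceable(ptr, size) builtin, so the code guards it with __has_builtin and falls back to a no-op elsewhere. The ASSUME_DEREF macro and the foo_assumed function are hypothetical names, and whether the resulting assumes actually enable the speculation depends on the passes involved.

```c
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* ASSUME_DEREF is a hypothetical wrapper: it uses the Clang builtin when
   the compiler reports it, and expands to nothing otherwise. */
#if __has_builtin(__builtin_assume_dereferenceable)
#define ASSUME_DEREF(p, n) __builtin_assume_dereferenceable((p), (n))
#else
#define ASSUME_DEREF(p, n) ((void)0)
#endif

struct a {
    int a;
    int b;
    int c;
};

static void foo_assumed(struct a *A, const struct a *B, int n)
{
    for (int i = 0; i < n; ++i) {
        int a_b = A[i].b;   /* first access to A[i] */
        int b_b = B[i].b;   /* first access to B[i] */

        /* After one field has been accessed, state that the whole
           struct object is dereferenceable. */
        ASSUME_DEREF(&A[i], sizeof(struct a));
        ASSUME_DEREF(&B[i], sizeof(struct a));

        if (a_b > b_b)
            A[i].c = A[i].a;
        else
            A[i].c = B[i].a;
    }
}
```

The behaviour of the loop is unchanged; the assumes only give the optimizer the fact that the C-level struct semantics would otherwise have implied.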