Loads not hoisted out of the loop

Hey everyone,

I’m looking at the following two C functions:

http://pastebin.com/mYWCj6d8

Second one is about 50% slower than the first one because in the second case d->data[i * 3 + {X, Y, Z}] loads are not moved out of the second loop and are computed every time j increments, even though they don’t depend on it. If I move those out manually, the performance of the two is equal.

What could be the reason for this behaviour? I looked at LICM logs; the compiler hoists GEPs but stops short of loading the actual values. Does it have something to do with possible aliasing?

Thanks,

Dimitri.

Yes.

Almost certainly, the compiler can’t tell that the load from the struct (to get the array pointer) doesn’t alias with the stores to the array itself. This is exactly the kind of thing that type-based alias analysis (-fstrict-aliasing) can help with. Use with caution, as it can break some type-unsafe idioms.

–Owen

Ownen, thanks for the reply. The above, however, was already compiled with -fstrict-aliasing and -O3; doesn’t look like it makes much of a difference in this case.

As with my earlier question, I’m actually using C as a model to create a custom frontend, which I intend to bless with more strict aliasing rules by default. The Array is going to be a language intrinsic so I’m free to make a suitable tbaa tree. The problem is I haven’t found a good example of how would such length-aware pointer container structure be annotated so that loads from the data pointer are optimized as though we didn’t access it through a struct.

The struct IR looks something like

%struct.Array = type { double*, i64 }

Then we get the GEP to data, followed by a load

%data = getelementptr inbounds %struct.Array* %d, i64 0, i32 0
%0 = load double** %data, align 8, !tbaa !0

where relevant TBAA looks like

!0 = metadata !{metadata !“any pointer”, metadata !1}
!1 = metadata !{metadata !“omnipotent char”, metadata !2}
!2 = metadata !{metadata !“Simple C/C++ TBAA”}
!3 = metadata !{metadata !“double”, metadata !1}

So if I understand correctly, because double and any pointer are leafs of omnipotent char, they can easily alias. Thus, having something like

!4 = metadata !{metadata !“Array data”, metadata !2}

%0 = load double** %data, align 8, !tbaa !4

should do the trick?

Dimitri.

The problem is actually the stores to out->data. While "d" and "out" are restricted pointers, it tells us nothing about the relation between "d->data" and "out->data", since the "data" member is a pointer itself. Since both "d->data" and "out->data" are both of type double, the type-based alias analysis won't help.

In the case when you pass "double*" to the function, the restrict keyword tells us that the arrays of doubles themselves do not alias, which allows the compiler to infer that the loads and stores in the loop do not depend on each other.

-Krzysztof

Right, I see. Is there any other way to solve this? As the last resort, I
was considering silently transforming each Array argument into separate
data and metadata arguments, but that would certainly add other
complications I'd rather avoid.

I was also think ing a hypothetical LLVM-based Fortran compiler would be
the model to aim for, where they have dimension-aware arrays that never
alias (AFAIK?). I wonder how one would solve that problem there.

Dimitri.

And I suppose one would be to write a specialized aliasing analysis pass
which would return "NoAlias" if the "data" pointer loads come from
different Array structs?

Right, I see. Is there any other way to solve this? As the last resort,
I was considering silently transforming each Array argument into
separate data and metadata arguments, but that would certainly add other
complications I'd rather avoid.

Depends on what information you have available. If all you have is the LLVM IR, then the only universal solution that I can think of is interprocedural analysis, where you would track the pointer values, and potentially determine that they never point to overlapping memory areas.

I was also think ing a hypothetical LLVM-based Fortran compiler would be
the model to aim for, where they have dimension-aware arrays that never
alias (AFAIK?). I wonder how one would solve that problem there.

I think the proper way would be to use metadata after all, and have some alias analysis that could understand it.

-Krzysztof

From: "Krzysztof Parzyszek" <kparzysz@codeaurora.org>
To: "Dimitri Tcaciuc" <dtcaciuc@gmail.com>
Cc: llvmdev@cs.uiuc.edu
Sent: Friday, January 18, 2013 11:22:26 AM
Subject: Re: [LLVMdev] Loads not hoisted out of the loop

>
> Right, I see. Is there any other way to solve this? As the last
> resort,
> I was considering silently transforming each Array argument into
> separate data and metadata arguments, but that would certainly add
> other
> complications I'd rather avoid.

Depends on what information you have available. If all you have is
the
LLVM IR, then the only universal solution that I can think of is
interprocedural analysis, where you would track the pointer values,
and
potentially determine that they never point to overlapping memory
areas.

> I was also think ing a hypothetical LLVM-based Fortran compiler
> would be
> the model to aim for, where they have dimension-aware arrays that
> never
> alias (AFAIK?). I wonder how one would solve that problem there.

I think the proper way would be to use metadata after all, and have
some
alias analysis that could understand it.

I agree. FWIW, I'm currently working on making the LLVM-based Fortran compiler non-hypothetical, and so for several reasons, I'd like to have a solution to this. If we can't think of anything better, we could always fall back to the N^2 metadata solution (explicit mark as no-alias all pairs that don't alias), but I'd rather we come up with something else.

-Hal

How about having Fortran-specific metadata, and a Fortran-specific alias analysis? The metadata could indicate how the properties of the Fortran-frontend-generated types relate to the types that the IR code uses to load/store data. For example:

f90_array_type = type { i32 size, double* data };
metadata = "attributes of f90_array_type, like restrict, should be applied to the member pointer 'data'"

Is this something you've considered?

-Krzysztof

I am not certain how fortran implements multi-dimensional arrays, but in my
case I'm doing something like

   type { i32 nd, i32* dims, double* data };

Perhaps we could add !tbaa.pointer?

   !1 = metadata !{ metadata !"int", metadata !0 }
   !2 = metadata !{ metadata !"double", metadata !0 }

   !3 = metadata !{ metadata !1, 0 } ; tbaa.pointer { metadata type,
i64 noalias }
   !4 = metadata !{ metadata !2, 1 }

   !5 = { i64 0, i64 4, metadata !1, i64 8, i64 8, metadata !3, i64 16, i64
8, metadata !4 }

   %Array = type { i32 nd, i32* dims, double* data }, !tbaa.struct !5

Dimitri.

Actually, I just realized the referenced pointer metadata isn’t marked as such in any way when its referenced in tbaa.struct. I suppose it also doesn’t have much to do with TBAA as well, so that’s also a misnomer. My metadata-fu is still weak.

Dimitri.