Annotating known pointer alignment

Hi all,

I'm instrumenting IR by replacing loads and stores by calls to a library, which I have compiled to bitcode such that inlining can take place. My problem is: If I could retain the alignment information on the load/store, this would open many optimization opportunities after inlining. Unfortunately, I don't know how.

After thinking about it, and trying different things, I now have several questions.

First, consider this function:
   #include <stdint.h>
   uint64_t foo(uint64_t *bar) {
     *bar = 42;
     return (uint64_t)bar & 3;
   }

Which is compiled to
   define i64 @foo(i64* %bar) nounwind uwtable ssp {
     store i64 42, i64* %bar, align 8
     %0 = ptrtoint i64* %bar to i64
     %and = and i64 %0, 3
     ret i64 %and
   }

1) How can clang deduce the alignment on the store? It emits a store without alignment information, and instcombine adds the explicit alignment according to the langref (pref alignment).

2) If we know that the store is aligned, shouldn't instcombine deduce that the pointer %bar itself must be aligned (set low bits in KnownZero), and use this information for other uses, at least those that are dominated by the store? This would fold the three instructions to "ret i64 0".
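
To make the arithmetic behind question 2 concrete: an address that is a multiple of 8 has its three lowest bits clear, so masking it with 3 must give 0. The following is only a sketch in C. The names `low_bits` and `foo_checked` are made up for illustration, and it assumes a target where `uint64_t` has 8-byte ABI alignment (as in the IR above) and where converting a pointer to `uintptr_t` yields the plain address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the address bits selected by mask. */
static uint64_t low_bits(const void *p, uint64_t mask) {
    return (uint64_t)(uintptr_t)p & mask;
}

/* Same shape as foo() above, with the alignment claim spelled out. */
static uint64_t foo_checked(uint64_t *bar) {
    /* The store below is only valid if bar is 8-byte aligned (assumed ABI)... */
    assert(low_bits(bar, 7) == 0);
    *bar = 42;
    /* ...so the two bits tested here are necessarily zero at this point. */
    return low_bits(bar, 3);
}
```

The `assert` plays the role of the `align 8` on the store: once it has passed, the final mask cannot produce anything other than 0.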

3) If instcombine cannot deduce it, is there a way to annotate that a specific pointer value is aligned? In my case it should work to add this sequence:
   %A = ptrtoint i64* %ptr to i64
   %B = and i64 %A, -8
   %C = inttoptr i64 %B to i64*
and replacing all uses of %ptr by %C, then running optimizations, and then replacing %C by %ptr again. But this is neither efficient nor nice, and optimizations could cause other instructions to use my dummy instructions too such that I cannot remove them afterwards.
Nick Lewycky once proposed to add an alignment field to ptrtoint (for bug 9120), following the principle that pointer uses know the alignment. That would also help for my example. After inlining, I could visit all uses of the pointer (load/store/ptrtoint) and add the alignment accordingly.
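
For readers who think in C rather than IR, the dummy sequence from question 3 corresponds roughly to the sketch below. The function names are hypothetical, and it assumes that round-tripping a pointer through `uintptr_t` preserves the address, which holds on common flat-memory targets but is implementation-defined in general.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C analogue of the ptrtoint / and / inttoptr sequence:
   clear the three low address bits so the result is a multiple of 8. */
static uint64_t *mask_to_8(uint64_t *ptr) {
    uintptr_t a = (uintptr_t)ptr;      /* %A = ptrtoint i64* %ptr to i64 */
    uintptr_t b = a & ~(uintptr_t)7;   /* %B = and i64 %A, -8            */
    return (uint64_t *)b;              /* %C = inttoptr i64 %B to i64*   */
}

/* Use the masked pointer in place of the original one. */
static void store_42(uint64_t *ptr) {
    uint64_t *c = mask_to_8(ptr);
    assert(c == ptr);   /* the mask is a no-op for an already aligned pointer */
    *c = 42;
}
```

The mask changes nothing for a pointer that really is aligned, which is why it can serve as an annotation, but it is still real code that the optimizer has to see through and that has to be removed afterwards.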

Also clang could set an alignment for ptrtoint, since it seems to know that some pointers are always aligned.

Thanks for any help to solve my problem or answer my questions!

Clemens

Hi Clemens,

I'm instrumenting IR by replacing loads and stores by calls to a library, which
I have compiled to bitcode such that inlining can take place. My problem is: If
I could retain the alignment information on the load/store, this would open many
optimization opportunities after inlining. Unfortunately, I don't know how.

After thinking about it, and trying different things, I now have several questions.

First, consider this function:
   #include <stdint.h>
   uint64_t foo(uint64_t *bar) {
     *bar = 42;
     return (uint64_t)bar & 3;
   }

Which is compiled to
   define i64 @foo(i64* %bar) nounwind uwtable ssp {
     store i64 42, i64* %bar, align 8
     %0 = ptrtoint i64* %bar to i64
     %and = and i64 %0, 3
     ret i64 %and
   }

1) How can clang deduce the alignment on the store?

by consulting the C standard :)

It emits a store without alignment information,

I assume you mean: without an explicit alignment.

and instcombine adds the explicit alignment according to the langref (pref alignment).

Without an explicit alignment means the ABI alignment in the case of
loads/stores.

2) If we know that the store is aligned, shouldn't instcombine deduce that the
pointer %bar itself must be aligned (set low bits in KnownZero), and use this
information for other uses, at least those that are dominated by the store? This
would fold the three instructions to "ret i64 0".

Probably it should. Doing so would require that LLVM semantics considers an
unaligned load/store to result in undefined behaviour, and would need to be
documented in the LangRef.

Ciao, Duncan.

Hi Duncan,

thanks for your comments.

First, consider this function:
   #include <stdint.h>
   uint64_t foo(uint64_t *bar) {
     *bar = 42;
     return (uint64_t)bar & 3;
   }

Which is compiled to
   define i64 @foo(i64* %bar) nounwind uwtable ssp {
     store i64 42, i64* %bar, align 8
     %0 = ptrtoint i64* %bar to i64
     %and = and i64 %0, 3
     ret i64 %and
   }

1) How can clang deduce the alignment on the store?

by consulting the C standard :)

Ah, good to know. §6.3.2.3 (7) even states that casting alone leads to undefined behaviour, if the resulting pointer is not correctly aligned.
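
A small illustration of what that clause means in practice: code that may be handed a misaligned address has to keep it as `void *` or `unsigned char *` and copy bytes, instead of casting it to `uint64_t *`. This is a sketch only; the helper names are invented here, and the alignment test assumes that `uintptr_t` holds the plain address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p satisfies the alignment of uint64_t. */
static int is_u64_aligned(const void *p) {
    return ((uintptr_t)p % _Alignof(uint64_t)) == 0;
}

/* Store that is valid for any address: no uint64_t* is ever formed from p. */
static void store_u64_any(void *p, uint64_t v) {
    memcpy(p, &v, sizeof v);
}

/* Store that relies on alignment: the cast is only defined if p is aligned. */
static void store_u64_aligned(void *p, uint64_t v) {
    assert(is_u64_aligned(p));
    *(uint64_t *)p = v;
}
```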

It emits a store without alignment information,

I assume you mean: without an explicit alignment.

and instcombine adds the explicit alignment according to the langref (pref alignment).

Without an explicit alignment means the ABI alignment in the case of
loads/stores.

Yes, that second step was clear. Assuming you meant the "preferential alignment", according to the langref.

2) If we know that the store is aligned, shouldn't instcombine deduce that the
pointer %bar itself must be aligned (set low bits in KnownZero), and use this
information for other uses, at least those that are dominated by the store? This
would fold the three instructions to "ret i64 0".

Probably it should. Doing so would require that LLVM semantics considers an
unaligned load/store to result in undefined behaviour, and would need to be
documented in the LangRef.

It already is: "Overestimating the alignment results in an undefined behavior." (both load and store).
But implementing this kind of optimization is not that easy, since it would require us to
1) visit other uses (hopefully an aligned load/store) of the pointer for which we want to know the Known-Zero-Bits, and
2) know the dominator tree, since exploiting the knowledge about the alignment of the pointer is only valid after the load/store has been executed.

As far as I can see, neither is done in InstCombine yet, so adding it would (a) be a lot of work, and (b) increase the runtime of InstCombine.
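
The dominance requirement in point 2 can be seen in a C-level sketch where the aligned store sits on one path only. Everything here is illustrative: the function name is hypothetical, 8-byte alignment for `uint64_t` is assumed, and the pointer is passed as `void *` so that no misaligned `uint64_t *` is ever formed on the path without the store.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: the aligned store happens on one path only. */
static uint64_t low_bits_maybe_store(void *p, int do_store) {
    /* Computed before any store: nothing is known about p yet. */
    uint64_t low = (uint64_t)((uintptr_t)p & 7);
    if (do_store) {
        assert(low == 0);         /* precondition of the aligned store */
        *(uint64_t *)p = 42;      /* from here on, p is known to be aligned */
        return (uintptr_t)p & 7;  /* dominated by the store: could fold to 0 */
    }
    return low;                   /* not dominated by the store: may be nonzero */
}
```

Only the mask inside the `if` may be folded. Applying the same fact to `low`, which is computed on both paths, would change the result for callers that pass a misaligned address with `do_store == 0`.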

What's your opinion about an alignment on ptrtoint instructions? Then clang could add that information, since as I just learned the C standard guarantees *any* pointer to be correctly aligned.
Since this is not guaranteed in LLVM IR, we need to communicate this information in order to take advantage of it.

Cheers,
Clemens

Hi Clemens,

thanks for your comments.

First, consider this function:
   #include <stdint.h>
   uint64_t foo(uint64_t *bar) {
     *bar = 42;
     return (uint64_t)bar & 3;
   }

Which is compiled to
   define i64 @foo(i64* %bar) nounwind uwtable ssp {
     store i64 42, i64* %bar, align 8
     %0 = ptrtoint i64* %bar to i64
     %and = and i64 %0, 3
     ret i64 %and
   }

1) How can clang deduce the alignment on the store?

by consulting the C standard :)

Ah, good to know. §6.3.2.3 (7) even states that casting alone leads to undefined
behaviour, if the resulting pointer is not correctly aligned.

It emits a store without alignment information,

I assume you mean: without an explicit alignment.

and instcombine adds the explicit alignment according to the langref (pref alignment).

Without an explicit alignment means the ABI alignment in the case of
loads/stores.

Yes, that second step was clear. Assuming you meant the "preferential
alignment", according to the langref.

no, I meant the ABI alignment. If the LangRef says the preferential alignment
then I'm pretty sure the LangRef is wrong!
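
The same ABI versus preferred distinction is visible from C, which may help when reading the discussion. The sketch below assumes GCC or Clang: `_Alignof` is the C11 operator for the required (ABI) alignment, while `__alignof__` is a compiler extension that reports the preferred alignment. The function names are made up, and on many 64-bit targets both values are simply 8 for `uint64_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ABI (minimum required) alignment, as defined by C11. */
static size_t abi_align_u64(void) {
    return _Alignof(uint64_t);
}

/* Preferred alignment, via the GCC/Clang __alignof__ extension (assumption:
   the code is built with one of these compilers). */
static size_t pref_align_u64(void) {
    return __alignof__(uint64_t);
}

/* The preferred alignment is expected to be at least the ABI alignment. */
static void check_u64_alignments(void) {
    assert(abi_align_u64() <= pref_align_u64());
}
```

An optimizer may only rely on the smaller, guaranteed value, which is the ABI alignment.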

2) If we know that the store is aligned, shouldn't instcombine deduce that the
pointer %bar itself must be aligned (set low bits in KnownZero), and use this
information for other uses, at least those that are dominated by the store? This
would fold the three instructions to "ret i64 0".

Probably it should. Doing so would require that LLVM semantics considers an
unaligned load/store to result in undefined behaviour, and would need to be
documented in the LangRef.

It already is: "Overestimating the alignment results in an undefined behavior."
(both load and store).
But implementing this kind of optimization is not that easy, since it would
require us to
1) visit other uses (hopefully an aligned load/store) of the pointer for which
we want to know the Known-Zero-Bits, and
2) know the dominator tree, since exploiting the knowledge about the alignment
of the pointer is only valid after the load/store has been executed.

As far as I can see, neither is done in InstCombine yet, so adding it would (a)
be a lot of work, and (b) increase the runtime of InstCombine.

Maybe correlated value propagation (aka lazy value info) could be extended to
vend pointer alignment information.

What's your opinion about an alignment on ptrtoint instructions?

It makes sense to me. It would also simplify propagating alignment information
out of loads and stores and onto ptrtoint, where ValueTracking could easily pick
it up.
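
As a point of comparison at the source level (not something proposed in this thread, and not the ptrtoint alignment field itself): GCC and Clang provide `__builtin_assume_aligned`, which lets C code state that a pointer value is aligned. A minimal sketch, with a hypothetical function name and assuming one of those compilers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant of foo() that states the alignment explicitly. */
static uint64_t foo_assume(void *bar) {
    /* Debug-time check only; the builtin below performs no check itself. */
    assert(((uintptr_t)bar & 7) == 0);
    /* GCC/Clang builtin: promises the compiler that bar is 8-byte aligned.
       Passing a misaligned pointer here is undefined behaviour. */
    uint64_t *p = __builtin_assume_aligned(bar, 8);
    *p = 42;
    return (uint64_t)((uintptr_t)p & 3);
}
```

Whether a given compiler version actually folds the final mask to 0 is not guaranteed by this sketch; it only shows where the alignment fact would be attached.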

Ciao, Duncan.


Hi Duncan,

and instcombine adds the explicit alignment according to the langref (pref alignment).

Without an explicit alignment means the ABI alignment in the case of
loads/stores.

Yes, that second step was clear. Assuming you meant the "preferential
alignment", according to the langref.

no, I meant the ABI alignment. If the LangRef says the preferential alignment
then I'm pretty sure the LangRef is wrong!

Yep, the InstCombiner in fact uses the ABI alignment, so probably you are right and the LangRef is wrong.

Cheers,
Clemens

Hi Clemens,

and instcombine adds the explicit alignment according to the langref (pref alignment).

Without an explicit alignment means the ABI alignment in the case of
loads/stores.

Yes, that second step was clear. Assuming you meant the "preferential
alignment", according to the langref.

no, I meant the ABI alignment. If the LangRef says the preferential alignment
then I'm pretty sure the LangRef is wrong!

Yep, the InstCombiner in fact uses the ABI alignment, so probably you are right
and the LangRef is wrong.

I've corrected the LangRef.

Ciao, Duncan.