Separate AA metadata for load/store portions of memcpy

This goes back to something I asked about on IRC a few weeks ago and
promised to return to once I had time to actually work on it. For
background:

Currently, we can attach !tbaa to a memcpy, but when we do so, the
semantics apply it to *both* the load and the store part of the
memcpy. This is a significant limitation and the cause of a good
amount of lost TBAA precision in my frontend (and, I would imagine, in
Clang as well, though I have no data or experiments to back that up).
Note that while I'm mostly concerned with TBAA here, the same is true
of !noalias and !alias.scope metadata.
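
To make the limitation concrete, here is a minimal sketch of what an
annotated memcpy looks like today. The function name @copy8 and the TBAA
nodes are made up for illustration, and the intrinsic is written in the
typed-pointer, explicit-alignment form that matches the i8* syntax used
in this thread; newer LLVM releases spell it differently.

```llvm
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1)

define void @copy8(i8* %dest, i8* %src) {
entry:
  ; The single !tbaa tag is taken to describe both the read of %src
  ; and the write to %dest; there is no way to give them different tags.
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dest, i8* %src, i64 8, i32 8, i1 false), !tbaa !2
  ret void
}

; Illustrative TBAA nodes (names are assumptions, not from any frontend).
!0 = !{!"example TBAA root"}
!1 = !{!"example scalar", !0, i64 0}
!2 = !{!1, !1, i64 0}
```

If only one side of the copy has a precisely known type, the tag has to
be weakened until it is valid for both sides, which is where the
precision is lost.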

A few weeks ago, I simply hacked around this (locally) by introducing
new !tbaa_src, !tbaa_dst, !noalias_src, !noalias_dest,
!alias.scope_src, !alias.scope_dst metadata and adjusting LLVM to use
them. This was a relatively simple change, but it feels rather
unsatisfying and has the problem that the metadata kinds are now
somewhat redundant. When we discussed this on IRC, I believe Hal
suggested that we might want to consider a way to attach metadata to
function arguments (and I apologize if I am misremembering the exact
proposal), so that we could write things like:

call void @llvm.memcpy(i8* %dest !tbaa !1 !noalias !2,
                       i8* %src !tbaa !2 !noalias !3, ...)

Of course, we currently don't allow this kind of thing in the IR at
all, so this would be a pretty major change. I'd like to solicit
opinions on the best way to represent this in the IR (in particular,
whether it's worth introducing a way to annotate AA metadata on
function arguments to avoid the ugliness of introducing 2N extra
metadata tags).
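
For comparison, the separate-tag workaround would look roughly like the
sketch below. The !tbaa_src and !tbaa_dst kinds only have meaning in the
locally patched LLVM described above; stock LLVM attaches no semantics
to them. The function name and the TBAA nodes are again invented for
illustration, and the intrinsic uses the same older typed-pointer form.

```llvm
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1)

define void @copy8(i8* %dest, i8* %src) {
entry:
  ; Hypothetical metadata kinds: !tbaa_dst is meant to describe only the
  ; store to %dest, !tbaa_src only the load from %src.
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dest, i8* %src, i64 8, i32 8, i1 false), !tbaa_dst !3, !tbaa_src !4
  ret void
}

; Illustrative TBAA nodes (names are assumptions).
!0 = !{!"example TBAA root"}
!1 = !{!"example dst type", !0, i64 0}
!2 = !{!"example src type", !0, i64 0}
!3 = !{!1, !1, i64 0}
!4 = !{!2, !2, i64 0}
```

Doing the same for !noalias and !alias.scope is what leads to the 2N
extra tags mentioned above: each of the N AA metadata kinds gets a _src
and a _dst variant.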

Thanks,
Keno

Part of my motivation for suggesting this was that we already have a fair amount of duplication between attributes and metadata (e.g. we have a nonnull attribute and nonnull metadata). Does anyone else have an opinion here?

  -Hal

Hi,

I feel like the !nonnull metadata is a bit different, because the
nonnull attribute is used on declarations as well, not just on call
sites. I suppose there's nothing preventing the same from being true
for AA metadata as well, but I don't currently see a use case for that
(do you?).

Keno

Putting TBAA information on declarations *could* make sense. Regardless, we could allow it in both places; we already have metadata on globals, etc.

  -Hal