noalias and alias.scope metadata producers

Hi all,

In the LLVM language reference I read that one can use noalias and alias.scope metadata to provide more detailed information about pointer aliasing. However, I was unable to get any LLVM optimization pass or the Clang frontend to produce LLVM IR annotated with this metadata (am I missing something?).

If I understand it correctly, this information would complement the type-based alias information and whatever the alias analysis passes in LLVM compute from the input program. I was wondering whether the coverage provided by these two components is already acceptable, or whether there is work that can be done in LLVM IR clients like Clang to provide more information through proper noalias and alias.scope annotations.

Any comments?

Thanks!
Samuel

Hi Samuel,

Currently we add this metadata when inlining functions with noalias function parameters. See the function AddAliasScopeMetadata in lib/Transforms/Utils/InlineFunction.cpp.
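
For illustration, here is a minimal sketch of the kind of source that exercises this path. It assumes a compiler that accepts the __restrict extension in C++ (Clang and GCC do), and the function names axpy and caller are made up for the example. Clang lowers a restrict-qualified pointer parameter to the noalias parameter attribute. Once axpy is inlined into caller, that attribute has no parameter left to sit on, so the inliner may record the same fact as !alias.scope and !noalias metadata on the inlined loads and stores. Whether the metadata actually shows up depends on the LLVM version and the optimization level.

```cpp
#include <cstddef>

// Hypothetical callee. __restrict is a compiler extension in C++; Clang
// turns it into the noalias attribute on the pointer parameters.
static inline void axpy(double* __restrict y, const double* __restrict x,
                        double a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// Hypothetical caller. Its own parameters carry no aliasing promise, so
// after inlining the promise made by axpy has to be attached to the
// individual memory accesses instead of to function arguments.
void caller(double* out, const double* in, std::size_t n) {
    axpy(out, in, 2.0, n);
}
```

Compiling this with optimizations enabled and inspecting the emitted IR for caller is one way to look for the metadata.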

Generating this metadata directly in Clang for block-level pointers with restrict, etc. is on my TODO list, but I've not gotten there yet.

Thanks again,
Hal

Hey Samuel,

I'm not sure if this is interesting for you but maybe it is:

Polly can emit this metadata for loop nests we can analyze. It is based
on runtime alias checks, and thus on versioning. We are currently fixing
the bugs in the runtime alias check generation we merged yesterday, but
after that (or earlier, if you like) I can submit the annotation patch for
review. Some limited tests on polybench benchmarks (e.g., 3mm) without any
polyhedral optimizations, thus with only parallel and noalias annotations,
showed up to 20% improvement.
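
As a rough source-level analogue of versioning behind a runtime alias check, consider the sketch below. This is not Polly's check generation, which works on LLVM IR. The names disjoint, scale, scale_noalias and scale_mayalias are hypothetical, and __restrict is a compiler extension in C++. The idea is only that the no-alias promise is made on the version that is reached after the runtime check has passed.

```cpp
#include <cstddef>
#include <cstdint>

// Runtime alias check: true when [a, a+n) and [b, b+n) do not overlap.
// Addresses are compared as integers to avoid relational comparison of
// pointers into unrelated objects.
bool disjoint(const double* a, const double* b, std::size_t n) {
    const auto pa = reinterpret_cast<std::uintptr_t>(a);
    const auto pb = reinterpret_cast<std::uintptr_t>(b);
    const std::uintptr_t bytes = n * sizeof(double);
    return pa + bytes <= pb || pb + bytes <= pa;
}

// Version 1: only reached when the check passed, so the no-alias
// promise expressed by __restrict holds.
static void scale_noalias(double* __restrict dst,
                          const double* __restrict src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = 2.0 * src[i];
}

// Version 2: conservative fallback with no aliasing assumptions.
static void scale_mayalias(double* dst, const double* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = 2.0 * src[i];
}

// Dispatch between the two versions based on the runtime check.
void scale(double* dst, const double* src, std::size_t n) {
    if (disjoint(dst, src, n))
        scale_noalias(dst, src, n);
    else
        scale_mayalias(dst, src, n);
}
```

Both versions compute the same result; only the first one gives the optimizer the extra freedom.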

Best regards,
  Johannes

Hal, Johannes,

Thanks for the feedback. I have been digging into this a little bit more and was able to get some of this metadata generated. Nevertheless, I am confused about its semantics. Let me explain:

I was expecting the alias metadata to complement the information that alias analysis passes compute. However, it seems that the alias information of the pointers used in memory instructions is assumed to be different from the information of these instructions themselves. Let me give you an example:

MayAlias: double* %arrayidx.i, double* %arrayidx6
MayAlias: %4 = load double* %arrayidx.i, align 8, !tbaa !1, !alias.scope !7, !noalias !10 <-> store double %2, double* %arrayidx6, align 8, !tbaa !1

becomes:

MayAlias: double* %arrayidx.i, double* %arrayidx6
NoAlias: %4 = load double* %arrayidx.i, align 8, !tbaa !1, !alias.scope !8, !noalias !5 <-> store double %2, double* %arrayidx6, align 8, !tbaa !1, !alias.scope !5, !noalias !8

after I annotate the store using arrayidx6 as not aliasing with the load using arrayidx.i. Shouldn’t the alias information of the memory instructions be propagated to the used pointers by the alias analysis pass? Is this something that was not implemented (if so I’d be happy to work on this) or is my interpretation of the semantics wrong?

Thanks again!
Samuel

From: "Samuel F Antao" <sfantao@us.ibm.com>
To: "Johannes Doerfert" <doerfert@cs.uni-saarland.de>
Cc: "Tobias Grosser" <tobias@grosser.es>, "Samuel F Antao" <sfantao@us.ibm.com>, "LLVM Dev" <llvmdev@cs.uiuc.edu>
Sent: Wednesday, September 24, 2014 3:28:25 PM
Subject: Re: [LLVMdev] noalias and alias.scope metadata producers

> Shouldn't the alias information of the memory instructions be
> propagated to the used pointers by the alias analysis pass?

No, LLVM's current AA infrastructure does not do this kind of backward inference from accesses to their pointers (you won't get this from TBAA metadata either). Do you feel this would be useful?
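
To make the distinction concrete, here is a toy model in C++. It is not LLVM code: Access, ScopeSet, aliasPointers and aliasAccesses are made-up names, and everything is collapsed into a single scope domain. It assumes the rule as described in the language reference, namely that two accesses are treated as NoAlias when the alias.scope list of one is covered by the noalias list of the other. The pointer query never sees the metadata, which is why it stays at MayAlias.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Toy stand-ins, not LLVM classes. One scope domain only.
using ScopeSet = std::set<std::string>;

struct Access {
    std::string pointer;   // name of the address operand
    ScopeSet aliasScope;   // models !alias.scope on the instruction
    ScopeSet noAlias;      // models !noalias on the instruction
};

enum class Result { MayAlias, NoAlias };

// Pointer query: only the two values are visible, no metadata. This toy
// knows nothing about the pointers, so it always answers MayAlias.
Result aliasPointers(const std::string& /*p*/, const std::string& /*q*/) {
    return Result::MayAlias;
}

// True when every scope in 'scopes' also appears in 'noalias'.
static bool covered(const ScopeSet& scopes, const ScopeSet& noalias) {
    return !scopes.empty() &&
           std::includes(noalias.begin(), noalias.end(),
                         scopes.begin(), scopes.end());
}

// Access query: the metadata on the instructions is consulted first,
// then the query falls back to what is known about the pointers.
Result aliasAccesses(const Access& a, const Access& b) {
    if (covered(a.aliasScope, b.noAlias) || covered(b.aliasScope, a.noAlias))
        return Result::NoAlias;
    return aliasPointers(a.pointer, b.pointer);
}
```

In this model, annotating the store changes the answer for the load/store pair but leaves the answer for the two pointers untouched, which matches the output quoted above.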

-Hal

Hi Hal,

Thanks for the clarification.

From: "Samuel F Antao" <sfantao@us.ibm.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Johannes Doerfert" <doerfert@cs.uni-saarland.de>, "LLVM Dev" <llvmdev@cs.uiuc.edu>, "Tobias Grosser" <tobias@grosser.es>
Sent: Friday, September 26, 2014 11:14:01 AM
Subject: Re: [LLVMdev] noalias and alias.scope metadata producers

> To: Samuel F Antao/Watson/IBM@IBMUS
> Cc: Tobias Grosser <tobias@grosser.es>, LLVM Dev
> <llvmdev@cs.uiuc.edu>, Johannes Doerfert
> <doerfert@cs.uni-saarland.de>
> Date: 09/24/2014 06:42 PM
> Subject: Re: [LLVMdev] noalias and alias.scope metadata producers
>
> No, LLVM's current AA infrastructure does not do this kind of
> backward inference from accesses to their pointers (you won't get
> this from TBAA metadata either). Do you feel this would be useful?
>

My feeling is that AliasAnalysis would be able to compute more
accurate results if it propagated the information in the metadata. I
see two advantages in doing that:

From a development viewpoint, it would be nice to have the alias
information centralized in a single class and avoid more complex
schemes to combine metadata and analysis information. E.g. LICM
gets the alias analysis, loads the metadata and forwards the
information to the tracker that combines both. Maybe there is a good
reason to use this scheme that I am not aware of, but using AA only
would be more elegant and easier to maintain.

From a performance viewpoint, the ability to make AA as accurate as
possible by propagating the metadata information may result in
significant improvements. In particular, LLVM IR clients that can
infer a lot of alias information from given language properties
can use that to heavily annotate the code. In those cases AA
would be almost irrelevant, as the metadata contains most of the
information.

What do you think about this? If this looks like a good direction
to go, I'd be happy to help with it.

I think that the benefit is very unclear; we derive software maintainability benefits from the current modular system, and even within this system, the different AA passes can feed information to each other.

Most users of AA are really interested in whether they can reorder, or prove redundant, some loads or stores or function calls, making the results returned for the pointers themselves somewhat secondary. If you'd like to argue for a different setup, you'd need to provide some clear use cases that the current infrastructure fails to address.

Thanks again,
Hal