Function attributes for LibFunc and its impact on GlobalsAA

Hi,

GlobalsAA, while propagating mod-ref behavior through the call graph, inspects the attributes of library functions (the F->isDeclaration() check in GlobalsAAResult::AnalyzeCallGraph). If such a function does not have the onlyReadsMemory attribute set, GlobalsAA gives up on it and keeps no mod-ref information for it.

I noticed that library functions such as malloc/realloc do not have the doesNotAccessMemory or onlyReadsMemory attributes set (FunctionAttrs.cpp). This leads to a loss of GlobalsAA information in the caller (and its caller and so on). Isn’t the current treatment stricter than necessary? I do not see why the presence of malloc/realloc in a function needs to invalidate all mod-ref info gathered for that function so far.
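To make the concern concrete, here is a toy model in C of the propagation described above. It is only a sketch: the struct, its fields, and the analyze and top_keeps_info functions are hypothetical and do not mirror the real GlobalsAA data structures. The only behavior taken from the description is that a declaration without a read-only attribute loses its info, and that this loss spreads to every caller up the chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CALLEES 4

/* Hypothetical, simplified stand-in for a call-graph node. */
struct func {
    bool is_declaration;     /* body not visible, e.g. a library function */
    bool only_reads_memory;  /* models the onlyReadsMemory attribute */
    bool info_kept;          /* result: is mod-ref info still tracked? */
    struct func *callees[MAX_CALLEES];
    size_t num_callees;
};

/* Analyze f after all of its callees (bottom-up order).
 * A declaration keeps its info only if it is read-only; a defined
 * function loses its info as soon as one callee has lost it. */
static bool analyze(struct func *f) {
    assert(f != NULL && f->num_callees <= MAX_CALLEES);
    if (f->is_declaration) {
        f->info_kept = f->only_reads_memory;
        return f->info_kept;
    }
    f->info_kept = true;
    for (size_t i = 0; i < f->num_callees; i++) {
        if (!f->callees[i]->info_kept)
            f->info_kept = false;
    }
    return f->info_kept;
}

/* Build the chain top -> helper -> malloc_like and report whether
 * top still has mod-ref info after propagation. */
static bool top_keeps_info(bool malloc_is_read_only) {
    struct func malloc_like = { .is_declaration = true,
                                .only_reads_memory = malloc_is_read_only };
    struct func helper = { .callees = { &malloc_like }, .num_callees = 1 };
    struct func top = { .callees = { &helper }, .num_callees = 1 };
    analyze(&malloc_like);
    analyze(&helper);
    return analyze(&top);
}
```

Calling top_keeps_info(false) models the current situation: a single malloc-like declaration two levels down is enough for top to lose its info.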

Please let me know if the question is not clear. I’ll try to extract out a simple test case from the program I’m looking at and post it, so as to have a concrete example.

Thanks,

Hi Vaivaswatha,

I think not adding readnone/readonly to malloc/realloc is correct. malloc/free hooks can be added to most implementations (for leak checking and so on), so calling malloc could in fact call any other arbitrary code that could write to memory.

Cheers,

James

Hi James,

Thank you for the response. I understand the concern about malloc/free hooks. Could we detect that a program has set up malloc hooks (assuming we’re in a whole program compilation) and, when it hasn’t, make assumptions such as onlyAccessesArgMem()? A command line flag could be another option.

I’m currently working on a program where having these attributes could help GlobalsAA give significantly more precise results. Considering that this info is propagated to the caller, its caller and so on, this may have a wider impact than the program I’m currently looking at.

Thanks,

Hi,

I think that might be difficult to detect. If you wanted to force this behaviour in your own toolchain, you could just use “-mllvm -force-attribute=malloc:readnone” on the clang command line?

James

From: "James Molloy via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Vaivaswatha Nagaraj" <vn@compilertree.com>
Cc: "LLVM Dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, December 3, 2015 4:41:46 AM
Subject: Re: [llvm-dev] Function attributes for LibFunc and its impact on GlobalsAA

>Hi,
>
>I think that might be difficult to detect. If you wanted to force
>this behaviour in your own toolchain, you could just use "-mllvm
>-force-attribute=malloc:readnone" on the clang command line?

This is unlikely to be desirable. A readnone function is one whose output is a function only of its inputs, and if you have this:

  int *x = malloc(4);
  *x = 2;
  int *y = malloc(4);
  *y = 4;

you certainly don't want EarlyCSE to replace the second call to malloc with the result of the first (which it will happily do if you mark malloc as readnone).
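The hazard can be reproduced by hand. The sketch below contrasts the original sequence with what the code would effectively become if the second call were replaced by the result of the first. The function names are made up for illustration, and the merged variant is written out manually rather than produced by any compiler pass.

```c
#include <assert.h>
#include <stdlib.h>

/* Original program: two independent allocations.
 * Returns the final value of *x, or -1 if an allocation fails. */
static int two_allocations(void) {
    int *x = malloc(sizeof *x);
    if (!x)
        return -1;
    *x = 2;
    int *y = malloc(sizeof *y);
    if (!y) {
        free(x);
        return -1;
    }
    *y = 4;
    int result = *x;   /* x and y are distinct objects, so *x is still 2 */
    free(x);
    free(y);
    return result;
}

/* What the program would mean if the second malloc call were
 * replaced by the result of the first one. */
static int merged_allocations(void) {
    int *x = malloc(sizeof *x);
    if (!x)
        return -1;
    *x = 2;
    int *y = x;        /* second call folded into the first */
    *y = 4;            /* now overwrites *x */
    int result = *x;   /* 4, not 2 */
    free(x);           /* only one real allocation exists */
    return result;
}
```

The first function yields 2 and the second yields 4, so the replacement changes observable behavior.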

readonly is a more-interesting question, because, in practice, this will currently work. It works, however, for the wrong reason (as I recall, we currently don't CSE readonly calls because we need to assume that they might have infinite loops, which is a problem we need to otherwise fix). Thus, marking it readonly is probably not a good long-term plan.

Given that malloc is an important special case, however, giving it special handling is potentially reasonable (we have isMallocLikeFn and isOperatorNewLikeFn in MemoryBuiltins.h). One might argue that tagging malloc() as readonly might break an LTO build, but doing so already potentially has problems because of our aliasing assumptions. malloc hooks are an interesting point, but they are not standard, are not commonly used, and can already cause violations of our aliasing assumptions. Moreover, the underlying problem is not unique to malloc: hooking any libc function that we assume is readonly, in a way that changes state visible to the caller, might break things. Users always have the option of turning off these kinds of assumptions by compiling with -fno-builtin-malloc.

-Hal

Hi Hal,

>malloc hooks are an interesting point, but those are not standard, not commonly used

I very much agree with this.

>readonly is a more-interesting question, because, in practice, this will currently work. It works, however, for the wrong reason

Rather than read-only, could we mark malloc/free etc with onlyAccessesArgMem()? GlobalsAA would just need a simple check to ignore such functions (along with read-only which it already is checking for) during propagation along the call graph. As a reference, I’m attaching a prototype patch (This patch is on release37 unfortunately, but is applicable verbatim to the latest svn version).

Thanks,

globalsaa-malloc.diff (81.2 KB)

From: "Vaivaswatha Nagaraj" <vn@compilertree.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "LLVM Dev" <llvm-dev@lists.llvm.org>, "James Molloy" <james@jamesmolloy.co.uk>
Sent: Thursday, December 3, 2015 9:20:16 AM
Subject: Re: [llvm-dev] Function attributes for LibFunc and its impact on GlobalsAA

>Hi Hal,
>
>>malloc hooks are an interesting point, but those are not standard,
>>not commonly used
>I very much agree with this.
>
>>readonly is a more-interesting question, because, in practice, this
>>will currently work. It works, however, for the wrong reason
>Rather than read-only, could we mark malloc/free etc with
>onlyAccessesArgMem()? GlobalsAA would just need a simple check to
>ignore such functions (along with read-only which it already is
>checking for) during propagation along the call graph. As a
>reference, I'm attaching a prototype patch (This patch is on
>release37 unfortunately, but is applicable verbatim to the latest
>svn version).

The semantics here are not quite right. These functions don't only access memory based on their pointer arguments, but also other global state. It just happens to be that this global state is not something with which any accesses in user code can alias.

For example, in your patch, you're adding argmemonly to printf. This is not right:

void foo(char * restrict s1, char * restrict s2) {
  printf(s1);
  printf(s2);
}

If printf is argmemonly, then we could interchange the two printf calls, which would change the order of the output. For malloc this is still a problem in the following sense. If we have:

  p1 = malloc(really_big);
  ...
  free(p1);

  p2 = malloc(really_big);
  ...
  free(p2);

allowing a transformation into:

  p1 = malloc(really_big);
  p2 = malloc(really_big);
  ...
  free(p1); free(p2);

could be problematic.
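One way to see why the reordering matters is to track usage against a fixed budget. The following is a toy model, not a real allocator: the heap struct and the budget_alloc, budget_free, run_original, and run_reordered helpers are hypothetical and only count bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical byte-counting heap with a hard capacity. */
struct heap {
    size_t capacity;
    size_t in_use;
};

/* Reserve n bytes; returns false if the budget would be exceeded. */
static bool budget_alloc(struct heap *h, size_t n) {
    assert(h != NULL);
    if (n > h->capacity - h->in_use)
        return false;
    h->in_use += n;
    return true;
}

/* Give n bytes back to the budget. */
static void budget_free(struct heap *h, size_t n) {
    assert(h != NULL && n <= h->in_use);
    h->in_use -= n;
}

/* Original order: alloc, free, alloc, free. */
static bool run_original(size_t capacity, size_t really_big) {
    struct heap h = { capacity, 0 };
    if (!budget_alloc(&h, really_big))
        return false;
    budget_free(&h, really_big);
    if (!budget_alloc(&h, really_big))
        return false;
    budget_free(&h, really_big);
    return true;
}

/* Reordered: both allocations are live at the same time. */
static bool run_reordered(size_t capacity, size_t really_big) {
    struct heap h = { capacity, 0 };
    if (!budget_alloc(&h, really_big))
        return false;
    if (!budget_alloc(&h, really_big)) {
        budget_free(&h, really_big);
        return false;
    }
    budget_free(&h, really_big);
    budget_free(&h, really_big);
    return true;
}
```

With a capacity of 100 and a request of 60, run_original succeeds while run_reordered fails on the second allocation, because the reordered sequence needs twice the peak memory.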

I understand what you're trying to do, but I think you need to introduce a new attribute to do it. We don't currently have one that fits. You kind of want something like argmem_and_inaccessible_state_only (bikeshedding aside). I'm in favor of adding such an attribute; it seems like it could be applied to many things. I suggest you precisely define the semantics you want in a new RFC, give some examples of the functions to which it might apply, and discuss where/how it will be used by analysis/transformations.
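As a rough illustration of the distinction, here is a small C model of three kinds of callee summaries. Everything in it is an assumption made for the sketch: the effects struct, the three constants (one named after the attribute suggested above), and the two query functions are hypothetical and are not LLVM APIs. Argument memory is assumed disjoint between calls, as with the restrict-qualified pointers in the printf example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what a call may touch besides the
 * memory reachable from its pointer arguments. */
struct effects {
    bool inaccessible_state; /* state user code cannot alias, e.g.
                                allocator or stdio internals */
    bool user_globals;       /* globals user code can read or write */
};

static const struct effects ARGMEM_ONLY = { false, false };
static const struct effects ARGMEM_AND_INACCESSIBLE = { true, false };
static const struct effects UNKNOWN = { true, true };

/* Two calls may be swapped only if they cannot both touch the same
 * class of non-argument memory. */
static bool may_reorder(struct effects a, struct effects b) {
    bool share_hidden = a.inaccessible_state && b.inaccessible_state;
    bool share_globals = a.user_globals && b.user_globals;
    return !share_hidden && !share_globals;
}

/* Does a caller's knowledge about user globals survive the call? */
static bool preserves_global_info(struct effects callee) {
    return !callee.user_globals;
}
```

In this model, two ARGMEM_ONLY calls may be reordered, which is the wrong answer for printf. Two ARGMEM_AND_INACCESSIBLE calls may not be reordered, yet such a callee still preserves information about user globals, which is the property GlobalsAA would want to exploit.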

-Hal

>Given that malloc is an important special case, however, giving it special handling is potentially reasonable (we have isMallocLikeFn and isOperatorNewLikeFn in MemoryBuiltins.h). One might argue that tagging malloc() as readonly might break an LTO build, but doing so already potentially has problems because of our aliasing assumptions. malloc hooks are an interesting point, but those are not standard, not commonly used, can already cause violations of our aliasing assumptions, and the problem that hooking libc functions that we assume are readonly in a way that changes state visible to the caller might break things is not unique to malloc. Users always have the option of turning off these kinds of assumptions by compiling with -fno-builtin-malloc.

Side note: Currently, clang does not support -fno-builtin-foo options. However, I posted a patch earlier today to resolve this longstanding bug. Feel free to give it a look. :)

Chad