[RFC] Adding a globalmemonly attribute for functions that only access global memory

Hello everyone,

We propose the addition of a GlobalMemOnly attribute to LLVM.

Definition:

This attribute indicates that the only memory accessed inside a function, that is visible to the callee, is global memory. If the function reads or writes other memory that is visible to the callee, the behavior is undefined.

Motivation:

Memory attributes already exist for specifying the memory locations accessed by a function, such as ArgMemOnly and InacessibleMemOnly. Similarly, there are attributes for describing memory behaviour, such as ReadOnly and WriteOnly.

However, no attributes exist for differentiating the types of memory objects that functions can access, such as global objects and objects created by alloca. We would like to start by introducing an attribute to deal with functions that only access global objects.

The addition of the GlobalMemOnly attribute can benefit LLVM as follows:

  1. Improves Alias Analysis, since this information can be used to improve the determination of aliasing between functions and memory locations that point to global objects.

  2. Provides Transforms with greater information: many math lib calls only modify the global variable errno. Unnecessary loads of the call arguments can therefore be removed, since the calls will have no side-effects on them.

Patches Overview:

  1. Definition of GlobalMemOnly in the AsmParser, IR, and LLVM Bitcode. It is also defined in the LangRef.

https://reviews.llvm.org/D109644

  1. Support is added to Alias Analysis. If a function is GlobalMemOnly and a memory location does not point to a global object, getModRefInfo can infer that no aliasing between the function and the location can occur. This patch also adds a regression test for BasicAA.

https://reviews.llvm.org/D109647

  1. Certain math lib calls are assigned GlobalMemOnly, since they only modify the global variable errno. These calls include common functions, such as atan, exp, and log.

https://reviews.llvm.org/D109648

Please let us know of any thoughts you may have regarding this attribute’s implementation and use cases. We believe that it can strengthen LLVM’s alias analysis and optimization passes if added.

Thanks,
Mugilan

Hi Mugilan,

I think this is an interesting direction. I’ve recently put some thought into the memory attributes you mention for unrelated reasons, and found that the various XYZonly attributes don’t compose well, while noXYZ attributes compose better.The reason is simple. With what you’re proposing, we end up with:

  • argmemonly
  • inaccessiblememonly
  • inaccessibleorargmemonly
  • globalmemonly

There are 4 attributes talking about 3 types of memory. Clearly, there are orthogonality issues. First of all, glancing at your patch, your new attribute should really be called inaccessibleorglobalmemonly. What if somebody wants argorglobalmemonly? Do they add yet another attribute? That way lies madness.

If we instead take a step back, we can recognize that there are different “paths” of accessing memory:

  • arg
  • global
  • inaccessible
  • indirect (i.e., via a pointer loaded from memory)
  • other (e.g., pointer returned by black box function)

These could map very nicely onto noXYZ attributes. You end up 5 attributes: noargmem, noglobalmem, noinaccessiblemem, noindirectmem, noothermem. Nicely composable, in particular today’s attribute are mapped as follows:

  • argmemonly → noglobalmem noinaccessiblemem noindirectmem noothermem
  • inaccessiblememonly → noargmem noglobalmem noindirectmem noothermem
  • inaccessibleorargmemonly → noglobalmem noindirectmem noothermem
  • globalmemonly → noargmem noindirectmem noothermem (remember, you seem to want to allow access to inaccessible memory here)

More importantly, the 5 attributes can express any future desired combination, e.g. a function that accesses memory through argument pointers, and through pointers loaded from there (transitively) is: noglobalmem noinaccessiblemem noothermem

Thoughts?

Cheers,
Nicolai

Hi Nicolai,

Hi Mugilan,

I think this is an interesting direction. I've recently put some thought
into the memory attributes you mention for unrelated reasons, and found
that the various XYZonly attributes don't compose well, while noXYZ
attributes compose better.The reason is simple. With what you're proposing,
we end up with:

* argmemonly
* inaccessiblememonly
* inaccessibleorargmemonly
* globalmemonly

There are 4 attributes talking about 3 types of memory. Clearly, there are
orthogonality issues. First of all, glancing at your patch, your new
attribute should really be called inaccessibleorglobalmemonly. What if
somebody wants argorglobalmemonly? Do they add yet another attribute? That
way lies madness.

Agreed. We actually wanted to use this to make the point that we
need to switch up the scheme :slight_smile:

That said, I think the handling of globalmemonly should be discussed
and merged (after refinement) as it will be reused whatever the later
scheme will be.

If we instead take a step back, we can recognize that there are different
"paths" of accessing memory:

* arg
* global
* inaccessible
* indirect (i.e., via a pointer loaded from memory)
* other (e.g., pointer returned by black box function)

These could map very nicely onto noXYZ attributes. You end up 5 attributes:
noargmem, noglobalmem, noinaccessiblemem, noindirectmem, noothermem. Nicely
composable, in particular today's attribute are mapped as follows:

* argmemonly -> noglobalmem noinaccessiblemem noindirectmem noothermem
* inaccessiblememonly -> noargmem noglobalmem noindirectmem noothermem
* inaccessibleorargmemonly -> noglobalmem noindirectmem noothermem
* globalmemonly -> noargmem noindirectmem noothermem (remember, you seem to
want to allow access to inaccessible memory here)

More importantly, the 5 attributes can express any future desired
combination, e.g. a function that accesses memory through argument
pointers, and through pointers loaded from there (transitively) is:
noglobalmem noinaccessiblemem noothermem

Thoughts?

I was thinking we go with a bitfield but otherwise I'm with you. This
is the Attributor AAMemoryLocation encoding:

     /// Encoding of different locations that could be accessed by a memory
     /// access.
     enum {
      ALL_LOCATIONS = 0,
      NO_LOCAL_MEM = 1 << 0,
      NO_CONST_MEM = 1 << 1,
      NO_GLOBAL_INTERNAL_MEM = 1 << 2,
      NO_GLOBAL_EXTERNAL_MEM = 1 << 3,
      NO_GLOBAL_MEM = NO_GLOBAL_INTERNAL_MEM | NO_GLOBAL_EXTERNAL_MEM,
      NO_ARGUMENT_MEM = 1 << 4,
      NO_INACCESSIBLE_MEM = 1 << 5,
      NO_MALLOCED_MEM = 1 << 6,
      NO_UNKOWN_MEM = 1 << 7,
      NO_LOCATIONS = NO_LOCAL_MEM | NO_CONST_MEM | NO_GLOBAL_INTERNAL_MEM |
                     NO_GLOBAL_EXTERNAL_MEM | NO_ARGUMENT_MEM |
                     NO_INACCESSIBLE_MEM | NO_MALLOCED_MEM | NO_UNKOWN_MEM,
     }

I was thinking to use this, or a variation of that, as the argument of an attribute. Thus,
`access_locations(~(NO_GLOBAL_MEM | NO_ARGUMENT_MEM))`
to express argument and global memory is potentially accessed.

That said, I *also* want us to allow to use this interface for "special memory locations".
What I mean is for example a special "location" for things like intrinsics:
`llvm.assume` writes only to bit "LLVM_ASSUME_MEM"
`llvm.launder.invariant.group` writes only to bit "LLVM_LAUNDER_INVARIANT_GROUP" (D109548)
...
This way they stop to interfere with each other and other effects.

I hope this makes some sense.

~ Johannes

+1 to the overall proposal. The use case makes sense and is worth handling. Given the immediate benefit appears to mostly errno focused, I might have gone with a more specific errnomemonly flag, but since this proposal is broadly applicable and appears to solve enough of the errno problem to be useful, this seems fine too.

Nicolai raises an important point re: inaccessible memory that needs clarified in your patches. I think this is mostly a naming and/or documentation clarity point though.

I happen to agree with both Nicolai and Johannes that we need to invert our current memory attribute structure, but as with Johannes, I would not want to make that blocking for this proposal.

Philip

Hi Nicolai,

Hi Mugilan,

I think this is an interesting direction. I've recently put some thought
into the memory attributes you mention for unrelated reasons, and found
that the various XYZonly attributes don't compose well, while noXYZ
attributes compose better.The reason is simple. With what you're proposing,
we end up with:

* argmemonly
* inaccessiblememonly
* inaccessibleorargmemonly
* globalmemonly

There are 4 attributes talking about 3 types of memory. Clearly, there are
orthogonality issues. First of all, glancing at your patch, your new
attribute should really be called inaccessibleorglobalmemonly. What if
somebody wants argorglobalmemonly? Do they add yet another attribute? That
way lies madness.

Agreed. We actually wanted to use this to make the point that we
need to switch up the scheme :slight_smile:

That said, I think the handling of globalmemonly should be discussed
and merged (after refinement) as it will be reused whatever the later
scheme will be.

If we instead take a step back, we can recognize that there are different
"paths" of accessing memory:

* arg
* global
* inaccessible
* indirect (i.e., via a pointer loaded from memory)
* other (e.g., pointer returned by black box function)

These could map very nicely onto noXYZ attributes. You end up 5 attributes:
noargmem, noglobalmem, noinaccessiblemem, noindirectmem, noothermem. Nicely
composable, in particular today's attribute are mapped as follows:

* argmemonly -> noglobalmem noinaccessiblemem noindirectmem noothermem
* inaccessiblememonly -> noargmem noglobalmem noindirectmem noothermem
* inaccessibleorargmemonly -> noglobalmem noindirectmem noothermem
* globalmemonly -> noargmem noindirectmem noothermem (remember, you seem to
want to allow access to inaccessible memory here)

More importantly, the 5 attributes can express any future desired
combination, e.g. a function that accesses memory through argument
pointers, and through pointers loaded from there (transitively) is:
noglobalmem noinaccessiblemem noothermem

Thoughts?

I was thinking we go with a bitfield but otherwise I'm with you. This
is the Attributor AAMemoryLocation encoding:

    /// Encoding of different locations that could be accessed by a memory
    /// access.
    enum {
     ALL_LOCATIONS = 0,
     NO_LOCAL_MEM = 1 << 0,
     NO_CONST_MEM = 1 << 1,
     NO_GLOBAL_INTERNAL_MEM = 1 << 2,
     NO_GLOBAL_EXTERNAL_MEM = 1 << 3,
     NO_GLOBAL_MEM = NO_GLOBAL_INTERNAL_MEM | NO_GLOBAL_EXTERNAL_MEM,
     NO_ARGUMENT_MEM = 1 << 4,
     NO_INACCESSIBLE_MEM = 1 << 5,
     NO_MALLOCED_MEM = 1 << 6,
     NO_UNKOWN_MEM = 1 << 7,
     NO_LOCATIONS = NO_LOCAL_MEM | NO_CONST_MEM | NO_GLOBAL_INTERNAL_MEM |
                    NO_GLOBAL_EXTERNAL_MEM | NO_ARGUMENT_MEM |
                    NO_INACCESSIBLE_MEM | NO_MALLOCED_MEM | NO_UNKOWN_MEM,
    }

I was thinking to use this, or a variation of that, as the argument of an attribute. Thus,
`access_locations(~(NO_GLOBAL_MEM | NO_ARGUMENT_MEM))`
to express argument and global memory is potentially accessed.

That said, I *also* want us to allow to use this interface for "special memory locations".
What I mean is for example a special "location" for things like intrinsics:
`llvm.assume` writes only to bit "LLVM_ASSUME_MEM"
`llvm.launder.invariant.group` writes only to bit "LLVM_LAUNDER_INVARIANT_GROUP" (D109548)
...
This way they stop to interfere with each other and other effects.

This last piece reads like a separate change to me. I wouldn't want to tie the inversion with this extension. Having said that, let me bike shed your separate proposal. :slight_smile:

If we had an attribute which allowed a specific list of globals potentially aliased, and introduced a set of predefined symbolic global names, we could solve this use case and also handle the more general "but this function has a known aliasing set!" use case as well. In pseudo IR, consider:

@G = external global i32

declare void @restricted() globalmayalias(@G, @llvm.assume_control)

In fact, the original motivation for the globalmemonly proposal could be alternatively solved with:

@errno = external global i32

declare void @mathfunc(float, float) globalmayalias(@errno)

How would that work, since errno generally isn’t a global (or a compiler-supported thread-local)? It’s typically something along the lines of:
int* __errno_location(void) attribute((const));
#define errno *__errno_location()

Technically, we can make up locations as long as we are consistent.

Hi Nicolai,

Hi Mugilan,

I think this is an interesting direction. I've recently put some thought
into the memory attributes you mention for unrelated reasons, and found
that the various XYZonly attributes don't compose well, while noXYZ
attributes compose better.The reason is simple. With what you're proposing,
we end up with:

* argmemonly
* inaccessiblememonly
* inaccessibleorargmemonly
* globalmemonly

There are 4 attributes talking about 3 types of memory. Clearly, there are
orthogonality issues. First of all, glancing at your patch, your new
attribute should really be called inaccessibleorglobalmemonly. What if
somebody wants argorglobalmemonly? Do they add yet another attribute? That
way lies madness.

Agreed. We actually wanted to use this to make the point that we
need to switch up the scheme :slight_smile:

That said, I think the handling of globalmemonly should be discussed
and merged (after refinement) as it will be reused whatever the later
scheme will be.

If we instead take a step back, we can recognize that there are different
"paths" of accessing memory:

* arg
* global
* inaccessible
* indirect (i.e., via a pointer loaded from memory)
* other (e.g., pointer returned by black box function)

These could map very nicely onto noXYZ attributes. You end up 5 attributes:
noargmem, noglobalmem, noinaccessiblemem, noindirectmem, noothermem. Nicely
composable, in particular today's attribute are mapped as follows:

* argmemonly -> noglobalmem noinaccessiblemem noindirectmem noothermem
* inaccessiblememonly -> noargmem noglobalmem noindirectmem noothermem
* inaccessibleorargmemonly -> noglobalmem noindirectmem noothermem
* globalmemonly -> noargmem noindirectmem noothermem (remember, you seem to
want to allow access to inaccessible memory here)

More importantly, the 5 attributes can express any future desired
combination, e.g. a function that accesses memory through argument
pointers, and through pointers loaded from there (transitively) is:
noglobalmem noinaccessiblemem noothermem

Thoughts?

I was thinking we go with a bitfield but otherwise I'm with you. This
is the Attributor AAMemoryLocation encoding:

    /// Encoding of different locations that could be accessed by a memory
    /// access.
    enum {
     ALL_LOCATIONS = 0,
     NO_LOCAL_MEM = 1 << 0,
     NO_CONST_MEM = 1 << 1,
     NO_GLOBAL_INTERNAL_MEM = 1 << 2,
     NO_GLOBAL_EXTERNAL_MEM = 1 << 3,
     NO_GLOBAL_MEM = NO_GLOBAL_INTERNAL_MEM | NO_GLOBAL_EXTERNAL_MEM,
     NO_ARGUMENT_MEM = 1 << 4,
     NO_INACCESSIBLE_MEM = 1 << 5,
     NO_MALLOCED_MEM = 1 << 6,
     NO_UNKOWN_MEM = 1 << 7,
     NO_LOCATIONS = NO_LOCAL_MEM | NO_CONST_MEM | NO_GLOBAL_INTERNAL_MEM |
                    NO_GLOBAL_EXTERNAL_MEM | NO_ARGUMENT_MEM |
                    NO_INACCESSIBLE_MEM | NO_MALLOCED_MEM | NO_UNKOWN_MEM,
    }

I was thinking to use this, or a variation of that, as the argument of an attribute. Thus,
`access_locations(~(NO_GLOBAL_MEM | NO_ARGUMENT_MEM))`
to express argument and global memory is potentially accessed.

That said, I *also* want us to allow to use this interface for "special memory locations".
What I mean is for example a special "location" for things like intrinsics:
`llvm.assume` writes only to bit "LLVM_ASSUME_MEM"
`llvm.launder.invariant.group` writes only to bit "LLVM_LAUNDER_INVARIANT_GROUP" (D109548)
...
This way they stop to interfere with each other and other effects.

This last piece reads like a separate change to me. I wouldn't want to tie the inversion with this extension.

My hope was to do this step by step.

So actually advance something with this patch set and start the discussion for a generalization.

If we decide right away to change the global handling we might skip parts of this patch set though.

Having said that, let me bike shed your separate proposal. :slight_smile:

If we had an attribute which allowed a specific list of globals potentially aliased, and introduced a set of predefined symbolic global names, we could solve this use case and also handle the more general "but this function has a known aliasing set!" use case as well. In pseudo IR, consider:

@G = external global i32

declare void @restricted() globalmayalias(@G, @llvm.assume_control)

In fact, the original motivation for the globalmemonly proposal could be alternatively solved with:

@errno = external global i32

declare void @mathfunc(float, float) globalmayalias(@errno)

I like this, it solves almost all the "global memory" use cases with more precision than the proposal here.
IIRC, this was what I proposed to Mugilan first but we figured it might be easier to do it in multiple steps.

That said, I'd also want us to use explicit enumartations or `argmemonly` and `inaccessiblememonly`:
`accessed_args(/* ArgNo */ 0, /* ArgNo */ 3)`
`accessed_inaccessible_mem(/* ALL */ -1)`
`accessed_inaccessible_mem(/* llvm.assume_control */ 0x01)`

Maybe if we have globalmayalias and the "control" bits are internal we might not need inaccessible memory anymore, that would be even better.

~ Johannes

Didn't actually know that. Johannes point about consistent lies does apply, but I hadn't had that in mind when I wrote this. We might also be able to play games with constexpr-like functions being "similar" to globals, but honestly, I'd probably just take a different approach for this particular problem. Like, say, the original RFC in this thread.

Philip

Hi Nicolai,

Hi Mugilan,

I think this is an interesting direction. I've recently put some thought
into the memory attributes you mention for unrelated reasons, and found
that the various XYZonly attributes don't compose well, while noXYZ
attributes compose better.The reason is simple. With what you're proposing,
we end up with:

* argmemonly
* inaccessiblememonly
* inaccessibleorargmemonly
* globalmemonly

There are 4 attributes talking about 3 types of memory. Clearly, there are
orthogonality issues. First of all, glancing at your patch, your new
attribute should really be called inaccessibleorglobalmemonly. What if
somebody wants argorglobalmemonly? Do they add yet another attribute? That
way lies madness.

Agreed. We actually wanted to use this to make the point that we
need to switch up the scheme :slight_smile:

That said, I think the handling of globalmemonly should be discussed
and merged (after refinement) as it will be reused whatever the later
scheme will be.

If we instead take a step back, we can recognize that there are different
"paths" of accessing memory:

* arg
* global
* inaccessible
* indirect (i.e., via a pointer loaded from memory)
* other (e.g., pointer returned by black box function)

These could map very nicely onto noXYZ attributes. You end up 5 attributes:
noargmem, noglobalmem, noinaccessiblemem, noindirectmem, noothermem. Nicely
composable, in particular today's attribute are mapped as follows:

* argmemonly -> noglobalmem noinaccessiblemem noindirectmem noothermem
* inaccessiblememonly -> noargmem noglobalmem noindirectmem noothermem
* inaccessibleorargmemonly -> noglobalmem noindirectmem noothermem
* globalmemonly -> noargmem noindirectmem noothermem (remember, you seem to
want to allow access to inaccessible memory here)

More importantly, the 5 attributes can express any future desired
combination, e.g. a function that accesses memory through argument
pointers, and through pointers loaded from there (transitively) is:
noglobalmem noinaccessiblemem noothermem

Thoughts?

I was thinking we go with a bitfield but otherwise I'm with you. This
is the Attributor AAMemoryLocation encoding:

    /// Encoding of different locations that could be accessed by a memory
    /// access.
    enum {
     ALL_LOCATIONS = 0,
     NO_LOCAL_MEM = 1 << 0,
     NO_CONST_MEM = 1 << 1,
     NO_GLOBAL_INTERNAL_MEM = 1 << 2,
     NO_GLOBAL_EXTERNAL_MEM = 1 << 3,
     NO_GLOBAL_MEM = NO_GLOBAL_INTERNAL_MEM | NO_GLOBAL_EXTERNAL_MEM,
     NO_ARGUMENT_MEM = 1 << 4,
     NO_INACCESSIBLE_MEM = 1 << 5,
     NO_MALLOCED_MEM = 1 << 6,
     NO_UNKOWN_MEM = 1 << 7,
     NO_LOCATIONS = NO_LOCAL_MEM | NO_CONST_MEM | NO_GLOBAL_INTERNAL_MEM |
                    NO_GLOBAL_EXTERNAL_MEM | NO_ARGUMENT_MEM |
                    NO_INACCESSIBLE_MEM | NO_MALLOCED_MEM | NO_UNKOWN_MEM,
    }

I was thinking to use this, or a variation of that, as the argument of an attribute. Thus,
`access_locations(~(NO_GLOBAL_MEM | NO_ARGUMENT_MEM))`
to express argument and global memory is potentially accessed.

That said, I *also* want us to allow to use this interface for "special memory locations".
What I mean is for example a special "location" for things like intrinsics:
`llvm.assume` writes only to bit "LLVM_ASSUME_MEM"
`llvm.launder.invariant.group` writes only to bit "LLVM_LAUNDER_INVARIANT_GROUP" (D109548)
...
This way they stop to interfere with each other and other effects.

This last piece reads like a separate change to me. I wouldn't want to tie the inversion with this extension.

My hope was to do this step by step.

So actually advance something with this patch set and start the discussion for a generalization.

If we decide right away to change the global handling we might skip parts of this patch set though.

I'm in full support of incrementalism here. Let's not let perfection be the enemy of the good. :slight_smile:

Having said that, let me bike shed your separate proposal. :slight_smile:

If we had an attribute which allowed a specific list of globals potentially aliased, and introduced a set of predefined symbolic global names, we could solve this use case and also handle the more general "but this function has a known aliasing set!" use case as well. In pseudo IR, consider:

@G = external global i32

declare void @restricted() globalmayalias(@G, @llvm.assume_control)

In fact, the original motivation for the globalmemonly proposal could be alternatively solved with:

@errno = external global i32

declare void @mathfunc(float, float) globalmayalias(@errno)

I like this, it solves almost all the "global memory" use cases with more precision than the proposal here.
IIRC, this was what I proposed to Mugilan first but we figured it might be easier to do it in multiple steps.

(agreed, incrementalism is good!)

That said, I'd also want us to use explicit enumartations or `argmemonly` and `inaccessiblememonly`:
`accessed_args(/* ArgNo */ 0, /* ArgNo */ 3)`
`accessed_inaccessible_mem(/* ALL */ -1)`
`accessed_inaccessible_mem(/* llvm.assume_control */ 0x01)`

Maybe if we have globalmayalias and the "control" bits are internal we might not need inaccessible memory anymore, that would be even better.

Rather than run the risk of derailing this thread further, why don't we defer this conversation to a future thread?