[RFC] Storing default function attributes on the module

Hi Duncan,

Been thinking about this a bit and a few comments/questions. I may have misunderstood some things in your mail though so please feel free to explain at me :slight_smile:

Changing clang to store target defaults on the module will allow us to
continue to override them when running llc. The right precedence
would be:

  1. Explicit attributes set on the function.
  2. llc command-line options.
  3. Default function attributes stored on the module.

(Outside of llc, skip step 2.)

In lib/Linker (i.e., llvm-lto, llvm-link, libLTO.dylib),
defaults should be pushed down as explicit function attributes.

I think there are a few options/backend communications that aren’t/haven’t gone this way yet that are probably worth considering:

MCTargetOptions/TargetOptions: Basically a grab bag of functionality, some of which is duplicated via attributes and some that’s not. I think at least some of these should be replaced by module flags, e.g. ABI.

Random backend flags: Anything for debugging.

I’m thinking things that are either set as Init(true/false) and affect things at a global level and not just the function level.

They look like this in assembly:

attributes default = { “no-frame-pointer-elim”=“false” }

So, how do you see module merging and defaults working? (Yes, there were testcases, but let’s go with prose here. I found the testcases a bit hard to reason.)

Limitations

There are a few limitations with this approach (at least, with my
reference implementation).

  • Function::getAttributes() only reflects the explicitly specified
    attributes, skipping those set as module defaults.

Ick. Pretty severe limitation? I.e. how would it work to test general attributes on a function during code gen?

  • If an enum attribute is set as a default, there’s no way for a
    function-attribute to override it. In practice, we could avoid the
    feature for enum attributes.

Hrm. This seems like a pretty severe limitation? Anything come to mind in practice.

  • CallSite instructions store function-level attributes, but don’t
    forward to the module-level defaults. There are places (like the
    calls to EmitUnaryFloatFnCall() in -simplify-libcalls) where we
    use the callee function attributes to set the call site attributes.
    In practice, we could avoid the feature for attributes that are
    meaningful for call sites.

Sure.

  • Intrinsics’ attributes are independent of CodeGenOptions, and set
    via Instrinsic::getAttributes(). With this change they’d inherit
    the default attributes like other functions. Is this a problem?
    If so, we can add a flag on Function that inhibits forwarding to
    the defaults.

Seems reasonable.

Thoughts? Other ideas for solving the llc problem?

I think this is a good start, I think I’d like to worry about some of the other issues in advance before we start incrementally changing things though. (Also, I really have no other ideas :slight_smile:

-eric

Hi Duncan,

Been thinking about this a bit and a few comments/questions. I may have misunderstood some things in your mail though so please feel free to explain at me :slight_smile:

Changing `clang` to store target defaults on the module will allow us to
continue to override them when running `llc`. The right precedence
would be:

  1. Explicit attributes set on the function.
  2. `llc` command-line options.
  3. Default function attributes stored on the module.

(Outside of `llc`, skip step 2.)

In `lib/Linker` (i.e., `llvm-lto`, `llvm-link`, `libLTO.dylib`),
defaults should be pushed down as explicit function attributes.

I think there are a few options/backend communications that aren't/haven't gone this way yet that are probably worth considering:

MCTargetOptions/TargetOptions: Basically a grab bag of functionality, some of which is duplicated via attributes and some that's not. I think at least some of these should be replaced by module flags, e.g. ABI.

Random backend flags: Anything for debugging.

I'm thinking things that are either set as Init(true/false) and affect things at a global level and not just the function level.

(I think my lib/Linker comment was unclear. See below.)

Not at all trying to say that *everything* should be a function
attribute; global-level stuff should absolutely be module flags.

I'm just talking about infrastructure for the things that *are*
function-level.

They look like this in assembly:

    attributes default = { "no-frame-pointer-elim"="false" }

So, how do you see module merging and defaults working? (Yes, there were testcases, but let's go with prose here. I found the testcases a bit hard to reason.)

This is where my lib/Linker comment applies:

In `lib/Linker` (i.e., `llvm-lto`, `llvm-link`, `libLTO.dylib`),
defaults should be pushed down as explicit function attributes.

^ This is how I see module merging and defaults working: push the
defaults down to explicit function attributes. So there wouldn't
be any default function attributes in the output of `llvm-link`.
This means that `llc` will still have trouble overriding attributes
in the output of merged modules -- but at least it can handle the
output of `clang` without trouble. In the future we could try to
be more intelligent about merged modules, and keep common options
in the default set.

Limitations

There are a few limitations with this approach (at least, with my
reference implementation).

  - `Function::getAttributes()` only reflects the explicitly specified
    attributes, skipping those set as module defaults.

Ick. Pretty severe limitation? I.e. how would it work to test general attributes on a function during code gen?

As long as everyone calls `Function::hasFnAttribute()`, there's no
problem. This proposal basically turns it into a bug to access
them directly; you need to go through `Function::hasFnAttribute()`
to get the right answer. (Not sure if there's a better way?)

  - If an enum attribute is set as a default, there's no way for a
    function-attribute to override it. In practice, we could avoid the
    feature for enum attributes.

Hrm. This seems like a pretty severe limitation? Anything come to mind in practice.

In the `Attribute` format, mutually exclusive attributes aren't
related at all (they're not inherently mutually exclusive). To
make them overridable, we'd need a completely new design for
enum attributes.

As a result, this proposal only improves `llc` for string-based
attributes. I don't see that as a problem... the string-based
attributes are more flexible anyway. Maybe `Module` should only
allow `setDefaultFnAttribute()` for string attributes though?

(Some more context on why enum attributes can't really be
overridden. This isn't just a problem for enum attributes that
are mutually exclusive. Consider:

    attributes defaults = { noreturn }

Besides being somewhat insane, there's no `return` attribute,
so you can't really override it. I suppose one idea would be to
explicitly mark a function `~noreturn` or something:

    define void @foo() ~noreturn { ; Ignore module default noreturn.

Not sure if this direction is a good one though.)

  - `CallSite` instructions store function-level attributes, but don't
    forward to the module-level defaults. There are places (like the
    calls to `EmitUnaryFloatFnCall()` in `-simplify-libcalls`) where we
    use the callee function attributes to set the call site attributes.
    In practice, we could avoid the feature for attributes that are
    meaningful for call sites.

Sure.

  - Intrinsics' attributes are independent of `CodeGenOptions`, and set
    via `Instrinsic::getAttributes()`. With this change they'd inherit
    the default attributes like other functions. Is this a problem?
    If so, we can add a flag on `Function` that inhibits forwarding to
    the defaults.

Seems reasonable.

Thoughts? Other ideas for solving the `llc` problem?

I think this is a good start, I think I'd like to worry about some of the other issues in advance before we start incrementally changing things though. (Also, I really have no other ideas :slight_smile:

So Akira had an idea at the end of last week that I don't
think made it onto the list, and it's worth considering as an
alternative:

Add a bit to attributes indicating whether or not they're
overridable (i.e., they're overridable if they're target
defaults; they're not overridable if they've been explicitly
specified somehow).

Here's some straw-man syntax:

    attributes #0 = { noreturn ssp? "att1"="1" "att2"="2"? }

Where:

  - `noreturn` and `"att1"="1"` are required.
  - `ssp` and `"att2"="2"` can be overridden (e.g., by `llc`).

(Alternately, but equivalently:

    attributes #0 = { noreturn! ssp "att1"="1"! "att2"="2" }

I like this syntax better, but it would cause more churn, and
`!` is so far reserved for metadata.)

Whatever the syntax, the idea is: `llc` resets/deletes
attributes on every function to match what's specified on the
command-line. In the above example, functions with attribute
set #0 could have `ssp` and `"att2"` overridden via the `llc`
command-line, but not `noreturn` and `"att1"`.

To compare it to my proposal:

  - Storing a default attribute set (my proposal) makes it
    easier to look at and set the defaults. Applying `llc`
    command-line options is easy, too -- just override the
    default attribute set on the module -- although it doesn't
    really work on the output of `lib/Linker`.
  - Storing a bit on each attribute (Akira's proposal) handles
    more cases. Nothing needs to be done in `lib/Linker`
    (`llc` is able to override the output of `llvm-link`),
    and it doesn't have a disconnect between `hasFnAttribute()`
    and `getAttributes().hasAttribute(FunctionIndex)`.

Awkwardly, having written that out, I'm kind of leaning toward
it myself right now (maybe I'm fickle?) -- it seems to have
fewer limitations. The main thing I prefer about my proposal
is that it's easier to change the default attributes when
modifying an assembly file by hand, but I suppose we could
write a tool for that?

>
> Hi Duncan,
>
> Been thinking about this a bit and a few comments/questions. I may have
misunderstood some things in your mail though so please feel free to
explain at me :slight_smile:
>
>
> Changing `clang` to store target defaults on the module will allow us to
> continue to override them when running `llc`. The right precedence
> would be:
>
> 1. Explicit attributes set on the function.
> 2. `llc` command-line options.
> 3. Default function attributes stored on the module.
>
> (Outside of `llc`, skip step 2.)
>
> In `lib/Linker` (i.e., `llvm-lto`, `llvm-link`, `libLTO.dylib`),
> defaults should be pushed down as explicit function attributes.
>
> I think there are a few options/backend communications that
aren't/haven't gone this way yet that are probably worth considering:
>
> MCTargetOptions/TargetOptions: Basically a grab bag of functionality,
some of which is duplicated via attributes and some that's not. I think at
least some of these should be replaced by module flags, e.g. ABI.
>
> Random backend flags: Anything for debugging.
>
> I'm thinking things that are either set as Init(true/false) and affect
things at a global level and not just the function level.

(I think my lib/Linker comment was unclear. See below.)

Not at all trying to say that *everything* should be a function
attribute; global-level stuff should absolutely be module flags.

I'm just talking about infrastructure for the things that *are*
function-level.

>
> They look like this in assembly:
>
> attributes default = { "no-frame-pointer-elim"="false" }
>
>
> So, how do you see module merging and defaults working? (Yes, there were
testcases, but let's go with prose here. I found the testcases a bit hard
to reason.)

This is where my lib/Linker comment applies:

>> In `lib/Linker` (i.e., `llvm-lto`, `llvm-link`, `libLTO.dylib`),
>> defaults should be pushed down as explicit function attributes.

^ This is how I see module merging and defaults working: push the
defaults down to explicit function attributes. So there wouldn't
be any default function attributes in the output of `llvm-link`.
This means that `llc` will still have trouble overriding attributes
in the output of merged modules -- but at least it can handle the
output of `clang` without trouble. In the future we could try to
be more intelligent about merged modules, and keep common options
in the default set.

>
> Limitations
> ===========
>
> There are a few limitations with this approach (at least, with my
> reference implementation).
>
> - `Function::getAttributes()` only reflects the explicitly specified
> attributes, skipping those set as module defaults.
>
> Ick. Pretty severe limitation? I.e. how would it work to test general
attributes on a function during code gen?

As long as everyone calls `Function::hasFnAttribute()`, there's no
problem. This proposal basically turns it into a bug to access
them directly; you need to go through `Function::hasFnAttribute()`
to get the right answer. (Not sure if there's a better way?)

>
> - If an enum attribute is set as a default, there's no way for a
> function-attribute to override it. In practice, we could avoid the
> feature for enum attributes.
>
> Hrm. This seems like a pretty severe limitation? Anything come to mind
in practice.

In the `Attribute` format, mutually exclusive attributes aren't
related at all (they're not inherently mutually exclusive). To
make them overridable, we'd need a completely new design for
enum attributes.

As a result, this proposal only improves `llc` for string-based
attributes. I don't see that as a problem... the string-based
attributes are more flexible anyway. Maybe `Module` should only
allow `setDefaultFnAttribute()` for string attributes though?

(Some more context on why enum attributes can't really be
overridden. This isn't just a problem for enum attributes that
are mutually exclusive. Consider:

    attributes defaults = { noreturn }

Besides being somewhat insane, there's no `return` attribute,
so you can't really override it. I suppose one idea would be to
explicitly mark a function `~noreturn` or something:

    define void @foo() ~noreturn { ; Ignore module default noreturn.

Not sure if this direction is a good one though.)

>
> - `CallSite` instructions store function-level attributes, but don't
> forward to the module-level defaults. There are places (like the
> calls to `EmitUnaryFloatFnCall()` in `-simplify-libcalls`) where we
> use the callee function attributes to set the call site attributes.
> In practice, we could avoid the feature for attributes that are
> meaningful for call sites.
>
> Sure.
>
> - Intrinsics' attributes are independent of `CodeGenOptions`, and set
> via `Instrinsic::getAttributes()`. With this change they'd inherit
> the default attributes like other functions. Is this a problem?
> If so, we can add a flag on `Function` that inhibits forwarding to
> the defaults.
>
>
> Seems reasonable.
>
> Thoughts? Other ideas for solving the `llc` problem?
>
>
> I think this is a good start, I think I'd like to worry about some of
the other issues in advance before we start incrementally changing things
though. (Also, I really have no other ideas :slight_smile:
>

So Akira had an idea at the end of last week that I don't
think made it onto the list, and it's worth considering as an
alternative:

Add a bit to attributes indicating whether or not they're
overridable (i.e., they're overridable if they're target
defaults; they're not overridable if they've been explicitly
specified somehow).

Here's some straw-man syntax:

    attributes #0 = { noreturn ssp? "att1"="1" "att2"="2"? }

Where:

  - `noreturn` and `"att1"="1"` are required.
  - `ssp` and `"att2"="2"` can be overridden (e.g., by `llc`).

(Alternately, but equivalently:

    attributes #0 = { noreturn! ssp "att1"="1"! "att2"="2" }

I like this syntax better, but it would cause more churn, and
`!` is so far reserved for metadata.)

Whatever the syntax, the idea is: `llc` resets/deletes
attributes on every function to match what's specified on the
command-line. In the above example, functions with attribute
set #0 could have `ssp` and `"att2"` overridden via the `llc`
command-line, but not `noreturn` and `"att1"`.

To compare it to my proposal:

  - Storing a default attribute set (my proposal) makes it
    easier to look at and set the defaults. Applying `llc`
    command-line options is easy, too -- just override the
    default attribute set on the module -- although it doesn't
    really work on the output of `lib/Linker`.
  - Storing a bit on each attribute (Akira's proposal) handles
    more cases. Nothing needs to be done in `lib/Linker`
    (`llc` is able to override the output of `llvm-link`),
    and it doesn't have a disconnect between `hasFnAttribute()`
    and `getAttributes().hasAttribute(FunctionIndex)`.

Awkwardly, having written that out, I'm kind of leaning toward
it myself right now (maybe I'm fickle?) -- it seems to have
fewer limitations. The main thing I prefer about my proposal
is that it's easier to change the default attributes when
modifying an assembly file by hand, but I suppose we could
write a tool for that?

I think the tool approach deserves a bit of attention for the original
usecase of trying different target attributes to see if they tickle a bug.
Would it be feasible to have a purpose-built tool `llvm-attr-mutate`
(bikeshed) so you can do `cat foo.bc | llvm-attr-mutate .... | llc`. The
arguments to llvm-attr-mutate could then be made as convenient/powerful as
needed for this debugging task.

-- Sean Silva

FWIW the tool approach is what we were coming up with in the first place :slight_smile:

Which would obviate the need for handling the command line options at all of course.

-eric

I've done some more thinking about this.

1. The module-level default attribute approach (what I started this
    thread with) cleans up assembly files and provides a clean hook for
    mutating the "defaults" from `llc`. However, it doesn't cleanly fit
    into the current semantic model for attributes, so you get weird
    inconsistencies depending on how you query them.
2. Flipping a bit to indicate whether something was the "default"
    solves the various technical problems. However, overriding the
    attributes requires walking all of the IR to rewrite them, and the
    assembly syntax readability will get worse, not better.
3. Using a separate tool allows you to override the attributes
    arbitrarily, but doesn't distinguish between "default" and explicit
    attributes. Furthermore, if the primary use for the tool is:

        llvm-attr-mutate -o - t.bc -attr target-cpu=proc123 | llc
    
    then I'm not sure we're gaining anything. In particular, having
    `llc` mutate the IR when you say:

        llc <t.bc -target-cpu=proc123

    is a far cleaner interface. Not to say that a tool to mutate
    attributes isn't a great idea -- I think it might be -- just that
    it's not a great solution for *this* problem.
4. Having `llc` mutate the IR itself -- the obvious solution, which
    Akira posted a patch for a few months ago -- does the job just as
    well as `llvm-attr-mutate` but with a much cleaner interface. It
    fails to distinguish between target defaults and explicit
    attributes, but when combined with `llvm-extract`, it gives you full
    control over the codegen options for each function.

Remind me again why we don't just do #4? It seems like the simplest way
to keep `llc` viable in the short term.

llc should be able to override the default values for options without mutating anything that’s explicitly specified at a function (or module) level.

For example, if I have a .ll file with specialized functions with different CPUs specified, I want to be able to recompile that file with -mcpu= on the llc command line in such a way that those functions don’t get changed, but any functions that don’t have an explicit CPU on them do. Consider trying to test something like function multi versioning.

-Jim

4. Having `llc` mutate the IR itself -- the obvious solution, which
   Akira posted a patch for a few months ago -- does the job just as
   well as `llvm-attr-mutate` but with a much cleaner interface. It
   fails to distinguish between target defaults and explicit
   attributes, but when combined with `llvm-extract`, it gives you full
   control over the codegen options for each function.

Remind me again why we don't just do #4? It seems like the simplest way
to keep `llc` viable in the short term.

llc should be able to override the default values for options without
mutating anything that’s explicitly specified at a function (or module)
level.

For example, if I have a .ll file with specialized functions with different
CPUs specified, I want to be able to recompile that file with -mcpu= on the
llc command line in such a way that those functions don’t get changed, but
any functions that don’t have an explicit CPU on them do. Consider trying to
test something like function multi versioning.

Sure, but that can be implemented with 4, no? Something like:

* In llc, if given -mcpu:
  * Iterate over all functions:
     * If a function has a an explicit cpu, leave it alone.
     * If a function has no explicit cpu, set it to the -mcpu command line value

Cheers,
Rafael

  1. Having llc mutate the IR itself – the obvious solution, which
    Akira posted a patch for a few months ago – does the job just as
    well as llvm-attr-mutate but with a much cleaner interface. It
    fails to distinguish between target defaults and explicit
    attributes, but when combined with llvm-extract, it gives you full
    control over the codegen options for each function.

Remind me again why we don’t just do #4? It seems like the simplest way
to keep llc viable in the short term.

llc should be able to override the default values for options without
mutating anything that’s explicitly specified at a function (or module)
level.

For example, if I have a .ll file with specialized functions with different
CPUs specified, I want to be able to recompile that file with -mcpu= on the
llc command line in such a way that those functions don’t get changed, but
any functions that don’t have an explicit CPU on them do. Consider trying to
test something like function multi versioning.

Sure, but that can be implemented with 4, no? Something like:

  • In llc, if given -mcpu:
  • Iterate over all functions:
  • If a function has a an explicit cpu, leave it alone.
  • If a function has no explicit cpu, set it to the -mcpu command line value

The problem with some of this is that we could see a lot of default cpus on functions. I.e. it may be set most of the time. That’s still up in the air though.

-eric