String attributes for function arguments and return values

Hi,

I’d like to support string attributes on function arguments and return values. We are going to use them in our tree to express higher level language types.

Internally, the attributes framework already has everything needed to do this; it’s even possible to generate string attributes via the API right now:
Function *function;
// i is the attribute index: 0 for the return value, 1..N for the arguments
function->setAttributes(function->getAttributes().addAttribute(context, i, "attribute"));
But because LLParser doesn’t support this, if you dump the function and try to parse it back, parsing will fail. I have a patch to fix this problem:
http://reviews.llvm.org/D11058
I consider this part as a bug fix for existing functionality.
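
The round-trip failure described above can be illustrated with a toy model. This is only a sketch and not LLParser's code: it assumes a printer that emits string attributes in double quotes and a parser with a switch controlling whether quoted attributes are accepted. All names here (ToyParam, printParam, parseParam, the keyword list) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy parameter: a type name plus a list of attributes.
// Anything that is not a known keyword is treated as a string attribute.
struct ToyParam {
  std::string Type;
  std::vector<std::string> Attrs;
};

// Small illustrative subset of keyword attributes (an assumption of this sketch).
static const std::set<std::string> Keywords = {"nonnull", "noalias", "readonly"};

// Print keyword attributes bare and string attributes in double quotes.
std::string printParam(const ToyParam &P) {
  std::string Out = P.Type;
  for (const std::string &A : P.Attrs) {
    if (Keywords.count(A)) {
      Out += " " + A;
    } else {
      // The toy format has no escaping, so reject spaces and quotes.
      assert(A.find_first_of(" \"") == std::string::npos);
      Out += " \"" + A + "\"";
    }
  }
  return Out;
}

// Parse the printed form back. When AcceptStrings is false, a quoted token
// is a parse error, which models a parser that only knows keyword attributes.
bool parseParam(const std::string &Text, bool AcceptStrings, ToyParam &Out) {
  std::istringstream In(Text);
  ToyParam P;
  if (!(In >> P.Type))
    return false;
  std::string Tok;
  while (In >> Tok) {
    bool Quoted = Tok.size() >= 2 && Tok.front() == '"' && Tok.back() == '"';
    if (Quoted && AcceptStrings)
      P.Attrs.push_back(Tok.substr(1, Tok.size() - 2));
    else if (!Quoted && Keywords.count(Tok))
      P.Attrs.push_back(Tok);
    else
      return false; // unknown or unsupported token
  }
  Out = P;
  return true;
}
```

With AcceptStrings set to false, the text printed for a parameter that carries a string attribute is rejected when read back; with it set to true, the same text round-trips. That is the kind of gap the parser patch is meant to close.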

The second patch adds accessors for string attributes to the Argument and Function classes:
http://reviews.llvm.org/D10872
This part is optional because no upstream code will make use of it. But if we support the string attribute syntax, it makes sense to provide API support as well.

Does anyone have any objections?

Thanks,
Artur

From: "Artur Pilipenko" <apilipenko@azulsystems.com>
To: llvmdev@cs.uiuc.edu
Cc: "Hal Finkel" <hfinkel@anl.gov>
Sent: Monday, July 13, 2015 6:45:35 AM
Subject: String attributes for function arguments and return values

Hi,

I’d like to support string attributes on function arguments and
return values. We are going to use them in our tree to express
higher level language types.

How do you expect to use this information? Will you need the inliner to do something special with these?

Thanks again,
Hal

Hi,

Type information is required for Java-specific optimizations, like devirtualization, subtype check optimizations, etc. There are no plans to upstream them, because they are too specific to Java.

W.r.t. inlining, I don’t think that these attributes will require any special handling.

Artur

This sounds more like a use case for metadata. Can we attach metadata to function arguments, or does that not work currently?

We can’t, no.

I have an out-of-tree patch which allows metadata in AttributeSets. This would potentially also work here.

However, depending on the number of unique strings/metadata in AttributeSets, this could get large. I don’t think we’ve ever had more than a few unique AttributeSets in an entire module. If you have too many different strings then you could have a significant number of sets, which could get slow.

Metadata attached to the function or the function arguments is likely to scale better than strings/metadata in the AttributeSets, but I guess it all depends on whether many are even needed.

Pete
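
To make the scaling concern above concrete, here is a toy model of uniqued attribute sets. It is not LLVM's implementation; it only assumes that each distinct combination of attributes is stored once and shared. The ToyContext class, its intern method, and the countSets helper are hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Toy stand-in for a context that stores one copy of each distinct attribute set.
class ToyContext {
public:
  using AttrSet = std::set<std::string>;

  // Returns a pointer to the single stored copy of Attrs.
  // std::set is node-based, so the pointer stays valid after later inserts.
  const AttrSet *intern(const AttrSet &Attrs) {
    return &*Interned.insert(Attrs).first;
  }

  std::size_t numUniqueSets() const { return Interned.size(); }

private:
  std::set<AttrSet> Interned;
};

// Count the unique sets created when each of NumArgs arguments carries
// "nonnull" plus either one shared type string or a per-argument type string.
std::size_t countSets(std::size_t NumArgs, bool DistinctTypeStrings) {
  ToyContext Ctx;
  for (std::size_t I = 0; I < NumArgs; ++I) {
    std::string Type = DistinctTypeStrings ? "type" + std::to_string(I)
                                           : std::string("type");
    ToyContext::AttrSet Attrs{"nonnull", Type};
    Ctx.intern(Attrs);
  }
  // There can never be more unique sets than arguments.
  assert(Ctx.numUniqueSets() <= NumArgs);
  return Ctx.numUniqueSets();
}
```

In this model, countSets(1000, false) produces a single shared set, while countSets(1000, true) produces 1000 sets, one per distinct string. Many distinct strings therefore mean many distinct sets, which is the growth the message above warns about.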

As far as I can tell, string attributes on function parameters are already “supposed to work”. We support them in bitcode. We even support serialization of the attributes. It’s just the parsing that’s broken. I don’t have any problem with an eventual move towards supporting metadata on arguments, but does anyone object to landing the current patches? Whether we believe that the use case motivating the patch is better represented by metadata or not, having the deserialization support seems like a clear improvement.

As a side note, I can’t find any mention of the string attribute functionality in the LangRef or ExtendingLLVM. Seems like it might be time to add something about the capability for extension. We should probably also explicitly reserve the entire namespace of possible keywords for future LLVM in-tree enhancements.

Philip

As for the attribute versus metadata question here, I don’t particularly care whether or not we support attributes on any particular thing in the Value hierarchy. For your particular case I really only have one question: are the attributes needed for correctness or for optimization? If the latter, they should probably be metadata; if the former, attributes seem to make the most sense.

Mostly just trying to see about you getting the right fixes in for the support you need and the rest of us not having to worry about not breaking things that no one cares about :)

-eric

As far as I can tell, the string attributes on function parameters is already “supposed to work”. We support it in bytecode. We even support serialization of the attributes. It’s just the parsing that’s broken. I don’t have any problem with an eventual move towards supporting metadata on arguments, but does anyone object to landing the current patches? Whether we believe that the use case motivating the patch is better represented by metadata or not, having the deserialization support seems like a clear improvement.

No objection from me. Seems like an arbitrary restriction for it to currently work on everything other than arguments.

If you or anyone else wants the metadata in attribute sets, we can have another discussion for that. My use cases were tbaa on arguments and range metadata, but there could easily be others.

Cheers,
Pete

For the particular use case we have, metadata on arguments would be a better semantic fit. It’s a pure optimization hint. Having said that, attributes work just fine in practice as well.

Let me restate my previous comment: Having support for custom attributes on function arguments is generally useful for external users of LLVM. Whether it is ideal in this particular case is not really relevant. There are certainly reasonable cases where using a target/environment specific attribute to effect call lowering makes perfect sense. It seems desirable to be able to prototype these quickly so that they can mature and (possibly) make it upstream.

My view is that we already support these attributes. I don’t have an example user, but it really wouldn’t surprise me if folks were using this functionality already. Everything works if generated through the C++ APIs or read from bitcode. It’s only the deserialization parts that break. In particular, you can have a working compiler which generates output which isn’t parseable by LLVM’s existing tools. That’s not exactly a good state to be in.

I would be happy to see this happen. I’m not sure that “in attribute sets” is quite the right framing here, but being able to use metadata in all the places we use attributes seems like a reasonable design goal. (Actually, we should enumerate that list to make sure it still seems reasonable.) This might also give us a way to migrate some of our existing attributes which are really just optimization hints (e.g. nonnull, dereferenceable).

So as far as the attribute versus metadata question here I don’t have a particular care whether or not we support attributes on any particular thing in the Value hierarchy. As far as your particular case I really only have one question: are the attributes needed for correctness or for optimization? If they’re the latter they should probably be metadata, the former then attributes seem to make the best sense.

For the particular use case we have, metadata on arguments would be a better semantic fit. It’s a pure optimization hint. Having said that, attributes work just fine in practice as well.

I guess as long as they’re not upstreamed you can do whatever you’d like; I’d suggest metadata, though, just to keep within the LLVM design principles.

Mostly just trying to see about you getting the right fixes in for the support you need and the rest of us not having to worry about not breaking things that no one cares about :)

Let me restate my previous comment: Having support for custom attributes on function arguments is generally useful for external users of LLVM. Whether it is ideal in this particular case is not really relevant. There are certainly reasonable cases where using a target/environment specific attribute to effect call lowering makes perfect sense. It seems desirable to be able to prototype these quickly so that they can mature and (possibly) make it upstream.

There’s a lot of things that are generally useful that we delete. I don’t see anything unused in any other way. Bitcode support is a bit more… solid though so removing anything that exists is harder. Misfeatures or things accidentally supported have a tendency to stick around and complicate things.

That said…

My view is that we already support these attributes. I don’t have an example user, but it really wouldn’t surprise me if folks were using this functionality already. Everything works if generated through the C++ APIs or read from bitcode. It’s only the deserialization parts that break. In particular, you can have a working compiler which generates output which isn’t parseable by LLVM’s existing tools. That’s not exactly a good state to be in.

I don’t have a strong opinion here as I said in the first place. If fixing this support is useful then I’ve no objection.

-eric

FWIW (I'm arriving kind of late here...), I agree that string-based
attributes make sense to support on arguments.