alloc_size metadata

Hi,

I'm implementing the alloc_size function attribute in clang. This attribute exists in gcc since version 4.3.
The attribute specifies that a function returns a buffer with the size given by the multiplication of the specified arguments. I would like to add new metadata to pass this information to LLVM IR.

For example, take the following C code:

void* my_calloc(size_t, size_t) __attribute__((alloc_size(1,2)));

void* fn() {
   return my_calloc(1, 3);
}

which would get translated to:

define i8* @fn() {
entry:
   %call = call i8* @my_calloc(i32 1, i32 3), !alloc_size !0
   ret i8* %call
}

declare i8* @my_calloc(i32, i32)

!0 = metadata !{i32 0, i32 1}

The downsize is that the metadata is added to all call sites, since it's not possible to attach metadata to function declarations.

Any comment, suggestion, etc?

Thanks,
Nuno

Hi Nuno,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

Ciao, Duncan.

  This

Hi Nuno,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

Tools like ASan and SAFECode *could* use this attribute to determine the size of memory objects created by allocators. This is needed for things like SAFECode's fastcheck optimization (which replaces expensive checks that need to lookup object bounds with a simpler check that has the object bounds passed in as arguments) as well as its instrumentation to register heap object bounds with the SAFECode run-time.

Currently, SAFECode has a pass which just recognizes certain functions as allocators and knows how to interpret the arguments to find the size. If we want SAFECode to work with another allocator (like a program's custom allocator, the Objective-C allocator, the Boehm garbage collector, etc), then that pass needs to be modified to recognize it. Having to update this pass for every allocator name and type is one of the few reasons why SAFECode only works with C/C++ and not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant information about allocators to tools like SAFECode and ASan. I think it might also make some of the optimizations in LLVM that require knowing about allocators work on non-C/C++ code.

So, no, I don't think anything's using it now, but it looks like something very useful to add, and I think some of those useful things are already a part of core LLVM.

-- John T.

Hi Nuno,

I’m implementing the alloc_size function attribute in clang.
does anyone actually use this attribute? And if they do, can it really buy
them anything? How about “implementing” it by ignoring it!

Tools like ASan and SAFECode could use this attribute

A case where this may be useful for asan:
size_t n, m; …
int *x = new int [n]; …
x[m] // here we can check “m < n” instead of a more expensive shadow memory lookup.
For asan such optimization is possible only if ‘x’ does not escape the current function before the use,
otherwise we may lose a use-after-free.

I don’t know whether such optimization will fire often enough to pay the price for the added complexity of the implementation.
It would be interesting to see statistics on some huge app.

–kcc

Hi John,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

...

Currently, SAFECode has a pass which just recognizes certain functions as
allocators and knows how to interpret the arguments to find the size. If we want
SAFECode to work with another allocator (like a program's custom allocator, the
Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
to be modified to recognize it. Having to update this pass for every allocator
name and type is one of the few reasons why SAFECode only works with C/C++ and
not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant
information about allocators to tools like SAFECode and ASan. I think it might
also make some of the optimizations in LLVM that require knowing about
allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Since LLVM already has utility functions for recognizing allocators (i.e. that
know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
of them? Then either (1) something like alloc_size is implemented, the LLVM
utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
utility is taught about other allocators like Ada's, and SAFECode benefits
automagically.

Ciao, Duncan.

Hi Kostya,

    Tools like ASan and SAFECode *could* use this attribute

A case where this may be useful for asan:
    size_t n, m; ...
    int *x = new int [n]; ...
    x[m] // here we can check "m < n" instead of a more expensive shadow memory
lookup.

I don't think you need the attribute for this: LLVM already has utilities that
know about malloc and C++'s new. Can't you just use them?

Ciao, Duncan.

I think this is key – there should be some clear numbers and evidence that this is a really important semantic extension in order to get accurate and efficient results.

And as Duncan points out, we should be confident that there is no existing mechanism to get the same optimization improvements.

Hi John,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

...

Currently, SAFECode has a pass which just recognizes certain functions as
allocators and knows how to interpret the arguments to find the size. If we want
SAFECode to work with another allocator (like a program's custom allocator, the
Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
to be modified to recognize it. Having to update this pass for every allocator
name and type is one of the few reasons why SAFECode only works with C/C++ and
not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant
information about allocators to tools like SAFECode and ASan. I think it might
also make some of the optimizations in LLVM that require knowing about
allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Hrm. I haven't formed an opinion on what the attributes should look like. I think supporting the ones established by GCC would be important for compatibility, and on the surface, they look reasonable. Devising better ones for Clang is fine with me. What about them feels klunky?

Since LLVM already has utility functions for recognizing allocators (i.e. that
know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
of them?

It probably could. It doesn't simply because SAFECode was written before these features existed within LLVM.
:slight_smile:

Then either (1) something like alloc_size is implemented, the LLVM
utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
utility is taught about other allocators like Ada's, and SAFECode benefits
automagically.

I'm not sure what you mean by "LLVM utility," but I think we're thinking along the same lines. Clang/LLVM implement the alloc_size attributes, we change SAFECode to recognize it, and so when people use it, SAFECode benefits automagically.

Am I right that we're thinking the same thing, or did I completely misunderstand you?

-- John T.

Hi John,

Hi John,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

...

Currently, SAFECode has a pass which just recognizes certain functions as
allocators and knows how to interpret the arguments to find the size. If we want
SAFECode to work with another allocator (like a program's custom allocator, the
Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
to be modified to recognize it. Having to update this pass for every allocator
name and type is one of the few reasons why SAFECode only works with C/C++ and
not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant
information about allocators to tools like SAFECode and ASan. I think it might
also make some of the optimizations in LLVM that require knowing about
allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Hrm. I haven't formed an opinion on what the attributes should look like. I
think supporting the ones established by GCC would be important for
compatibility, and on the surface, they look reasonable. Devising better ones
for Clang is fine with me. What about them feels klunky?

basically it feels like "I only know about C, here's something that pretends to
be general but only handles C". Consider a language with a string type that
contains the string length as well as the characters. It has a library function
allocate_string(length). How much does it allocate? length+4 bytes. That can't
be represented by alloc_size. What's more, it may well store the length at the
start, and return a pointer to just after the length: a pointer to the first
character. alloc_size can't represent "the allocated memory starts 4 bytes
before the return value" either. In short, it feels like a hack for handling
something that turns up in some particular C code that someone has, rather than
a general solution to the general problem.

Since LLVM already has utility functions for recognizing allocators (i.e. that
know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
of them?

It probably could. It doesn't simply because SAFECode was written before these
features existed within LLVM.
:slight_smile:

Then either (1) something like alloc_size is implemented, the LLVM
utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
utility is taught about other allocators like Ada's, and SAFECode benefits
automagically.

I'm not sure what you mean by "LLVM utility," but I think we're thinking along
the same lines. Clang/LLVM implement the alloc_size attributes, we change
SAFECode to recognize it, and so when people use it, SAFECode benefits
automagically.

Am I right that we're thinking the same thing, or did I completely misunderstand
you?

no, I'm thinking that SAFECode won't need to look at or worry about the
attribute at all, because the LLVM methods will know about it and serve
up the appropriate info. Take a look at Analysis/MemoryBuiltins.h. In
spite of the names, things like extractMallocCall are dealing with "malloc
like" functions, such as C++'s "new" as well as malloc. Similarly for
calloc. So you could use those right now to extract "malloc" and "calloc"
sizes. If alloc_size is implemented, presumably these would just magically
start to give you useful sizes for functions annotated with that attribute too.

Ciao, Duncan.

Hi John,

>> Hi John,
>>
>>>>> I'm implementing the alloc_size function attribute in clang.
>>>> does anyone actually use this attribute? And if they do, can it
>>>> really buy them anything? How about "implementing" it by
>>>> ignoring it!
>>>
>> ...
>>>
>>> Currently, SAFECode has a pass which just recognizes certain
>>> functions as allocators and knows how to interpret the arguments
>>> to find the size. If we want SAFECode to work with another
>>> allocator (like a program's custom allocator, the Objective-C
>>> allocator, the Boehm garbage collector, etc), then that pass
>>> needs to be modified to recognize it. Having to update this pass
>>> for every allocator name and type is one of the few reasons why
>>> SAFECode only works with C/C++ and not just any old language that
>>> is compiled down to LLVM IR.
>>
>>
>>> Nuno's proposed feature would allow programmers to communicate
>>> the relevant information about allocators to tools like SAFECode
>>> and ASan. I think it might also make some of the optimizations in
>>> LLVM that require knowing about allocators work on non-C/C++ code.
>>
>> these are good points. The attribute and proposed implementation
>> feel pretty clunky though, which is my main gripe.
>
> Hrm. I haven't formed an opinion on what the attributes should look
> like. I think supporting the ones established by GCC would be
> important for compatibility, and on the surface, they look
> reasonable. Devising better ones for Clang is fine with me. What
> about them feels klunky?

basically it feels like "I only know about C, here's something that
pretends to be general but only handles C". Consider a language with
a string type that contains the string length as well as the
characters. It has a library function allocate_string(length). How
much does it allocate? length+4 bytes. That can't be represented by
alloc_size. What's more, it may well store the length at the start,
and return a pointer to just after the length: a pointer to the first
character. alloc_size can't represent "the allocated memory starts 4
bytes before the return value" either. In short, it feels like a
hack for handling something that turns up in some particular C code
that someone has, rather than a general solution to the general
problem.

I think this is a good point, here's a suggestion:

Have the metadata name two functions, both assumed to have the same
signature as the tagged function, one which returns the offset of the
start of the allocated region and one which returns the length of the
allocated region. Alternatively, these functions could take the same
signature and additionally the returned pointer of the tagged
function, and then one function can return the start of the region and
the other the length.

For static analysis, we can attempt to inline these functions and then
use SCEV (dead code elimination will then get rid of the unused
results). For runtime checks, calls (which may also be inlined) can be
easily constructed.

>> Since LLVM already has utility functions for recognizing
>> allocators (i.e. that know about malloc, realloc and -fno-builtin
>> etc) can't SAFECode just make use of them?
>
> It probably could. It doesn't simply because SAFECode was written
> before these features existed within LLVM.
> :slight_smile:
>
>> Then either (1) something like alloc_size is implemented, the LLVM
>> utility learns about it, and SAFECode benefits automagically, or
>> (2) the LLVM utility is taught about other allocators like Ada's,
>> and SAFECode benefits automagically.
>
> I'm not sure what you mean by "LLVM utility," but I think we're
> thinking along the same lines. Clang/LLVM implement the alloc_size
> attributes, we change SAFECode to recognize it, and so when people
> use it, SAFECode benefits automagically.
>
> Am I right that we're thinking the same thing, or did I completely
> misunderstand you?

no, I'm thinking that SAFECode won't need to look at or worry about
the attribute at all, because the LLVM methods will know about it and
serve up the appropriate info. Take a look at
Analysis/MemoryBuiltins.h. In spite of the names, things like
extractMallocCall are dealing with "malloc like" functions, such as
C++'s "new" as well as malloc. Similarly for calloc. So you could
use those right now to extract "malloc" and "calloc" sizes. If
alloc_size is implemented, presumably these would just magically
start to give you useful sizes for functions annotated with that
attribute too.

Does the current code even handle calloc? I only see malloc and new.

-Hal

Hi John,

Hi John,

I'm implementing the alloc_size function attribute in clang.

does anyone actually use this attribute? And if they do, can it really buy
them anything? How about "implementing" it by ignoring it!

...

Currently, SAFECode has a pass which just recognizes certain functions as
allocators and knows how to interpret the arguments to find the size. If we want
SAFECode to work with another allocator (like a program's custom allocator, the
Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
to be modified to recognize it. Having to update this pass for every allocator
name and type is one of the few reasons why SAFECode only works with C/C++ and
not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant
information about allocators to tools like SAFECode and ASan. I think it might
also make some of the optimizations in LLVM that require knowing about
allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Hrm. I haven't formed an opinion on what the attributes should look like. I
think supporting the ones established by GCC would be important for
compatibility, and on the surface, they look reasonable. Devising better ones
for Clang is fine with me. What about them feels klunky?

basically it feels like "I only know about C, here's something that pretends to
be general but only handles C". Consider a language with a string type that
contains the string length as well as the characters. It has a library function
allocate_string(length). How much does it allocate? length+4 bytes. That can't
be represented by alloc_size. What's more, it may well store the length at the
start, and return a pointer to just after the length: a pointer to the first
character. alloc_size can't represent "the allocated memory starts 4 bytes
before the return value" either. In short, it feels like a hack for handling
something that turns up in some particular C code that someone has, rather than
a general solution to the general problem.

True. It also doesn't handle a number of "C" allocators like strdup(), etc. Making it that general, though, may be tricky, and I don't think it negates the utility of the simpler form. I suspect a fair number of allocators could be described by the alloc_size feature.

Even in the C/C++ world, I think it'd be useful. There's the GC_malloc() in Boehm's garbage collector, kmalloc() in the Linux kernel, malloc wrappers in applications, memalign(), etc.

Since LLVM already has utility functions for recognizing allocators (i.e. that
know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
of them?

It probably could. It doesn't simply because SAFECode was written before these
features existed within LLVM.
:slight_smile:

[snip]

no, I'm thinking that SAFECode won't need to look at or worry about the
attribute at all, because the LLVM methods will know about it and serve
up the appropriate info. Take a look at Analysis/MemoryBuiltins.h. In
spite of the names, things like extractMallocCall are dealing with "malloc
like" functions, such as C++'s "new" as well as malloc. Similarly for
calloc. So you could use those right now to extract "malloc" and "calloc"
sizes. If alloc_size is implemented, presumably these would just magically
start to give you useful sizes for functions annotated with that attribute too.

I see. That makes sense.

-- John T.

Currently, SAFECode has a pass which just recognizes certain functions as
allocators and knows how to interpret the arguments to find the size. If we want
SAFECode to work with another allocator (like a program's custom allocator, the
Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
to be modified to recognize it. Having to update this pass for every allocator
name and type is one of the few reasons why SAFECode only works with C/C++ and
not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant
information about allocators to tools like SAFECode and ASan. I think it might
also make some of the optimizations in LLVM that require knowing about
allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Hrm. I haven't formed an opinion on what the attributes should look like. I
think supporting the ones established by GCC would be important for
compatibility, and on the surface, they look reasonable. Devising better ones
for Clang is fine with me. What about them feels klunky?

basically it feels like "I only know about C, here's something that pretends to
be general but only handles C". Consider a language with a string type that
contains the string length as well as the characters. It has a library function
allocate_string(length). How much does it allocate? length+4 bytes. That can't
be represented by alloc_size. What's more, it may well store the length at the
start, and return a pointer to just after the length: a pointer to the first
character. alloc_size can't represent "the allocated memory starts 4 bytes
before the return value" either. In short, it feels like a hack for handling
something that turns up in some particular C code that someone has, rather than
a general solution to the general problem.

It's not a general solution, and not it even for C, of course.
But it's very useful for applications that have their own malloc wrappers and implementations. For example, LLVM, which has its own allocators! Without this metadata, you'll never be able to analyze LLVM's code at all. It's simply impossible to detect, in general, if a function is a custom allocator.
So, yes, some metadata is necessary. I agree my proposal is not general enough for all applications. For example, I run the tool over some code yesterday and I found an allocator that is the following: alloc(x, y, x) and allocates 'x * y + z' bytes. And that cannot be represented either at source-code level (with GCC's attribute) nor at IR level following my metadata proposal.
I'm happy to implement something more general if we come up with a better design.

Nuno

Just a nitpick here: in some cases, you can detect if a function is a wrapper around an allocator. A simple data-flow analysis can detect whether the function's return value is always the result of a known allocator and whether the parameters to the known allocator are a function of the function's arguments.

This can work for some allocators, but not all, and I don't know how well it works in practice. However, it's not technically impossible all the time.
:slight_smile:

I think the poolalloc project has an implementation of this analysis: poolalloc/lib/DSA/AllocatorIdentification.cpp.

-- John T.

Ok, so this seems to be the most general proposal, which can obviously handle all cases.
Something like this would work:

define i8* @foo() {
   %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
   %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind, !alloc_size !{i32 %0}
   ret i8* %call
}

Basically I just added a function call as the metadata (it's not currently possible to add the function itself to the metadata; the function call is required instead).
As long as the function is marked as readnone, I think it shouldn't interfere with the optimizers, and we can have a later pass to drop the metadata and remove the calls. I still don't like having the explicit calls there, though. Any suggestions to remove the functions calls from there?

I feel that the offset function is probably not required. I've never seen an allocation function that doesn't return a pointer to the beginning of the allocated buffer. Also, I cannot remember of any function in the C library that has that behavior.

We will also need a convenient syntax to export this feature in the languages we support.
I personally would like to see '__attribute__((alloc_size( strlen(x)+1 ))' in C, but the implementation seems to be non-trivial.

About Duncan's comment about having the memory builtin analysis recognize this intrinsic, well I agree it should (and I'll take care of that), but I'm not sure if we should be very aggressive in optimizing based on this metadata.
For example, do we really want to remove a call to a custom allocator whose return value is unused (like we do for malloc)? If so, we'll also need a metadata node to mark de-allocation functions (so that sequences like my_free(my_malloc(xx)) are removed).

Any feedback on the issues described is highly appreciated!

Thanks,
Nuno

Hi Nuno,

I think this is a good point, here's a suggestion:

Have the metadata name two functions, both assumed to have the same
signature as the tagged function, one which returns the offset of the
start of the allocated region and one which returns the length of the
allocated region. Alternatively, these functions could take the same
signature and additionally the returned pointer of the tagged
function, and then one function can return the start of the region and
the other the length.

Ok, so this seems to be the most general proposal, which can obviously
handle all cases.

I agree. Variation: have one function return the offset of the start of
the memory, and the other the offset of the end of the memory (or the end
plus 1), i.e. a range. This seems more uniform to me, but I don't have a
strong preference.

Something like this would work:

define i8* @foo() {
    %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
    %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind,
!alloc_size !{i32 %0}
    ret i8* %call
}

Basically I just added a function call as the metadata (it's not
currently possible to add the function itself to the metadata; the
function call is required instead).
As long as the function is marked as readnone, I think it shouldn't
interfere with the optimizers, and we can have a later pass to drop
the metadata and remove the calls. I still don't like having the
explicit calls there, though. Any suggestions to remove the functions
calls from there?

How about this:

define i32 @lo(i32) {
   ret i32 0
}

define i32 @hi(i32 %n) {
   ret i32 %n
}

declare i8* @wonder_allocator(i32)

define i8* @foo(i32 %n) {
   %r = call i8* @wonder_allocator(i32 %n), !alloc !0
   ret i8* %r
}

!0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }

The main problem I see is that if you declare @lo and @hi to have internal
linkage then the optimizers will zap them. Maybe there's a neat solution
to that.

I feel that the offset function is probably not required. I've never
seen an allocation function that doesn't return a pointer to the
beginning of the allocated buffer. Also, I cannot remember of any
function in the C library that has that behavior.

Yes, in C you probably never see such a thing, but we are not just dealing
with C here. I think it is important to have the start offset as well as the
length.

We will also need a convenient syntax to export this feature in the
languages we support.

Actually, no you don't. You could just implement GCC's alloc_size in terms of
this, at least for the moment. Even in the long term it's probably pretty
pointless for clang to ever expose the start offset functionality since clang
only supports C-like languages and probably (as you mentioned) this is pretty
useless for them.

I personally would like to see '__attribute__((alloc_size( strlen(x)+1
))' in C, but the implementation seems to be non-trivial.

About Duncan's comment about having the memory builtin analysis
recognize this intrinsic, well I agree it should (and I'll take care
of that), but I'm not sure if we should be very aggressive in
optimizing based on this metadata.

It would be great for understanding that loads/stores from/to outside the bounds
of the allocation result in undef. I think the optimizers already exploit this
kind of info in the case of alloca - maybe this helps generalize to heap
allocations.

For example, do we really want to remove a call to a custom allocator
whose return value is unused (like we do for malloc)?

No we don't, so LLVM's interface to malloc-like and calloc-like things would
have to be reworked to extract out different kinds of knowledge.

   If so, we'll

also need a metadata node to mark de-allocation functions (so that
sequences like my_free(my_malloc(xx)) are removed).

Maybe!

Ciao, Duncan.

>> >> Hi John,
>> >>
>> >>>>> I'm implementing the alloc_size function attribute in clang.
>> >>>> does anyone actually use this attribute? And if they do, can
>> >>>> it really buy them anything? How about "implementing" it by
>> >>>> ignoring it!
>> >>>
>> >> ...
>> >>>
>> >>> Currently, SAFECode has a pass which just recognizes certain
>> >>> functions as allocators and knows how to interpret the
>> >>> arguments to find the size. If we want SAFECode to work with
>> >>> another allocator (like a program's custom allocator, the
>> >>> Objective-C allocator, the Boehm garbage collector, etc), then
>> >>> that pass needs to be modified to recognize it. Having to
>> >>> update this pass for every allocator name and type is one of
>> >>> the few reasons why SAFECode only works with C/C++ and not
>> >>> just any old language that is compiled down to LLVM IR.
>> >>
>> >>
>> >>> Nuno's proposed feature would allow programmers to communicate
>> >>> the relevant information about allocators to tools like
>> >>> SAFECode and ASan. I think it might also make some of the
>> >>> optimizations in LLVM that require knowing about allocators
>> >>> work on non-C/C++ code.
>> >>
>> >> these are good points. The attribute and proposed implementation
>> >> feel pretty clunky though, which is my main gripe.
>> >
>> > Hrm. I haven't formed an opinion on what the attributes should
>> > look like. I think supporting the ones established by GCC would
>> > be important for compatibility, and on the surface, they look
>> > reasonable. Devising better ones for Clang is fine with me. What
>> > about them feels klunky?
>>
>> basically it feels like "I only know about C, here's something that
>> pretends to be general but only handles C". Consider a language
>> with a string type that contains the string length as well as the
>> characters. It has a library function allocate_string(length).
>> How much does it allocate? length+4 bytes. That can't be
>> represented by alloc_size. What's more, it may well store the
>> length at the start, and return a pointer to just after the
>> length: a pointer to the first character. alloc_size can't
>> represent "the allocated memory starts 4 bytes before the return
>> value" either. In short, it feels like a hack for handling
>> something that turns up in some particular C code that someone
>> has, rather than a general solution to the general problem.
>
> I think this is a good point, here's a suggestion:
>
> Have the metadata name two functions, both assumed to have the same
> signature as the tagged function, one which returns the offset of
> the start of the allocated region and one which returns the length
> of the allocated region. Alternatively, these functions could take
> the same signature and additionally the returned pointer of the
> tagged function, and then one function can return the start of the
> region and the other the length.

Ok, so this seems to be the most general proposal, which can
obviously handle all cases.
Something like this would work:

define i8* @foo() {
   %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
   %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind,
!alloc_size !{i32 %0}
   ret i8* %call
}

Basically I just added a function call as the metadata (it's not
currently possible to add the function itself to the metadata; the
function call is required instead).

I had in mind essentially what Duncan wrote in his response (just
adding the function names, and using some rule to construct the
function calls from the allocator calls).

As long as the function is marked as readnone, I think it shouldn't
interfere with the optimizers, and we can have a later pass to drop
the metadata and remove the calls. I still don't like having the
explicit calls there, though. Any suggestions to remove the
functions calls from there?

I feel that the offset function is probably not required. I've never
seen an allocation function that doesn't return a pointer to the
beginning of the allocated buffer. Also, I cannot remember of any
function in the C library that has that behavior.

Isn't this what posix_memalign does? Anything that needs to 'round up'
the alignment will return a pointer which may not point to the
beginning of the buffer. As Duncan noted, there are other uses as well.

We will also need a convenient syntax to export this feature in the
languages we support.
I personally would like to see
'__attribute__((alloc_size( strlen(x)+1 ))' in C, but the
implementation seems to be non-trivial.

IMHO, this would be a very useful feature.

About Duncan's comment about having the memory builtin analysis
recognize this intrinsic, well I agree it should (and I'll take care
of that), but I'm not sure if we should be very aggressive in
optimizing based on this metadata.
For example, do we really want to remove a call to a custom
allocator whose return value is unused (like we do for malloc)? If
so, we'll also need a metadata node to mark de-allocation functions
(so that sequences like my_free(my_malloc(xx)) are removed).

You might want to restrict that to custom allocators that have an
additional 'removable' attribute.

Out of curiosity, does any of this potentially effect the presence or
absence of 'inbounds' on GEP instructions?

Thanks again,
Hal

Hi Nuno,

>> I think this is a good point, here's a suggestion:
>>
>> Have the metadata name two functions, both assumed to have the same
>> signature as the tagged function, one which returns the offset of
>> the start of the allocated region and one which returns the length
>> of the allocated region. Alternatively, these functions could take
>> the same signature and additionally the returned pointer of the
>> tagged function, and then one function can return the start of the
>> region and the other the length.
>
> Ok, so this seems to be the most general proposal, which can
> obviously handle all cases.

I agree. Variation: have one function return the offset of the start
of the memory, and the other the offset of the end of the memory (or
the end plus 1), i.e. a range. This seems more uniform to me, but I
don't have a strong preference.

> Something like this would work:
>
> define i8* @foo() {
> %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
> %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind,
> !alloc_size !{i32 %0}
> ret i8* %call
> }
>
> Basically I just added a function call as the metadata (it's not
> currently possible to add the function itself to the metadata; the
> function call is required instead).
> As long as the function is marked as readnone, I think it shouldn't
> interfere with the optimizers, and we can have a later pass to drop
> the metadata and remove the calls. I still don't like having the
> explicit calls there, though. Any suggestions to remove the
> functions calls from there?

How about this:

define i32 @lo(i32) {
   ret i32 0
}

define i32 @hi(i32 %n) {
   ret i32 %n
}

declare i8* @wonder_allocator(i32)

define i8* @foo(i32 %n) {
   %r = call i8* @wonder_allocator(i32 %n), !alloc !0
   ret i8* %r
}

!0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }

This is the format that I had in mind.

The main problem I see is that if you declare @lo and @hi to have
internal linkage then the optimizers will zap them. Maybe there's a
neat solution to that.

I would consider the optimizer doing this a feature, not a problem.
That having been said, we need to make sure that the optimzer does not
zap them before the analysis/instrumentation passes get to run.

> I feel that the offset function is probably not required. I've never
> seen an allocation function that doesn't return a pointer to the
> beginning of the allocated buffer. Also, I cannot remember of any
> function in the C library that has that behavior.

Yes, in C you probably never see such a thing, but we are not just
dealing with C here. I think it is important to have the start
offset as well as the length.

> We will also need a convenient syntax to export this feature in the
> languages we support.

Actually, no you don't. You could just implement GCC's alloc_size in
terms of this, at least for the moment. Even in the long term it's
probably pretty pointless for clang to ever expose the start offset
functionality since clang only supports C-like languages and probably
(as you mentioned) this is pretty useless for them.

As I mentioned in my other response, posix_memalign and friends do
this. However, writing to the memory prior to the returned pointer
would still be an error, so perhaps this does not matter.

-Hal

>> I think this is a good point, here's a suggestion:
>>
>> Have the metadata name two functions, both assumed to have the same
>> signature as the tagged function, one which returns the offset of
>> the start of the allocated region and one which returns the length
>> of the allocated region. Alternatively, these functions could take
>> the same signature and additionally the returned pointer of the
>> tagged function, and then one function can return the start of the
>> region and the other the length.
>
> Ok, so this seems to be the most general proposal, which can
> obviously handle all cases.

I agree. Variation: have one function return the offset of the start
of the memory, and the other the offset of the end of the memory (or
the end plus 1), i.e. a range. This seems more uniform to me, but I
don't have a strong preference.

> Something like this would work:
>
> define i8* @foo() {
> %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
> %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind,
> !alloc_size !{i32 %0}
> ret i8* %call
> }
>
> Basically I just added a function call as the metadata (it's not
> currently possible to add the function itself to the metadata; the
> function call is required instead).
> As long as the function is marked as readnone, I think it shouldn't
> interfere with the optimizers, and we can have a later pass to drop
> the metadata and remove the calls. I still don't like having the
> explicit calls there, though. Any suggestions to remove the
> functions calls from there?

How about this:

define i32 @lo(i32) {
   ret i32 0
}

define i32 @hi(i32 %n) {
   ret i32 %n
}

declare i8* @wonder_allocator(i32)

define i8* @foo(i32 %n) {
   %r = call i8* @wonder_allocator(i32 %n), !alloc !0
   ret i8* %r
}

!0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }

This is the format that I had in mind.

The main problem I see is that if you declare @lo and @hi to have
internal linkage then the optimizers will zap them. Maybe there's a
neat solution to that.

I would consider the optimizer doing this a feature, not a problem.
That having been said, we need to make sure that the optimzer does not
zap them before the analysis/instrumentation passes get to run.

This is actually non-trivial to accomplish.
Metadata doesn't count as a user, so internal functions with no other usage will get removed. On the other hand, we could use @llvm.used to make sure the functions had (at least) one user, but that's probably equivalent to not using internal linkage.
And I still want to make sure that these functions disappear in the final binary..

Another thing that bothers me is the implementation on the objectsize intrinsic. This intrinsic returns the *constant* size of the pointed object given as argument (if the object has a constant size). However, with this function scheme, the implementation would be a bit heavy, since it would need to inline the @lo and @hi functions, simplify the resulting expression, and then check if the result is a ConstantInt. And remember that in general these functions can be arbitrary complex.

> I feel that the offset function is probably not required. I've never
> seen an allocation function that doesn't return a pointer to the
> beginning of the allocated buffer. Also, I cannot remember of any
> function in the C library that has that behavior.

Yes, in C you probably never see such a thing, but we are not just
dealing with C here. I think it is important to have the start
offset as well as the length.

Ok, that makes the whole thing more general. But is the added complexity worth it? I mean, if it won't have any expected users, then probably it's not worth it. I don't know much about ADA or other languages, so I'm just asking if there is a front-end that may use this functionality.

> We will also need a convenient syntax to export this feature in the
> languages we support.

Actually, no you don't. You could just implement GCC's alloc_size in
terms of this, at least for the moment. Even in the long term it's
probably pretty pointless for clang to ever expose the start offset
functionality since clang only supports C-like languages and probably
(as you mentioned) this is pretty useless for them.

As I mentioned in my other response, posix_memalign and friends do
this. However, writing to the memory prior to the returned pointer
would still be an error, so perhaps this does not matter.

Uhm, that's not what I understand from the man page. It even says that the returned pointer can be used with free and realloc.

Nuno

About Duncan's comment about having the memory builtin analysis
recognize this intrinsic, well I agree it should (and I'll take care
of that), but I'm not sure if we should be very aggressive in
optimizing based on this metadata.
For example, do we really want to remove a call to a custom
allocator whose return value is unused (like we do for malloc)? If
so, we'll also need a metadata node to mark de-allocation functions
(so that sequences like my_free(my_malloc(xx)) are removed).

You might want to restrict that to custom allocators that have an
additional 'removable' attribute.

Out of curiosity, does any of this potentially effect the presence or
absence of 'inbounds' on GEP instructions?

No, I don't think so. The inbounds flag means that the offset computation will not overflow. Here we are only discussing the discovery of the object size (potentially pointed by a GEP). As Duncan pointed out, knowing an object's size may be used to expose more undefined behavior to optimizers if we prove that the index will fall out of bounds.

Nuno

>> >> I think this is a good point, here's a suggestion:
>> >>
>> >> Have the metadata name two functions, both assumed to have the
>> >> same signature as the tagged function, one which returns the
>> >> offset of the start of the allocated region and one which
>> >> returns the length of the allocated region. Alternatively,
>> >> these functions could take the same signature and additionally
>> >> the returned pointer of the tagged function, and then one
>> >> function can return the start of the region and the other the
>> >> length.
>> >
>> > Ok, so this seems to be the most general proposal, which can
>> > obviously handle all cases.
>>
>> I agree. Variation: have one function return the offset of the
>> start of the memory, and the other the offset of the end of the
>> memory (or the end plus 1), i.e. a range. This seems more uniform
>> to me, but I don't have a strong preference.
>>
>> > Something like this would work:
>> >
>> > define i8* @foo() {
>> > %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
>> > %call = tail call i8* @my_recalloc(i8* null, i32 42)
>> > nounwind, !alloc_size !{i32 %0}
>> > ret i8* %call
>> > }
>> >
>> > Basically I just added a function call as the metadata (it's not
>> > currently possible to add the function itself to the metadata;
>> > the function call is required instead).
>> > As long as the function is marked as readnone, I think it
>> > shouldn't interfere with the optimizers, and we can have a later
>> > pass to drop the metadata and remove the calls. I still don't
>> > like having the explicit calls there, though. Any suggestions
>> > to remove the functions calls from there?
>>
>> How about this:
>>
>> define i32 @lo(i32) {
>> ret i32 0
>> }
>>
>> define i32 @hi(i32 %n) {
>> ret i32 %n
>> }
>>
>> declare i8* @wonder_allocator(i32)
>>
>> define i8* @foo(i32 %n) {
>> %r = call i8* @wonder_allocator(i32 %n), !alloc !0
>> ret i8* %r
>> }
>>
>> !0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }
>
> This is the format that I had in mind.
>
>>
>>
>> The main problem I see is that if you declare @lo and @hi to have
>> internal linkage then the optimizers will zap them. Maybe there's
>> a neat solution to that.
>
> I would consider the optimizer doing this a feature, not a problem.
> That having been said, we need to make sure that the optimzer does
> not zap them before the analysis/instrumentation passes get to run.

This is actually non-trivial to accomplish.
Metadata doesn't count as a user, so internal functions with no
other usage will get removed.

I thought that it is possible to have passes run before the optimizer
performs such deletions. Is this not practical? Another option is to
change the current implementation delete such functions in two phases:
in the first phase we leave functions with metadata references. In the
second phase (which runs near the end of the pipeline) we delete
functions regardless of metadata references.

On the other hand, we could use
@llvm.used to make sure the functions had (at least) one user, but
that's probably equivalent to not using internal linkage.
And I still want to make sure that these functions disappear in the
final binary..

Another thing that bothers me is the implementation on the
objectsize intrinsic. This intrinsic returns the *constant* size of
the pointed object given as argument (if the object has a constant
size). However, with this function scheme, the implementation would
be a bit heavy, since it would need to inline the @lo and @hi
functions, simplify the resulting expression, and then check if the
result is a ConstantInt. And remember that in general these functions
can be arbitrary complex.

I agree; we'd need to use SCEV or some other heavyweight mechanism to
do the analysis. In some sense, however, that would be the price of
generality. On the other hand, I see no reason why we could not write a
utility function that could accomplish all of that, so we'd only need
to work out the details once.

I think that this kind of issue will come up again in the future. Any
time someone asks, "how can a frontend pass <some complicated
constraint or information> to the backend", this kind of functionality
will be needed.

>> > I feel that the offset function is probably not required. I've
>> > never seen an allocation function that doesn't return a pointer
>> > to the beginning of the allocated buffer. Also, I cannot
>> > remember of any function in the C library that has that behavior.
>>
>> Yes, in C you probably never see such a thing, but we are not just
>> dealing with C here. I think it is important to have the start
>> offset as well as the length.

Ok, that makes the whole thing more general. But is the added
complexity worth it? I mean, if it won't have any expected users,
then probably it's not worth it. I don't know much about ADA or
other languages, so I'm just asking if there is a front-end that may
use this functionality.

>> > We will also need a convenient syntax to export this feature in
>> > the languages we support.
>>
>> Actually, no you don't. You could just implement GCC's alloc_size
>> in terms of this, at least for the moment. Even in the long term
>> it's probably pretty pointless for clang to ever expose the start
>> offset functionality since clang only supports C-like languages
>> and probably (as you mentioned) this is pretty useless for them.
>
> As I mentioned in my other response, posix_memalign and friends do
> this. However, writing to the memory prior to the returned pointer
> would still be an error, so perhaps this does not matter.

Uhm, that's not what I understand from the man page. It even says
that the returned pointer can be used with free and realloc.

Fair enough; I retract my comment re: posix_memalign. What is true is
that custom allocators often use the system's allocators, add some kind
of header, and then return the bytes after the header. That might be irrelevant, however,
if accessing the bytes prior to the pointer is still an invalid action.
I have seen code where accessing the bytes in that header *is* a valid
action (such bytes contain pointers to memory pool structures, etc.).
I'm not sure supporting such codes in worth the effort, but I certainly
know that they exist.

-Hal