alloc_size metadata

>> >> I think this is a good point, here's a suggestion:
>> >>
>> >> Have the metadata name two functions, both assumed to have the
>> >> same signature as the tagged function, one which returns the
>> >> offset of the start of the allocated region and one which
>> >> returns the length of the allocated region. Alternatively,
>> >> these functions could take the same parameters as the tagged
>> >> function plus its returned pointer, and then one function can
>> >> return the start of the region and the other the length.
>> >
>> > Ok, so this seems to be the most general proposal, which can
>> > obviously handle all cases.
>>
>> I agree. Variation: have one function return the offset of the
>> start of the memory, and the other the offset of the end of the
>> memory (or the end plus 1), i.e. a range. This seems more uniform
>> to me, but I don't have a strong preference.
>>
>> > Something like this would work:
>> >
>> > define i8* @foo() {
>> > %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
>> > %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind, !alloc_size !{i32 %0}
>> > ret i8* %call
>> > }
>> >
>> > Basically I just added a function call as the metadata (it's not
>> > currently possible to add the function itself to the metadata;
>> > the function call is required instead).
>> > As long as the function is marked as readnone, I think it
>> > shouldn't interfere with the optimizers, and we can have a later
>> > pass to drop the metadata and remove the calls. I still don't
>> > like having the explicit calls there, though. Any suggestions
>> > on how to remove the function calls from there?
>>
>> How about this:
>>
>> define i32 @lo(i32) {
>> ret i32 0
>> }
>>
>> define i32 @hi(i32 %n) {
>> ret i32 %n
>> }
>>
>> declare i8* @wonder_allocator(i32)
>>
>> define i8* @foo(i32 %n) {
>> %r = call i8* @wonder_allocator(i32 %n), !alloc !0
>> ret i8* %r
>> }
>>
>> !0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }
>
> This is the format that I had in mind.
>
>>
>> The main problem I see is that if you declare @lo and @hi to have
>> internal linkage then the optimizers will zap them. Maybe there's
>> a neat solution to that.
>
> I would consider the optimizer doing this a feature, not a problem.
> That having been said, we need to make sure that the optimizer does
> not zap them before the analysis/instrumentation passes get to run.

>> This is actually non-trivial to accomplish.
>> Metadata doesn't count as a user, so internal functions with no
>> other usage will get removed.

> I thought that it is possible to have passes run before the optimizer
> performs such deletions. Is this not practical? Another option is to
> change the current implementation to delete such functions in two
> phases: in the first phase we leave functions with metadata
> references. In the second phase (which runs near the end of the
> pipeline) we delete functions regardless of metadata references.

Right now, if you list the users of a Value, the references coming from
metadata won't appear. Metadata is not a user and doesn't count towards
the number of uses of a value. That's why anything referenced only from
metadata or constant expressions risks disappearing.
Leaving unused functions to be removed only after all optimizations
could be done. But then you would probably want to, for example, patch
the pass manager so that it doesn't run a function pass over dead
functions, and so on.

On the other hand, we could use @llvm.used to make sure the functions
had (at least) one user, but that's probably equivalent to not using
internal linkage.
And I still want to make sure that these functions disappear from the
final binary.
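For concreteness, tagging @lo and @hi via @llvm.used would look roughly
like this (just a sketch; the bitcasts are only there to fit the i8*
array type):

  @llvm.used = appending global [2 x i8*] [
      i8* bitcast (i32 (i32)* @lo to i8*),
      i8* bitcast (i32 (i32)* @hi to i8*)
    ], section "llvm.metadata"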

>> Another thing that bothers me is the implementation of the
>> objectsize intrinsic. This intrinsic returns the *constant* size of
>> the object pointed to by its argument (if the object has a constant
>> size). However, with this function scheme, the implementation would
>> be a bit heavy, since it would need to inline the @lo and @hi
>> functions, simplify the resulting expression, and then check if the
>> result is a ConstantInt. And remember that in general these
>> functions can be arbitrarily complex.

> I agree; we'd need to use SCEV or some other heavyweight mechanism to
> do the analysis. In some sense, however, that would be the price of
> generality. On the other hand, I see no reason why we could not write
> a utility function that could accomplish all of that, so we'd only
> need to work out the details once.

SCEV is not the answer here. You just want to know if the result of a function is constant given a set of parameters. Inlining + simplifications should do it. But doing an inlining trial is expensive.

> I think that this kind of issue will come up again in the future. Any
> time someone asks, "how can a frontend pass <some complicated
> constraint or information> to the backend", this kind of
> functionality will be needed.

Yes. Maybe we should have a separate mini-expression language for the metadata? I dunno if it's worth the effort, though.

Nuno

Hi Nuno,

>>> This is actually non-trivial to accomplish.
>>> Metadata doesn't count as a user, so internal functions with no
>>> other usage will get removed.

>> I thought that it is possible to have passes run before the optimizer
>> performs such deletions. Is this not practical? Another option is to
>> change the current implementation to delete such functions in two
>> phases: in the first phase we leave functions with metadata
>> references. In the second phase (which runs near the end of the
>> pipeline) we delete functions regardless of metadata references.

> Right now, if you list the users of a Value, the references coming
> from metadata won't appear. Metadata is not a user and doesn't count
> towards the number of uses of a value. That's why anything referenced
> only from metadata or constant expressions risks disappearing.
> Leaving unused functions to be removed only after all optimizations
> could be done. But then you would probably want to, for example,
> patch the pass manager so that it doesn't run a function pass over
> dead functions, and so on.

the functions could be declared to have linkonce_odr linkage. That way
they will be zapped after the inliner runs, but shouldn't be removed
before.
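E.g. (sketch):

  define linkonce_odr i32 @lo(i32 %n) {
    ret i32 0
  }

Once no calls remain after inlining, a linkonce_odr definition like
this is discardable.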

>>> Another thing that bothers me is the implementation of the
>>> objectsize intrinsic. This intrinsic returns the *constant* size of
>>> the object pointed to by its argument (if the object has a constant
>>> size). However, with this function scheme, the implementation would
>>> be a bit heavy, since it would need to inline the @lo and @hi
>>> functions, simplify the resulting expression, and then check if the
>>> result is a ConstantInt. And remember that in general these
>>> functions can be arbitrarily complex.

>> I agree; we'd need to use SCEV or some other heavyweight mechanism to
>> do the analysis. In some sense, however, that would be the price of
>> generality. On the other hand, I see no reason why we could not write
>> a utility function that could accomplish all of that, so we'd only
>> need to work out the details once.

> SCEV is not the answer here. You just want to know if the result of a
> function is constant given a set of parameters. Inlining +
> simplifications should do it. But doing an inlining trial is expensive.

The hi/lo functions could be declared always_inline. Thus they will always
be inlined, either by the always-inliner pass or the usual one. You would
need to insert the instrumentation code or whatever that uses hi/lo before
any inliner runs, and run optimizations such as turning objectsize into a
constant after the inliner runs.
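I.e. something like (sketch):

  define i32 @hi(i32 %n) alwaysinline {
    ret i32 %n
  }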

Ciao, Duncan.

Hi,

Sorry for the delay; comments below.

>>>> This is actually non-trivial to accomplish.
>>>> Metadata doesn't count as a user, so internal functions with no
>>>> other usage will get removed.

>>> I thought that it is possible to have passes run before the
>>> optimizer performs such deletions. Is this not practical? Another
>>> option is to change the current implementation to delete such
>>> functions in two phases: in the first phase we leave functions with
>>> metadata references. In the second phase (which runs near the end
>>> of the pipeline) we delete functions regardless of metadata
>>> references.

>> Right now, if you list the users of a Value, the references coming
>> from metadata won't appear. Metadata is not a user and doesn't count
>> towards the number of uses of a value. That's why anything
>> referenced only from metadata or constant expressions risks
>> disappearing.
>> Leaving unused functions to be removed only after all optimizations
>> could be done. But then you would probably want to, for example,
>> patch the pass manager so that it doesn't run a function pass over
>> dead functions, and so on.

> the functions could be declared to have linkonce_odr linkage. That way
> they will be zapped after the inliner runs, but shouldn't be removed
> before.

I'm certainly not convinced. You cannot force all analyses to be run
before inlining. You're basically saying that all passes that do
analysis on buffer sizes must run quite early. The inliner is run
pretty early! At least in the case of the buffer overflow pass, I want
it to run late, after most cleanups have been done. ASan does exactly
the same.

>>>> Another thing that bothers me is the implementation of the
>>>> objectsize intrinsic. This intrinsic returns the *constant* size
>>>> of the object pointed to by its argument (if the object has a
>>>> constant size). However, with this function scheme, the
>>>> implementation would be a bit heavy, since it would need to inline
>>>> the @lo and @hi functions, simplify the resulting expression, and
>>>> then check if the result is a ConstantInt. And remember that in
>>>> general these functions can be arbitrarily complex.

>>> I agree; we'd need to use SCEV or some other heavyweight mechanism
>>> to do the analysis. In some sense, however, that would be the price
>>> of generality. On the other hand, I see no reason why we could not
>>> write a utility function that could accomplish all of that, so we'd
>>> only need to work out the details once.

>> SCEV is not the answer here. You just want to know if the result of
>> a function is constant given a set of parameters. Inlining +
>> simplifications should do it. But doing an inlining trial is
>> expensive.

> The hi/lo functions could be declared always_inline. Thus they will
> always be inlined, either by the always-inliner pass or the usual one.
> You would need to insert the instrumentation code or whatever that
> uses hi/lo before any inliner runs, and run optimizations such as
> turning objectsize into a constant after the inliner runs.

The semantics of the objectsize intrinsic is that it returns a constant
value if it can figure out the object size, and returns 0/-1 otherwise.
So you cannot simply inline the functions and hope for the best. You
need to run an inline trial: inline; try to fold the resulting
expression into a constant; remove the inlined code if it didn't fold
to a constant.
You may say this is the price of generality. I don't know how slow it
would be, though.
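For reference, a sketch of the intrinsic's contract (the i32 variant):

  declare i32 @llvm.objectsize.i32(i8*, i1)

  ; i1 false: folds to the object size if known, -1 otherwise
  ; i1 true:  folds to the object size if known, 0 otherwise
  %size = call i32 @llvm.objectsize.i32(i8* %ptr, i1 false)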

Today, after playing around with these things, I found another problem: inlining functions with this alloc metadata. Assuming that we attach the metadata to call sites in the front-end, if the function later gets inlined, then the metadata is lost. We can, however, allow the metadata to be attached to arbitrary instructions, so that the inliner can be taught to attach it to the returned expression.
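Roughly like this (hypothetical; @my_alloc and @malloc are just
placeholders):

  ; before inlining: the metadata sits on the call site
  %r = call i8* @my_alloc(i32 %n), !alloc_size !0

  ; after inlining @my_alloc, the inliner would re-attach the tag to
  ; the instruction that produces the returned pointer
  %r.i = call i8* @malloc(i32 %n), !alloc_size !0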

Nuno

Hi,

Sorry for the delay; comments below.

>>>>> This is actually non-trivial to accomplish.
>>>>> Metadata doesn't count as a user, so internal functions with no
>>>>> other usage will get removed.
>>>>
>>>> I thought that it is possible to have passes run before the
>>>> optimizer performs such deletions. Is this not practical? Another
>>>> option is to change the current implementation to delete such
>>>> functions in two phases: in the first phase we leave functions
>>>> with metadata references. In the second phase (which runs near
>>>> the end of the pipeline) we delete functions regardless of
>>>> metadata references.
>>>
>>> Right now, if you list the users of a Value, the references coming
>>> from metadata won't appear. Metadata is not a user and doesn't
>>> count towards the number of uses of a value. That's why anything
>>> referenced only from metadata or constant expressions risks
>>> disappearing.
>>> Leaving unused functions to be removed only after all optimizations
>>> could be done. But then you would probably want to, for example,
>>> patch the pass manager so that it doesn't run a function pass over
>>> dead functions, and so on.

Yes; I think the following would be better: For all functions that are
unused but still referenced by metadata, queue the passes on them that
would have run. If a pass then wants to inline one of these functions,
those queued passes can be run first.

>>
>> the functions could be declared to have linkonce_odr linkage. That
>> way they will be zapped after the inliner runs, but shouldn't be
>> removed before.

> I'm certainly not convinced. You cannot force all analyses to be run
> before inlining. You're basically saying that all passes that do
> analysis on buffer sizes must run quite early.

I don't think that anyone said that ;) -- but even if it were true, I
think the premise is incorrect. What is true is that analysis that
deals with tracking things tied to specific call sites should run prior
to inlining (which must be true because inlining can otherwise make
those call sites disappear, merge them with other calls, etc.).

To do bounds checking you need two things: First you need to know the
bounds (this requires tracking calls to allocation functions), and then
you need to look at memory accesses. My guess is that running the
analysis late helps much more with the second part than with the first.
So I would split this into two pieces. Prior to inlining, add whatever
is necessary around each call site so that you get the bounds data
that you need. You can tag these resulting values so that they're easily
recognizable to the later parts of the analysis (you might need to
artificially make these 'used' so that DCE won't get rid of them).
Then, after more cleanup has been done by other optimization passes,
run the pass that instruments the memory accesses (then DCE anything
that you did not end up actually using).
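Concretely, the early phase might rewrite each allocation call roughly
like this (a sketch; the @lo/@hi marker calls are illustrative):

  %r  = call i8* @wonder_allocator(i32 %n)
  %lo = call i32 @lo(i32 %n)   ; bounds materialized early, artificially
  %hi = call i32 @hi(i32 %n)   ; kept 'used' so DCE spares them

The late pass would then instrument the loads/stores against %lo/%hi
and DCE whatever ended up unused.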

> The inliner is run pretty early! At least in the case of the buffer
> overflow pass, I want it to run late, after most cleanups have been
> done. ASan does exactly the same.

>>>>> Another thing that bothers me is the implementation of the
>>>>> objectsize intrinsic. This intrinsic returns the *constant* size
>>>>> of the object pointed to by its argument (if the object has a
>>>>> constant size). However, with this function scheme, the
>>>>> implementation would be a bit heavy, since it would need to
>>>>> inline the @lo and @hi functions, simplify the resulting
>>>>> expression, and then check if the result is a ConstantInt. And
>>>>> remember that in general these functions can be arbitrarily
>>>>> complex.
>>>>
>>>> I agree; we'd need to use SCEV or some other heavyweight
>>>> mechanism to do the analysis. In some sense, however, that would
>>>> be the price of generality. On the other hand, I see no reason
>>>> why we could not write a utility function that could accomplish
>>>> all of that, so we'd only need to work out the details once.
>>>
>>> SCEV is not the answer here. You just want to know if the result
>>> of a function is constant given a set of parameters. Inlining +
>>> simplifications should do it. But doing an inlining trial is
>>> expensive.
>>
>> The hi/lo functions could be declared always_inline. Thus they will
>> always be inlined, either by the always-inliner pass or the usual
>> one. You would need to insert the instrumentation code or whatever
>> that uses hi/lo before any inliner runs, and run optimizations such
>> as turning objectsize into a constant after the inliner runs.

> The semantics of the objectsize intrinsic is that it returns a
> constant value if it can figure out the object size, and returns 0/-1
> otherwise. So you cannot simply inline the functions and hope for the
> best. You need to run an inline trial: inline; try to fold the
> resulting expression into a constant; remove the inlined code if it
> didn't fold to a constant. You may say this is the price of
> generality. I don't know how slow it would be, though.

My thought when proposing this mechanism was that DCE would eliminate
any unneeded instructions added by inlining.

> Today, after playing around with these things, I found another
> problem: inlining functions with this alloc metadata. Assuming that
> we attach the metadata to call sites in the front-end, if the
> function later gets inlined, then the metadata is lost. We can,
> however, allow the metadata to be attached to arbitrary instructions,
> so that the inliner can be taught to attach it to the returned
> expression.

I don't understand how this would work exactly. Can you explain? Would
it be better to do the instrumentation prior to inlining?

Thanks again,
Hal

Hi Hal,

> To do bounds checking you need two things: First you need to know the
> bounds (this requires tracking calls to allocation functions), and
> then you need to look at memory accesses. My guess is that running
> the analysis late helps much more with the second part than with the
> first. So I would split this into two pieces. Prior to inlining, add
> whatever is necessary around each call site so that you get the
> bounds data that you need. You can tag these resulting values so that
> they're easily recognizable to the later parts of the analysis (you
> might need to artificially make these 'used' so that DCE won't get
> rid of them).

in that case, why not have the front-end do this part? I mean, rather
than the front-end outputting hi/lo functions and metadata so that some
LLVM pass can insert a few markers or whatever around/on call-sites that
a later LLVM pass recognizes, why not have the front-end insert those
"markers" directly?

> Then, after more cleanup has been done by other optimization passes,
> run the pass that instruments the memory accesses (then DCE anything
> that you did not end up actually using).

Ciao, Duncan.

> Hi Hal,
>
>> To do bounds checking you need two things: First you need to know
>> the bounds (this requires tracking calls to allocation functions),
>> and then you need to look at memory accesses. My guess is that
>> running the analysis late helps much more with the second part than
>> with the first. So I would split this into two pieces. Prior to
>> inlining, add whatever is necessary around each call site so that
>> you get the bounds data that you need. You can tag these resulting
>> values so that they're easily recognizable to the later parts of
>> the analysis (you might need to artificially make these 'used' so
>> that DCE won't get rid of them).

> in that case, why not have the front-end do this part? I mean, rather
> than the front-end outputting hi/lo functions and metadata so that
> some LLVM pass can insert a few markers or whatever around/on
> call-sites that a later LLVM pass recognizes, why not have the
> front-end insert those "markers" directly?

So long as this does not violate the "don't pay for what you don't
use" rule, I don't see any reason why not.

-Hal

Hi Hal,

>> Hi Hal,
>>
>>> To do bounds checking you need two things: First you need to know
>>> the bounds (this requires tracking calls to allocation functions),
>>> and then you need to look at memory accesses. My guess is that
>>> running the analysis late helps much more with the second part than
>>> with the first. So I would split this into two pieces. Prior to
>>> inlining, add whatever is necessary around each call site so that
>>> you get the bounds data that you need. You can tag these resulting
>>> values so that they're easily recognizable to the later parts of
>>> the analysis (you might need to artificially make these 'used' so
>>> that DCE won't get rid of them).

>> in that case, why not have the front-end do this part? I mean, rather
>> than the front-end outputting hi/lo functions and metadata so that
>> some LLVM pass can insert a few markers or whatever around/on
>> call-sites that a later LLVM pass recognizes, why not have the
>> front-end insert those "markers" directly?

> So long as this does not violate the "don't pay for what you don't
> use" rule, I don't see any reason why not.

the question then arises of what those "markers" should be, and kind of brings
things full circle to Nuno's original suggestion of calculating the hi/lo bounds
explicitly just before the call to wonder_malloc, and sticking metadata on the
call to say "this value holds the lower bound and this one the upper". The
problem with that is presumably that the optimizers will just zap the apparently
unused hi/lo values.
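I.e. something like this (sketch; !bounds is a made-up name):

  %hi = add i32 %n, 0                 ; upper bound, computed explicitly
  %r = call i8* @wonder_malloc(i32 %n), !bounds !{i32 0, i32 %hi}

where the metadata is %hi's only user, so it looks dead to the
optimizers.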

Ciao, Duncan.

> Hi Hal,
>
>>
>>> Hi Hal,
>>>
>>>> To do bounds checking you need two things: First you need to know
>>>> the bounds (this requires tracking calls to allocation functions),
>>>> and then you need to look at memory accesses. My guess is that
>>>> running the analysis late helps much more with the second part
>>>> than with the first. So I would split this into two pieces. Prior
>>>> to inlining, add whatever is necessary around each call site so
>>>> that you get the bounds data that you need. You can tag these
>>>> resulting values so that they're easily recognizable to the later
>>>> parts of the analysis (you might need to artificially make these
>>>> 'used' so that DCE won't get rid of them).
>>>
>>> in that case, why not have the front-end do this part? I mean,
>>> rather than the front-end outputting hi/lo functions and metadata
>>> so that some LLVM pass can insert a few markers or whatever
>>> around/on call-sites that a later LLVM pass recognizes, why not
>>> have the front-end insert those "markers" directly?
>>
>> So long as this does not violate the "don't pay for what you don't
>> use" rule, I don't see any reason why not.

> the question then arises of what those "markers" should be, and kind
> of brings things full circle to Nuno's original suggestion of
> calculating the hi/lo bounds explicitly just before the call to
> wonder_malloc, and sticking metadata on the call to say "this value
> holds the lower bound and this one the upper". The problem with that
> is presumably that the optimizers will just zap the apparently unused
> hi/lo values.

I think this depends on when the passes run; looking at
PassManagerBuilder::populateFunctionPassManager it seems that the DCE
passes are all scheduled at the end. On the other hand, I recall that
other earlier passes also do some amount of DCE as well. That being the
case, if there is a problem, we could develop some way to mark the
relevant values as 'used' and then remove those tags just prior to
running the real DCE passes at the end of the sequence.

-Hal

Hi,

So here is a new proposal:

!0 = metadata !{ alloc_size_fn, offset_fn, parameters* }

alloc_size_fn and offset_fn are functions that return either i32/i64 depending on the platform, and they must have the same number of arguments (not necessarily the same as the allocation function). The parameters are given in the metadata as well.
To accommodate the common case, offset_fn can be null, meaning a zero offset.

The usage would be something like this:

%r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
!0 = metadata !{ i32 (i32)* @size, null, i32 %n }

Even if my_realloc() gets inlined later, the metadata can still be applied to the returned value (since it is not really specific to a call site). Of course some parameters of the allocation function may be deleted if the function gets inlined (i.e., nulled in the metadata), but I don't think we can work around that problem. This is a best-effort approach, anyway.
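For an allocator that returns a pointer into the middle of the block,
both functions would be named, e.g. (hypothetical):

  %r = call i8* @my_weird_alloc(i32 %n), !alloc_size !1
  !1 = metadata !{ i32 (i32)* @size, i32 (i32)* @offset, i32 %n }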

To avoid these functions being removed, I propose a new linkage type. Something like internal_metadata (or hopefully a better name). This linkage would mean that a function can only be removed in codegen, and if it has no users. The difference from internal linkage is that internal functions with no users can be deleted at any time.

So, what do you think about this new proposal? I guess it addresses all issues raised so far.

Thanks,
Nuno

Are the parameters a separate metadata array, or is the alloc_size
metadata variable length?

You'll probably want to write up some docs for the website on how this
is supposed to be laid out and work.

-eric

Quoting Eric Christopher <echristo@apple.com>:

>> So here is a new proposal:
>>
>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }

> Are the parameters a separate metadata array, or is the alloc_size
> metadata variable length?

Variable length. I think that's the simplest solution, and the additional overhead of a separate array is probably not worth it.
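E.g., for a calloc-like allocator taking two parameters (hypothetical):

  %r = call i8* @my_calloc(i32 %count, i32 %elem), !alloc_size !0
  !0 = metadata !{ i32 (i32, i32)* @size2, null, i32 %count, i32 %elem }

with @size2 returning the product of its two arguments.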

> You'll probably want to write up some docs for the website on how this
> is supposed to be laid out and work.

Sure!

Nuno

Fair enough. I'd probably go the other way, but it's not a big deal.

-eric

Hi,

> So here is a new proposal:
>
> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>
> alloc_size_fn and offset_fn are functions that return either i32/i64
> depending on the platform, and they must have the same number of
> arguments (not necessarily the same as the allocation function).
> The parameters are given in the metadata as well.
> To accommodate the common case, offset_fn can be null, meaning it is
> a zero offset.
>
> The usage would be something like this:
>
> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }

> Even if my_realloc() gets inlined later, the metadata can still be
> applied to the returned value (since it is not really specific to a
> call site). Of course some parameters of the allocation function may
> be deleted if the function gets inlined (i.e., nulled in the
> metadata), but I don't think we can work around that problem. This is
> a best-effort approach, anyway.
>
> To avoid these functions being removed, I propose a new linkage
> type. Something like internal_metadata (or hopefully a better name).
> This linkage would mean that a function can only be removed in
> codegen, and if it has no users. The difference from internal linkage
> is that internal functions with no users can be deleted at any time.

Is it possible to determine which functions are referenced in metadata
even though the metadata is not listed as a user? It seems like we
could do this without defining another linkage class.

> So, what do you think about this new proposal? I guess it addresses
> all issues raised so far.

I think this is good.

-Hal

Hi Nuno,

> So here is a new proposal:
>
> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>
> alloc_size_fn and offset_fn are functions that return either i32/i64
> depending on the platform, and they must have the same number of
> arguments (not necessarily the same as the allocation function).
> The parameters are given in the metadata as well.
> To accommodate the common case, offset_fn can be null, meaning it is a
> zero offset.
>
> The usage would be something like this:
>
> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }

suppose the size is %n+4. Then in the function you have to compute
   %s = add i32 %n, 4
and then put %s in the metadata
   !0 = metadata !{ i32 (i32)* @size, null, i32 %s }
However then the only use of %s is in the metadata, so basically any
optimization pass will zap it, right? So won't this only work if the
size etc. calculation doesn't actually require any computation, e.g. it
would work if you only pass function parameters and globals, but that's
about it. Am I missing something?

Ciao, Duncan.

>> So here is a new proposal:
>>
>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>>
>> alloc_size_fn and offset_fn are functions that return either i32/i64
>> depending on the platform, and they must have the same number of
>> arguments (not necessarily the same as the allocation function).
>> The parameters are given in the metadata as well.
>> To accommodate the common case, offset_fn can be null, meaning it is a
>> zero offset.
>>
>> The usage would be something like this:
>>
>> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
>> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }

> suppose the size is %n+4. Then in the function you have to compute
>    %s = add i32 %n, 4
> and then put %s in the metadata
>    !0 = metadata !{ i32 (i32)* @size, null, i32 %s }
> However then the only use of %s is in the metadata, so basically any
> optimization pass will zap it, right? So won't this only work if the
> size etc. calculation doesn't actually require any computation, e.g. it
> would work if you only pass function parameters and globals, but that's
> about it. Am I missing something?

Actually that's not what I had in mind.
Metadata will take %n (and not %s) as the parameter for the @size function, and then @size(%n) itself returns %n+4. So no computations are supposed to be added to the caller of an allocation function. The metadata should only reference the parameters given to the allocation function. If the function is not inlined, then there is no risk of these parameters disappearing. If it is inlined, the probability is low, because you will most likely need the original parameters to compute the allocated size in the inlined function as well.
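I.e. (sketch):

  define i32 @size(i32 %n) {
    %s = add i32 %n, 4   ; the arithmetic lives here, not at the call site
    ret i32 %s
  }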

Nuno

Hi,

>> So here is a new proposal:
>>
>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>>
>> alloc_size_fn and offset_fn are functions that return either i32/i64
>> depending on the platform, and they must have the same number of
>> arguments (not necessarily the same as the allocation function).
>> The parameters are given in the metadata as well.
>> To accommodate the common case, offset_fn can be null, meaning it is
>> a zero offset.
>>
>> The usage would be something like this:
>>
>> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
>> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }
>>
>> Even if my_realloc() gets inlined later, the metadata can still be
>> applied to the returned value (since it is not really specific to a
>> call site). Of course some parameters of the allocation function may
>> be deleted if the function gets inlined (i.e., nulled in the
>> metadata), but I don't think we can work around that problem. This is
>> a best-effort approach, anyway.
>>
>> To avoid these functions being removed, I propose a new linkage
>> type. Something like internal_metadata (or hopefully a better name).
>> This linkage would mean that a function can only be removed in
>> codegen, and if it has no users. The difference from internal linkage
>> is that internal functions with no users can be deleted at any time.

> Is it possible to determine which functions are referenced in metadata
> even though the metadata is not listed as a user? It seems like we
> could do this without defining another linkage class.

Right now, no.
We would need a map from Function* to metadata, since we don't really want to scan the whole metadata every time we want to delete a function.
So, I don't know... I'm pretty agnostic to either solution.

Nuno

>>> Hi,
>>>
>>> So here is a new proposal:
>>>
>>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>>>
>>> alloc_size_fn and offset_fn are functions that return either
>>> i32/i64 depending on the platform, and they must have the same
>>> number of arguments (not necessarily the same as the allocation
>>> function). The parameters are given in the metadata as well.
>>> To accommodate the common case, offset_fn can be null, meaning it is
>>> a zero offset.
>>>
>>> The usage would be something like this:
>>>
>>> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
>>> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }
>>>
>>> Even if my_realloc() gets inlined later, the metadata can still be
>>> applied to the returned value (since it is not really specific to a
>>> call site). Of course some parameters of the allocation function
>>> may be deleted if the function gets inlined (i.e., nulled in the
>>> metadata), but I don't think we can work around that problem. This
>>> is a best-effort approach, anyway.
>>>
>>> To avoid these functions being removed, I propose a new linkage
>>> type. Something like internal_metadata (or hopefully a better
>>> name). This linkage would mean that a function can only be removed
>>> in codegen, and if it has no users. The difference from internal
>>> linkage is that internal functions with no users can be deleted
>>> at any time.
>>
>> Is it possible to determine which functions are referenced in
>> metadata even though the metadata is not listed as a user? It seems
>> like we could do this without defining another linkage class.

> Right now, no.
> We would need a map from Function* to metadata, since we don't really
> want to scan the whole metadata every time we want to delete a
> function. So, I don't know... I'm pretty agnostic to either solution.

Fair enough. I'd prefer a more descriptive name if we use a linkage
class; how about metadata_referenced or referenced_by_metadata or
used_by_metadata, etc. (maybe with the internal_ prefix as well)?

If there is an efficient way to maintain the map, I think that would be
better.

Thanks again,
Hal

Nuno Lopes wrote:

>>> So here is a new proposal:
>>>
>>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>>>
>>> alloc_size_fn and offset_fn are functions that return either i32/i64
>>> depending on the platform, and they must have the same number of
>>> arguments (not necessarily the same as the allocation function).
>>> The parameters are given in the metadata as well.
>>> To accommodate the common case, offset_fn can be null, meaning it is a
>>> zero offset.
>>>
>>> The usage would be something like this:
>>>
>>> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
>>> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }

>> suppose the size is %n+4. Then in the function you have to compute
>>     %s = add i32 %n, 4
>> and then put %s in the metadata
>>     !0 = metadata !{ i32 (i32)* @size, null, i32 %s }
>> However then the only use of %s is in the metadata, so basically any
>> optimization pass will zap it, right? So won't this only work if the
>> size etc. calculation doesn't actually require any computation, e.g. it
>> would work if you only pass function parameters and globals, but that's
>> about it. Am I missing something?

> Actually that's not what I had in mind.
> Metadata will take %n (and not %s) as the parameter for the @size
> function, and then @size(%n) itself returns %n+4.

Just curious, what happens when @size is a declaration or a materializable function? Or what should happen?

> So no computations are supposed to be
> added to the caller of an allocation function. The metadata should only
> reference the parameters given to the allocation function. If the function
> is not inlined, then there is no risk of these parameters disappearing. If
> it is inlined, the probability is low, because you will most likely
> need the original parameters to compute the allocated size in the inlined
> function as well.

For as long as the optimizations manage to RAUW the relevant Values, the metadata will stay up to date.

Nick

Quoting Nuno Lopes <nunoplopes@sapo.pt>:

>>> So here is a new proposal:
>>>
>>> !0 = metadata !{ alloc_size_fn, offset_fn, parameters* }
>>>
>>> alloc_size_fn and offset_fn are functions that return either i32/i64
>>> depending on the platform, and they must have the same number of
>>> arguments (not necessarily the same as the allocation function).
>>> The parameters are given in the metadata as well.
>>> To accommodate the common case, offset_fn can be null, meaning it is a
>>> zero offset.
>>>
>>> The usage would be something like this:
>>>
>>> %r = call i8* @my_realloc(i32* %ptr, i32 %n), !alloc_size !0
>>> !0 = metadata !{ i32 (i32)* @size, null, i32 %n }

>> suppose the size is %n+4. Then in the function you have to compute
>>    %s = add i32 %n, 4
>> and then put %s in the metadata
>>    !0 = metadata !{ i32 (i32)* @size, null, i32 %s }
>> However then the only use of %s is in the metadata, so basically any
>> optimization pass will zap it, right? So won't this only work if the
>> size etc. calculation doesn't actually require any computation, e.g. it
>> would work if you only pass function parameters and globals, but that's
>> about it. Am I missing something?

> Actually that's not what I had in mind.
> Metadata will take %n (and not %s) as the parameter for the @size
> function, and then @size(%n) itself returns %n+4. So no computations
> are supposed to be added to the caller of an allocation function. The
> metadata should only reference the parameters given to the allocation
> function. If the function is not inlined, then there is no risk of
> these parameters disappearing. If it is inlined, the probability is
> low, because you will most likely need the original parameters to
> compute the allocated size in the inlined function as well.
>
> Nuno

Ok, so please find attached a patch that adds the alloc attribute.
Still to follow: the discussion of a new linkage type, refactoring of the memory builtins analysis, clang support, etc.

Nuno

alloc_metadata.diff (8.12 KB)