RFC: To add __attribute__((regmask("preserve/clobbered list here"))) in clang

Hello Clang and LLVM Devs,

I have been working to add support for an attribute in clang and LLVM that helps
the user guide interprocedural register allocation (IPRA). But my use case is
very limited, and thus I believe it is good to have a discussion on this before
sending a patch.

So for IPRA we have a situation where a function is calling a function which is
written in assembly and is not defined in the current module. IPRA's scope is
limited to a module, so for such an externally defined function it uses the
default calling convention. But here, as the function is written in assembly,
the user can provide exact register usage details. So we decided to mark the
declaration of such a function with
__attribute__((regmask("clobbered list here"))) so LLVM can construct a regmask
out of it and use it with IPRA to improve register allocation.

For this purpose I added support for this attribute in clang; clang codegen
emits it as a target-dependent attribute and adds it to the declaration. IPRA
then constructs a regmask from it and uses it during optimization.
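
For illustration, a declaration carrying the proposed attribute might look like the sketch below. The attribute is only a proposal, so the macro expands to nothing unless the compiler reports support for it. The comma-separated register-list syntax, the macro name ASM_CLOBBERS, and the function asm_checksum are assumptions made up for this example, and the C body is only a stand-in for a routine that would really live in an assembly file.

```c
#include <stdint.h>

/* Hypothetical: regmask is the attribute proposed in this RFC and is not
   available in released compilers, so fall back to an empty macro. */
#if defined(__has_attribute)
#  if __has_attribute(regmask)
#    define ASM_CLOBBERS(list) __attribute__((regmask(list)))
#  endif
#endif
#ifndef ASM_CLOBBERS
#  define ASM_CLOBBERS(list)
#endif

/* Declaration as the caller's module would see it. The string is an
   assumed syntax for "this routine clobbers only rax, rcx and rdx". */
ASM_CLOBBERS("rax,rcx,rdx")
uint32_t asm_checksum(const uint8_t *buf, uint32_t len);

/* C stand-in for the assembly implementation so the sketch links and runs. */
uint32_t asm_checksum(const uint8_t *buf, uint32_t len) {
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; ++i)
        sum += buf[i];
    return sum;
}
```

With such a declaration, the caller could keep live values in every register not named in the list across the call, instead of assuming the default calling convention's clobbers.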

However, after discussion on IRC with Joerg Sonnenberger and jroelofs, we think
that this may affect other areas too. So it would be better to discuss this and
implement it in a way that we all agree upon.

Sincerely,
Vivek

Sorry, I forgot to add the clang mailing list.

-Vivek

This situation is actually far more common and not restricted to
assembly at all. There are a number of functions already that have
special ABIs with a much larger set of preserved registers. A typical
example is __tls_get_addr in many ABIs. At the moment, we need hacks in
the target-specific part of LLVM for handling this. Related (limited)
approaches for this are the preserve_most and preserve_all calling
conventions.

As mentioned in the IRC discussion, there are two important issues to be
considered here from my perspective.

(1) I really dislike an attribute providing a clobber list. Whether a
given register is clobbered or not is an implementation detail of a
specific version and can easily change. It is also something difficult
to reason about. The invariant that should be put into the ABI contract
is the inverse -- what registers a function is going to preserve. That
is even more important when looking at long-term ABI stability. New
registers are introduced every so often. That shouldn't change the
meaning of a declaration.

The main reason for using a clobber list seems to be a concern about
verbosity. I think that can be mostly avoided by allowing the use of
register classes in the specifier, e.g. all-fp for the i387 registers,
all-sse2 for the SSE2 register set, all-avx for the AVX registers etc.
At the same time, I consider a certain verbosity to be useful, since
ultimately, implementation and interface definition need to be carefully
compared.

(2) Should the attribute extend or replace the normal preserved
registers? Randomly clobbering registers is going to create all kinds of
fun issues with the backend assumptions. We already have such fun with
inline assembler. Extend-only semantics are much easier to support. They
can also be combined with a special CC with minimal default preservation
and well-defined meanings, e.g. for arguments passed in registers.
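
A toy model of points (1) and (2) together, as a rough sketch only: a preserve list is parsed from a comma-separated string that may name register classes such as all-sse2, and the result is OR-ed into the default preserved set so that the attribute can only extend it. The string syntax, the tiny register set, the one-bit-per-class encoding, and all identifiers here are assumptions for illustration, not an existing clang or LLVM interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t regset_t;

/* Toy register universe: a few GPRs plus one bit per register class. */
enum {
    REG_RAX = 1u << 0, REG_RCX = 1u << 1, REG_RDX = 1u << 2,
    REG_RBX = 1u << 3, REG_RBP = 1u << 4, REG_R12 = 1u << 5,
    CLASS_FP = 1u << 6, CLASS_SSE2 = 1u << 7, CLASS_AVX = 1u << 8,
    /* Subset of the registers the default CC already preserves. */
    DEFAULT_PRESERVED = REG_RBX | REG_RBP | REG_R12
};

static const struct { const char *name; regset_t bits; } kNames[] = {
    {"rax", REG_RAX}, {"rcx", REG_RCX}, {"rdx", REG_RDX},
    {"rbx", REG_RBX}, {"rbp", REG_RBP}, {"r12", REG_R12},
    {"all-fp", CLASS_FP}, {"all-sse2", CLASS_SSE2}, {"all-avx", CLASS_AVX},
};

/* Parse "name,name,..." into a register set.
   Returns 0 on success, -1 if a name is unknown or empty. */
int parse_preserve_list(const char *spec, regset_t *out) {
    regset_t set = 0;
    while (*spec) {
        size_t len = strcspn(spec, ",");
        int found = 0;
        for (size_t i = 0; i < sizeof kNames / sizeof kNames[0]; ++i) {
            if (strlen(kNames[i].name) == len &&
                strncmp(spec, kNames[i].name, len) == 0) {
                set |= kNames[i].bits;
                found = 1;
                break;
            }
        }
        if (!found)
            return -1;
        spec += len;
        if (*spec == ',')
            ++spec;
    }
    *out = set;
    return 0;
}

/* Extend-only semantics: the list can add to the default set, never shrink it. */
regset_t effective_preserved(regset_t listed) {
    return DEFAULT_PRESERVED | listed;
}
```

Because the listed set is only ever OR-ed in, a register class that is not mentioned keeps its default treatment; registers added to the ISA later would not change the meaning of an existing declaration.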

Joerg

> So for IPRA we have a situation where a function is calling a function
> which is written in assembly and is not defined in the current module.
> IPRA's scope is limited to a module, so for such an externally defined
> function it uses the default calling convention. But here, as the
> function is written in assembly, the user can provide exact register
> usage details. So we decided to mark the declaration of such a function
> with __attribute__((regmask("clobbered list here"))) so LLVM can
> construct a regmask out of it and use it with IPRA to improve register
> allocation.

> This situation is actually far more common and not restricted to
> assembly at all. There are a number of functions already that have
> special ABIs with a much larger set of preserved registers. A typical
> example is __tls_get_addr in many ABIs. At the moment, we need hacks in
> the target-specific part of LLVM for handling this. Related (limited)
> approaches for this are the preserve_most and preserve_all calling
> conventions.
>
> As mentioned in the IRC discussion, there are two important issues to be
> considered here from my perspective.
>
> (1) I really dislike an attribute providing a clobber list. Whether a
> given register is clobbered or not is an implementation detail of a
> specific version and can easily change. It is also something difficult
> to reason about. The invariant that should be put into the ABI contract
> is the inverse -- what registers a function is going to preserve. That
> is even more important when looking at long-term ABI stability. New
> registers are introduced every so often. That shouldn't change the
> meaning of a declaration.

Interestingly, your last point is the reason why I'd think a clobber list could be more appropriate for some cases: if I have a hand-written assembly function and it clobbers some registers, the fact that the client code enables AVX2 won't make my routine clobber the new registers.

Maybe a syntax with +/- could be used to express things like “all vector registers but these”.

> The main reason for using a clobber list seems to be a concern about
> verbosity. I think that can be mostly avoided by allowing the use of
> register classes in the specifier, e.g. all-fp for the i387 registers,
> all-sse2 for the SSE2 register set, all-avx for the AVX registers etc.
> At the same time, I consider a certain verbosity to be useful, since
> ultimately, implementation and interface definition need to be carefully
> compared.
>
> (2) Should the attribute extend or replace the normal preserved
> registers? Randomly clobbering registers is going to create all kinds of
> fun issues with the backend assumptions. We already have such fun with
> inline assembler. Extend-only semantics are much easier to support. They
> can also be combined with a special CC with minimal default preservation
> and well-defined meanings, e.g. for arguments passed in registers.

Agree.

Overall I’m unsure how much applicability this attribute feature will have in practice though, or if it is worth the complexity to support it.

> (1) I really dislike an attribute providing a clobber list. Whether a
> given register is clobbered or not is an implementation detail of a
> specific version and can easily change. It is also something difficult
> to reason about. The invariant that should be put into the ABI contract
> is the inverse -- what registers a function is going to preserve. That
> is even more important when looking at long-term ABI stability. New
> registers are introduced every so often. That shouldn't change the
> meaning of a declaration.

> Interestingly, your last point is the reason why I'd think a clobber
> list could be more appropriate for some cases: if I have a hand-written
> assembly function and it clobbers some registers, the fact that
> the client code enables AVX2 won't make my routine clobber the new registers.

I want to be as defensive about assumptions as possible. Mixing code
with different compilation options, inlining etc. can all have funny
side effects, so I think it is really important to avoid making
unnecessary and questionable assumptions. The error cases for a preserve
list and a clobber list are quite different. If the preserve list
is too large, at most it will cause a performance penalty. A clobber
list that isn't correct will result in miscompiles that are difficult
to track down. There is a lot of code that could benefit e.g. from a
mostly-preserve-all dprintf and the like. I think that's a valid use
case and shows that ABI implications should be kept in mind.

> Maybe a syntax with +/- could be used to express things like “all vector registers but these”.

Yes, there are a number of different ways to allow compact
specifications. The primary concern for me is that "vector register"
as a base class is replaced with the appropriate ISA register classes
(matching the ISA features that added them). But in general, I believe
even a full manual specification of a preserve list can be very compact
and not much larger than a clobber list, if at all.

Joerg

I think that no IPRA pass is required to use this information; we can do this earlier than IPRA. Since we have this information from the source file, we can use it to build the regmask for the relevant call instructions while generating MI. Thus register allocators will benefit automatically, even without IPRA.
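
As a rough sketch of what building such a regmask could look like, written in plain C rather than against the LLVM API: the mask is an array of 32-bit words with one bit per physical register, where a set bit means the register is preserved across the call (the convention LLVM's regmask operands follow). The register count, the register numbers, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target with 40 physical registers. */
enum { NUM_REGS = 40, MASK_WORDS = (NUM_REGS + 31) / 32 };

/* Start from "everything preserved" and clear the bit of each register
   named in the clobber list. Out-of-range numbers are ignored. */
void regmask_from_clobbers(uint32_t mask[MASK_WORDS],
                           const unsigned *clobbered, size_t n) {
    memset(mask, 0xFF, MASK_WORDS * sizeof mask[0]);
    for (size_t i = 0; i < n; ++i) {
        unsigned r = clobbered[i];
        if (r < NUM_REGS)
            mask[r / 32] &= ~(UINT32_C(1) << (r % 32));
    }
}

/* A register is clobbered by the call when its bit is clear. */
int regmask_clobbers(const uint32_t *mask, unsigned reg) {
    return !(mask[reg / 32] & (UINT32_C(1) << (reg % 32)));
}
```

A mask built this way could be attached to the call when it is lowered, so any register allocator that honours regmask operands would see the reduced clobber set.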

-Vivek