Register Alias Sets

I notice that in X86GenRegisterInfo.inc, the AliasSets do not
include the register being queried. For example:

   const unsigned RAX_AliasSet = { X86::EAX, X86::AX, X86::AL, X86::AH,
0 };
   const unsigned EAX_AliasSet = { X86::RAX, X86::AX, X86::AL, X86::AH, 0 };

This makes it hard to do set comparisons. RAX and EAX really have
equivalent alias sets but one wouldn't know that from these definitions.

Is there a specific reason that these registers are left out? Would
things seriously break if they were added in?

I know RegAllocLinearScan.cpp does some checks for register aliasing
by looking for empty AliasSets. I imagine that'll get us into trouble.

                                 -Dave

I notice that in X86GenRegisterInfo.inc, the AliasSets do not
include the register being queried. For example:

const unsigned RAX_AliasSet = { X86::EAX, X86::AX, X86::AL, X86::AH,
0 };
const unsigned EAX_AliasSet = { X86::RAX, X86::AX, X86::AL,
X86::AH, 0 };

This makes it hard to do set comparisons. RAX and EAX really have
equivalent alias sets but one wouldn’t know that from these definitions.

Sure, but where these comparisons are needed, for example? RAX and EAX alias sets intersect and that’s enough to decide that these regs can’t be assigned simultaneously. In general, that what you would check anyway because for some architectures you won’t (theoretically, AFAIK) have equal alias sets at all, but they’ll intersect. Consider the following example (as I remember it from some paper on representing irregular architechtures):
There are 8 32-bit general purpose registers (R0-R7) and there are 7 64-bit registers (W0-W6). Ri reg is a high word of Wi and R(i+1) is a low word of Wi (i runs from 0 to 6). So the alias set for Wi is { Ri, R(i + 1)} (you can include Wi, of course). You can’t allocate neighbouring W-regs the same time as their alias sets intersect (but not equal).

Also consider saving space and one another minimalism issue: why return the queried reg in its alias set if you already know it?

Is there a specific reason that these registers are left out? Would
things seriously break if they were added in?

I think, that is how alias sets are more comfortable to use (at least in the existing LLVM code). Usually, one does smth special with one register and some other thing for every aliased reg. Probably, register allocator would be broken or at least there would be one unnecessary action each time smth’s done with alias set. For example, when you set some reg being used via a PhysRegTracker object, it automagically checks for every aliased reg and marks it being used, too. If the queried reg was included, PhysRegTracker could mark the specified reg twice and so on. Though the changes necessary to handle this don’t seem too difficult at first glance.

I know RegAllocLinearScan.cpp does some checks for register aliasing
by looking for empty AliasSets. I imagine that’ll get us into trouble.

Probably. As I already said it’d get us into a little overhead on each alias set operation (I mean without modifying its code, of course). I believe other allocators check alias sets as well (indirectly, may be).

However, I’m only a user of LLVM’s target definitions. May be, someone’ll give you more presice, correct and complete answer.

I notice that in X86GenRegisterInfo.inc, the AliasSets do not
include the register being queried. For example:

const unsigned RAX_AliasSet = { X86::EAX, X86::AX, X86::AL, X86::AH,
0 };
const unsigned EAX_AliasSet = { X86::RAX, X86::AX, X86::AL,
X86::AH, 0 };

This makes it hard to do set comparisons. RAX and EAX really have
equivalent alias sets but one wouldn’t know that from these definitions.

Does it really make it that hard to do set comparisons? The sets are themselves indexed by the register that the set applies to. If you have the index then you can explicitly add this to the set for your computation.

What do you mean by set comparisons? Why do you need to know that two registers have the same alias set?

The idea of these alias sets is to efficiently respond to the query "do these two register alias". Note that alias sets don't have many algebraic properties (like transitivity) that you might expect (e.g. AH aliases AX and AL aliases AX, but AH does not alias AL).

-Chris

Anton Vayvod wrote:

Sure, but where these comparisons are needed, for example? RAX and
EAX alias sets intersect and that's enough to decide that these regs
can't be assigned simultaneously.

One place these comparisons are used is to build a provably optimal
register class tree in a Smith/Ramsey/Holloway allocator. Building it
algorithmically is portable to all architectures without requiring
the person writing the machine-specific parts to care about what
the register allocator does.

In general, that what you would check anyway because for some
architectures you won't (theoretically, AFAIK) have equal alias sets
at all, but they'll intersect. Consider the following example (as I
remember it from some paper on representing irregular architechtures):

As Smith, et. al. point out, such architectures practically don't
exist in the real world. Most alias sets are either completely
disjoint, exactly equal or entirely contained.

There are 8 32-bit general purpose registers (R0-R7) and there are 7
64-bit registers (W0-W6). Ri reg is a high word of Wi and R(i+1) is
a low word of Wi (i runs from 0 to 6). So the alias set for Wi is {
Ri, R(i + 1)} (you can include Wi, of course). You can't allocate neighbouring W-regs the same time as their alias sets intersect (but
not equal).

Sure, the alias sets won't be equal, but that's not the point. The
point is that when they _are_ equal some very important register
allocation optimizations can be done.

BTW, is there an architecture out there that has register overlaps
like this? Maybe some DSP chip or something?

In any case I can work around the problem by building temporary
alias sets, but that's ugly and wasteful. It sounds like there
are too many dependencies on the current specification to make
changing it possible, though.

This might be something to think about for the future.

                                      -Dave

Christopher Lamb wrote:

Does it really make it that hard to do set comparisons? The sets are themselves indexed by the register that the set applies to. If you have the index then you can explicitly add this to the set for your computation.

By "hard" I didn't mean logically difficult. I meant that to do
so you've got to write a bunch of special-case code that really
just obscures what you're really trying to do.

                                 -Dave

Sure, but where these comparisons are needed, for example? RAX and
EAX alias sets intersect and that's enough to decide that these regs
can't be assigned simultaneously.

One place these comparisons are used is to build a provably optimal
register class tree in a Smith/Ramsey/Holloway allocator. Building it
algorithmically is portable to all architectures without requiring
the person writing the machine-specific parts to care about what
the register allocator does.

I strongly agree this is a good thing to do. Ideally the target description stuff has no idea what algorithms the code generator is using.

In any case I can work around the problem by building temporary
alias sets, but that's ugly and wasteful. It sounds like there
are too many dependencies on the current specification to make
changing it possible, though.

This sounds like you should just compute the sets you want (programatically) at the start of your pass. Then your algorithm uses that info instead. Why is this bad?

-Chris

Anton Vayvod wrote:

Sure, but where these comparisons are needed, for example? RAX and
EAX alias sets intersect and that’s enough to decide that these regs
can’t be assigned simultaneously.

One place these comparisons are used is to build a provably optimal
register class tree in a Smith/Ramsey/Holloway allocator. Building it
algorithmically is portable to all architectures without requiring
the person writing the machine-specific parts to care about what
the register allocator does.

I think Smith/Ramsey and Holloway compare alias sets of the whole register classes, not individual registers. So anyway you’d build alias sets of the register classes at the start of your allocator pass. And whether you add a register and its alias set to class’ alias set or you add just the alias set with one more register in it doesn’t make any difference (except one more line of code in the first case, of course).

LLVM represent target’s registers almost the same way Smith, et al. suggest except for the difference in alias(R) definition that you’ve mentioned above.

In general, that what you would check anyway because for some
architectures you won’t (theoretically, AFAIK) have equal alias sets
at all, but they’ll intersect. Consider the following example (as I
remember it from some paper on representing irregular
architechtures):

As Smith, et. al. point out, such architectures practically don’t
exist in the real world. Most alias sets are either completely
disjoint, exactly equal or entirely contained.

There are 8 32-bit general purpose registers (R0-R7) and there are 7
64-bit registers (W0-W6). Ri reg is a high word of Wi and R(i+1) is
a low word of Wi (i runs from 0 to 6). So the alias set for Wi is {
Ri, R(i + 1)} (you can include Wi, of course). You can’t allocate
neighbouring W-regs the same time as their alias sets intersect (but
not equal).

Sure, the alias sets won’t be equal, but that’s not the point. The
point is that when they are equal some very important register
allocation optimizations can be done.

The alias sets of R and W register classes are equal and again that is what Smith, et al. use in their work to build an optimal register class tree.

BTW, is there an architecture out there that has register overlaps
like this? Maybe some DSP chip or something?

Don’t know really :slight_smile: When I reviewed the Smith/Ramsey/Holloway paper it turned out my example was almost the same they used on the first figure. Though some other people mention unaligned register pairs (like if Wi weren’t single registers but register pairs). For example, Briggs writes about it when talks about coloring of register pairs.