[PATCH] Handle tied sub-operands in AsmMatcherEmitter

Hi Ulrich,

Thank you for looking at this. Apologies again for taking unjustifiably long to get back to you. This is really good stuff and I very much want to see this go in. I like it enough I’m going to try to talk you into doing even more work on improving this code. :wink:

Fair warning up front: You’re digging into some pretty fundamental problems in how the assemblers and code generators like to view instructions. As you correctly observe, we don’t really have very good solutions in place right now, just a bunch of hacks that get us where we need to go. I’d love to fix that, and this patch is a step in that direction. So I’m going to give a bit of a brain-dump here about a few different things. Adding llvmdev as well, since we’re veering into more high level design territory

Jakob,

this is another TableGen patch related to handling pre-inc instruction
which I need as prerequisite for the PowerPC asm parser.

The problem this patch attempts to fix is the handling of register tie
constraints in AsmMatcherEmitter, where one operand is tied to a
sub-operand of another operand. The typical scenario for this to happen is
the tie between the “write-back” register of a pre-inc instruction, and the
base register sub-operand of the memory address operand of that
instruction.

Now from reading the code, it seems that this scenario wasn’t ever supposed
to be properly supported by AsmMatcherEmitter. In fact, there seems to be
code that at one time may have completely rejected such instructions, but
now simply emits some warnings (and only when using debug mode). However,
even while those instructions may no longer be rejected, the code that is
generated doesn’t seem to be correct either.

It’s a bit outside the scope of what the matcher really wants to deal with, yes.

One underlying cause of the problem is the way tied operands are handled:
If operand I is tied to J (where I < J), then the actual operand is emitted
when at location I, and when at location J a copy of what is already at
location I is emitted. To ensure this can always be done,
buildInstructionOperandReference “canonicalizes” the assembler string such
that the “destination” operand (with smaller number) is always visited
first:

// If the named operand is tied, canonicalize it to the untied operand.
// For example, something like:
// (outs GPR:$dst), (ins GPR:$src)
// with an asmstring of
// “inc $src”
// we want to canonicalize to:
// “inc $dst”
// so that we know how to provide the $dst operand when filling in the
result.

This has the effect, in the example given above, that when generating the
output MI operand list, the assembler string operand originally found at
the location indicated by “$src” in the pattern will be placed in the MI
operand indicated by “$dst”, and later on, in the MI operand indicated by
“$src”, a copy of “$dst” is placed.

However, if the destination is only part of the source operand (e.g. with
a tie like “$wb = $mem.ptrreg”), this code currently also “canonicalizes”
the full $mem to $wb. This has the effect that in the output MI operand
list, in the place of “$wb” all the sub-operands indicated by “$mem” are
placed, and in the place of “$mem”, only a single sub-operand is copied.
This results in a wrong MI output list.

On the other hand, it is necessary to do this canonicalization, since the
MI operand list is created by appending one operand after the other, so it
is not possible to perform a “forward tie”: when encoding “$wb”, we’d have
to leave a slot open, and only after we’ve encoded “$mem”, we can then go
back and fill in the copy.

Yep.

One backend currently affected by this problem is ARM, where they work
around it by either providing custom matchers for pre-inc instructions, or
else just handling the whole instruction differently via special asm-parser
only instruction patterns.

You mean custom converters, not custom matchers, yes? The matcher is something completely different. I’m assuming you mean converters for the rest. If that’s not the case, please let me know and I’ll back up and re-think.

To avoid having to duplicate hacks like this, I’ve come up with the
attached patch, which fixes at least one part of the problem: if the
multi-part operand can be handled piecewise by the asm parser, we can still
do the “canonicalization” trick by moving up just the piece of the operand
that’s being tied. This is implemented by updating the canonicalization
logic in buildInstructionOperandReference, and modifying
buildInstructionOperandReference to allow ties to work on sub-operands.

There’s a few conflicting design goals here, and we don’t yet have a truly robust solution in place.

There are two core issues. First, the tied operands are an artifact of the way we do instruction selection for the compiler. They’re how we represent read-modify-write operands, basically. The assembler shouldn’t have to know or care about them at all. Specifically, they shouldn’t even be represented at all as an MCOperand. Rather, only the actual operand referenced by the instruction encoding should be an operand, and it should have an instruction-description notation that the operand is RMW. This means that the MI representation and the MCInst representations of the instructions will no longer have a one-to-one operand correspondence, so there will need to be logic to map between them for things like the MC lowering pass (the ill named AsmPrinter), the TargetInstrInfo layer which inherits from MCInstrInfo, and probably others. This is usually, in effect, what many of the ARM assembler aliases of the load/store instructions are doing manually. We just want to automate it.

Second, many (most?) of the time, the assembly parser would really like to talk about individual MCOperands only, not complex operands like memory addressing modes. We currently can’t do that very effectively, as you’ve encountered. Even at the MC layer, we still want to have a notion of “this collection of operands is a memory addressing mode,” but currently that concept is tightly coupled to the instruction parsing and printing. Mainly that’s constrained, ironically, due to the constant tokens (e.g., the ‘[‘ and ‘]’ in an ARM memory reference) that need to be checked, and when printed considered part of the operand. When parsing, we want to consider them as completely separate tokens that are trivially matchable as a plain literal token. When printing, however, we need to know that they’re part of the more generalized operand. My current thinking on this is that each complex operand would have an MIOperandInfo specifying the raw operands (just like now), but would also have an AsmFragment (name could probably stand some wordsmithing…) attribute which describes how to print the operand. The asm matcher code in tablegen can use these strings to flatten the assembler string used for both printing and matching from this, yet we still have the extra semantic information that this sub-string is all a memory address (so things like the annotated disassembly can still do their thing).

A hopefully clarifying example. Let’s say we have something like this operand for a reg+reg addressing mode.

def myMemOperand : … {

MIOperandInfo = (ops GPR:$base, GPR:$offset);
AsmFragment = “[$base, $offset]”;
}

When referenced in an instruction asm string something like so:

def … : … (ins GPR:$dst, myMemOperand:$addr)… {
let AsmString = “load $dst, $addr”;

}

TableGen can then combine those strings when building up for the asm matcher into:
“load $dst, [$addr.base, $addr.offset]"

To do that, TableGen will obviously need to construct up on a per-instruction basis the sub-operand references like “addr.base”. I think that should be pretty straightforward, though.

The printers don’t need to change at all if we don’t want them to. They can keep using the custom operand printing just like today and everything works fine. Later we can (perhaps should) change those, too, to use similar sub-string tricks with appropriate hooks for the annotated disassembly information. That can easily be a problem for another day, though, so long as the current printing stuff continues to work as-is.

Now, those changes unfortunately affect some of the other cases that still
cannot be implemented correctly, because the multi-part operand cannot be
handled piecewise (as is the case in the ARM back-end). In those, you now
see errors of the form “Instruction … has operand ‘$wb’ that doesn’t
appear in asm string.” This is because $wb is no longer incorrectly
canonicalized … If those instructions would have actually been used by
the asm parser, they would have generated wrong MI operand lists. However,
they aren’t (because the use custom matchers or are “shadowed” by
asm-parser only alias instructions), those errors really are spurious.

One of the cases is simple to fix: if we have a custom matcher, we don’t
need to run through buildInstructionOperandReference at all, since all the
matching is done by the custom matcher anyway. Thus we can just skip the
routine and avoid spurious errors.

However, the other case (where the instruction is shadowed by a asm-parser
only alias) cannot be detected automatically. To avoid a build break, the
patch demotes the FatalError to a simple Warning, which will now show up on
those handfull of ARM instruction patterns. To remove those warnings, we
could either mark the instructions as “isCodeGenOnly” or implement proper
custom matchers. I’d prefer to leave this up to the ARM back-end
maintainers, though …

The instructions that have an AsmParser alias should definitely be marked isCodeGenOnly. It’s a (probably harmless at the moment) bug if they’re not. Please go ahead and update them. That way the build will stay warning-clean after this goes in. I suggest doing that separately as an incremental patch before landing this one since it’s preparatory work, rather than being strictly part of this patch.

Does this approach look reasonable? I’d appreciate a review of the
patch …

It’s reasonable as a step towards improving things, yes. If you’re really interested in this area, I’d love to see some of the more aggressive refactoring talked about above get some traction.

Thanks again!

-Jim

Hi Jim,

Thank you for looking at this. Apologies again for taking
unjustifiably long to get back to you. This is really good stuff and
I very much want to see this go in. I like it enough I’m going to
try to talk you into doing even more work on improving this code. :wink:

Fair warning up front: You’re digging into some pretty fundamental
problems in how the assemblers and code generators like to view
instructions. As you correctly observe, we don’t really have very
good solutions in place right now, just a bunch of hacks that get us
where we need to go. I’d *love* to fix that, and this patch is a
step in that direction. So I’m going to give a bit of a brain-dump
here about a few different things. Adding llvmdev as well, since
we’re veering into more high level design territory

Thanks for the detailed reply, this really helps me understand
the "big picture" better!

One backend currently affected by this problem is ARM, where they work
around it by either providing custom matchers for pre-inc instructions,

or

else just handling the whole instruction differently via special

asm-parser

only instruction patterns.

You mean custom converters, not custom matchers, yes? The matcher is
something completely different. I’m assuming you mean converters for
the rest. If that’s not the case, please let me know and I’ll back
up and re-think.

Right, custom converters (i.e. AsmMatchConverter). Sorry for the
confusion with terminology ...

There are two core issues. First, the tied operands are an artifact
of the way we do instruction selection for the compiler. They’re how
we represent read-modify-write operands, basically. The assembler
shouldn’t have to know or care about them at all. Specifically, they
shouldn’t even be represented at all as an MCOperand. Rather, only
the actual operand referenced by the instruction encoding should be
an operand, and it should have an instruction-description notation
that the operand is RMW. This means that the MI representation and
the MCInst representations of the instructions will no longer have a
one-to-one operand correspondence, so there will need to be logic to
map between them for things like the MC lowering pass (the ill named
AsmPrinter), the TargetInstrInfo layer which inherits from
MCInstrInfo, and probably others. This is usually, in effect, what
many of the ARM assembler aliases of the load/store instructions are
doing manually. We just want to automate it.

I see. This could actually make things quite a bit simpler.

If I understand your point correctly, at the MCInst level, we don't
want multiple copies of the tied operand. In fact, would it make
sense to say that at the MCInst level, the list of operands we want
should correspond exactly to those named in the AsmString? The
rationale would be that any operand not named in the AsmString is
obviously not needed to emit the instruction as string, and thus
it clearly ought not be needed when emitting (or matching!) the
instruction in any other way either ...

Right now, we do have multiple copies, and the common TableGen
code attempts to fill them in (with copies of the operand value).
However, the ARM custom converters mentioned above actually do
not bother: instead, they simply fill those slots that do not
correspond to named operands in the AsmString with dummy entries
(constant 0). I can see now where this actually makes sense given
that those MCOperand entries never should be used for anything.

As a first step towards the goal of removing the duplicate operands
from MCInst, would it make sense to move the ARM approach to common
code: that is, have common code simply emit dummy entries for all
operand slots not named in the AsmString?

This would have several advantages:
- TableGen AsmMatcherEmitter would not have to track matching
  constraints at all, simplifying that code a bit
- it would work even in those cases where the back-end needs a
  custom matcher for multi-operand operands, where my current
  patch does not work
- most (all?) of the ARM custom converters could simply be
  removed, since the common code would do the same thing

At some later point, it should then be simple to completely remove
those dummy operands (and deal with issues like operand numbers no
longer matching those in MachineInstruction).

I've tried to put together a patch to explore this direction. As it
turns out, even this first step is not trivial to implement, since
back-ends do tend to use both operand slots of a tied operand on
occasion; this would need to be cleaned up first. However, if as
a first step towards the first step we emit dummy operands only for
tied *sub*-operands (i.e. the case that is currently completely
broken in common code anyway), everything seems to work.

The attached patch implements this; it is in fact a somewhat
simplified version of the orginial patch in this email thread.
Note that there is no longer any warning emitted if a named
operand is not found in the AsmString; instead we simply get
the dummy operand.

Second, many (most?) of the time, the assembly parser would really
like to talk about individual MCOperands only, not complex operands
like memory addressing modes.

[snip]

I agree that the approach you outline here looks reasonable,
and may be simplify parsing/matching complex operands. This
does look like a somewhat independent problem, however, and
I'd prefer to possibly addressing it as a second step later
on ...

The instructions that have an AsmParser alias should definitely be
marked isCodeGenOnly. It’s a (probably harmless at the moment) bug
if they’re not. Please go ahead and update them. That way the build
will stay warning-clean after this goes in. I suggest doing that
separately as an incremental patch before landing this one since
it’s preparatory work, rather than being strictly part of this patch.

Huh, that causes failures in the disassembler testsuite. Apparently,
the main patterns are used for code generation *and* the disassembler,
and separate patterns are used for the asm parser only. I didn't see
a way to mark a pattern for both codegen and disassembler, but not
the asm parser ...

It’s reasonable as a step towards improving things, yes. If you’re
really interested in this area, I’d love to see some of the more
aggressive refactoring talked about above get some traction.

Thanks again for the review!

Do you agree with the approach in this updated patch? If so, would
it be OK to commit as first step? Thanks!

(Or else, if you'd prefer me to check in the original version of
the patch, I'd certainly be happy to do so as well.)

(See attached file: diff-llvm-tblgen-tie)

Bye,
Ulrich

diff-llvm-tblgen-tie (4.57 KB)

I would like to add one more case here: Fixed register operands.

Some instructions, like x86's MUL and DIV, take operands in fixed registers. Currently, we handle that with COPY instructions to and from the fixed registers, but that is making code motion passes more complicated than they need to be. (Actually, they usually just run away when they see one of these instructions).

I would like to have MUL32r take two virtual register operands, one of them tied to the fixed register %EAX. Just like two-address instructions, it would be the register allocator's responsibility to satisfy the constraint. This would also make it possible to write proper isel patterns for MUL and DIV.

This doesn't need to happen right away, it is just a second case to consider of MI operands not corresponding directly to encoded MC operands.

/jakob

I'm wondering: is is not possible to handle this case by using a
register class containing just the one register?

Bye,
Ulrich

I think that would work too. Either way, you get MI operands that aren't encoded.

/jakob

Right. Well, in any case this would also seem to argue for the rule
I suggested in my other email: the MC operand list should consist of
exactly those operands that are named in the AsmString.

Bye,
Ulrich

Hi Jim,

Thank you for looking at this. Apologies again for taking
unjustifiably long to get back to you. This is really good stuff and
I very much want to see this go in. I like it enough I’m going to
try to talk you into doing even more work on improving this code. :wink:

Fair warning up front: You’re digging into some pretty fundamental
problems in how the assemblers and code generators like to view
instructions. As you correctly observe, we don’t really have very
good solutions in place right now, just a bunch of hacks that get us
where we need to go. I’d love to fix that, and this patch is a
step in that direction. So I’m going to give a bit of a brain-dump
here about a few different things. Adding llvmdev as well, since
we’re veering into more high level design territory

Thanks for the detailed reply, this really helps me understand
the “big picture” better!

One backend currently affected by this problem is ARM, where they work
around it by either providing custom matchers for pre-inc instructions,

or

else just handling the whole instruction differently via special

asm-parser

only instruction patterns.

You mean custom converters, not custom matchers, yes? The matcher is
something completely different. I’m assuming you mean converters for
the rest. If that’s not the case, please let me know and I’ll back
up and re-think.

Right, custom converters (i.e. AsmMatchConverter). Sorry for the
confusion with terminology …

There are two core issues. First, the tied operands are an artifact
of the way we do instruction selection for the compiler. They’re how
we represent read-modify-write operands, basically. The assembler
shouldn’t have to know or care about them at all. Specifically, they
shouldn’t even be represented at all as an MCOperand. Rather, only
the actual operand referenced by the instruction encoding should be
an operand, and it should have an instruction-description notation
that the operand is RMW. This means that the MI representation and
the MCInst representations of the instructions will no longer have a
one-to-one operand correspondence, so there will need to be logic to
map between them for things like the MC lowering pass (the ill named
AsmPrinter), the TargetInstrInfo layer which inherits from
MCInstrInfo, and probably others. This is usually, in effect, what
many of the ARM assembler aliases of the load/store instructions are
doing manually. We just want to automate it.

I see. This could actually make things quite a bit simpler.

If I understand your point correctly, at the MCInst level, we don’t
want multiple copies of the tied operand. In fact, would it make
sense to say that at the MCInst level, the list of operands we want
should correspond exactly to those named in the AsmString? The
rationale would be that any operand not named in the AsmString is
obviously not needed to emit the instruction as string, and thus
it clearly ought not be needed when emitting (or matching!) the
instruction in any other way either …

Yep, exactly right. In practice, there may need to be exceptions made, but hopefully we’d be able to figure out better ways to model those so we could make this an invariant.

Right now, we do have multiple copies, and the common TableGen
code attempts to fill them in (with copies of the operand value).
However, the ARM custom converters mentioned above actually do
not bother: instead, they simply fill those slots that do not
correspond to named operands in the AsmString with dummy entries
(constant 0). I can see now where this actually makes sense given
that those MCOperand entries never should be used for anything.

As a first step towards the goal of removing the duplicate operands
from MCInst, would it make sense to move the ARM approach to common
code: that is, have common code simply emit dummy entries for all
operand slots not named in the AsmString?

Sure. There’s a bit of a caveat that the original motivation for this patch originally intersects here. The first operand in the list isn’t necessarily the “real” one. We can probably fix this by using your suggestion above and consider whichever one of the two is used in the asm string as the real one. If both are used, make that an error and fix the strings that do that.

This would have several advantages:

  • TableGen AsmMatcherEmitter would not have to track matching
    constraints at all, simplifying that code a bit
  • it would work even in those cases where the back-end needs a
    custom matcher for multi-operand operands, where my current
    patch does not work
  • most (all?) of the ARM custom converters could simply be
    removed, since the common code would do the same thing

At some later point, it should then be simple to completely remove
those dummy operands (and deal with issues like operand numbers no
longer matching those in MachineInstruction).

Yep. Sounds great.

I’ve tried to put together a patch to explore this direction. As it
turns out, even this first step is not trivial to implement, since
back-ends do tend to use both operand slots of a tied operand on
occasion; this would need to be cleaned up first. However, if as
a first step towards the first step we emit dummy operands only for
tied sub-operands (i.e. the case that is currently completely
broken in common code anyway), everything seems to work.

OK.

The attached patch implements this; it is in fact a somewhat
simplified version of the orginial patch in this email thread.
Note that there is no longer any warning emitted if a named
operand is not found in the AsmString; instead we simply get
the dummy operand.

Second, many (most?) of the time, the assembly parser would really
like to talk about individual MCOperands only, not complex operands
like memory addressing modes.

[snip]

I agree that the approach you outline here looks reasonable,
and may be simplify parsing/matching complex operands. This
does look like a somewhat independent problem, however, and
I’d prefer to possibly addressing it as a second step later
on …

Yes, independent. Just related, and we were on the general topic anyway, so… :slight_smile:

The instructions that have an AsmParser alias should definitely be
marked isCodeGenOnly. It’s a (probably harmless at the moment) bug
if they’re not. Please go ahead and update them. That way the build
will stay warning-clean after this goes in. I suggest doing that
separately as an incremental patch before landing this one since
it’s preparatory work, rather than being strictly part of this patch.

Huh, that causes failures in the disassembler testsuite. Apparently,
the main patterns are used for code generation and the disassembler,
and separate patterns are used for the asm parser only. I didn’t see
a way to mark a pattern for both codegen and disassembler, but not
the asm parser …

Oh drat. I always forget about the disassembler in this things. Sounds like we somewhat define away this issue w/ the above that removes the need for the new warning?

Thinking about this more, I’m not sure I was correct in stating these need to be isCodeGenOnly, honestly. These instructions are a bit of a mess. This patch makes at least some steps towards fixing that, which is great.

It’s reasonable as a step towards improving things, yes. If you’re
really interested in this area, I’d love to see some of the more
aggressive refactoring talked about above get some traction.

Thanks again for the review!

Do you agree with the approach in this updated patch? If so, would
it be OK to commit as first step? Thanks!

Yep, this looks great!

Thanks again for doing this.

-Jim

Perhaps we could/should have tablegen collect singleton registers like this and auto-generate the singleton register classes, instruction operands, etc? There might need to be some syntax in the (ins) and (outs) lists, too? Unclear. In general, I like being able to write instructions like this in the naive way and have tblgen figure out the rest.

-Jim