generated HWEncoding based register decoders

Is there any reason why we can't generate HWEncoding based decoders for registers for mc disassemblers?

This is a concept patch to explore wether it'd work, and for my target, it does the right thing. I have one case where I have to shift a field over 2 bits, but I handle that in the glue. If I had a HWEncoding encoding on a per register class basis, I could have made it work without the extra shift. I have 8 register classes across 5 register files, so the code can handle at least a little complexity.

I work at silicon design time, and we update registers like people wash their hair. Having the tool auto generate most all that I need saves me the time and complexity to keep these up-to-date as we change the design. I suspect others might find register decoding useful. Some ports are so trivial that they can use register class ordering to decode. Another possibility is to reject the patch as a bad idea, and deprecate HWEncoding entirely and recommend people write their register class in encoding order, and use other means to change the allocation order. That might be a better way to do this. I started out with HWEncoding based post, then discovered there was no decoder; maybe that was just wrong?

I'd be interested in hearing what others might recommend (ditch HWEncoding?), and if people find the direction the patch does, in collaborating with someone if they want to finish off the patch. Open questions, always generate tables, even if a target doesn't use HWEncoding? Generate by default, or hide all the new routines unless someone asks for them? Move the code into the Disassembler generator out of the RegisterInfo generator?

Thoughts?

gendecode.diffs.txt (3.85 KB)

I suspect that, for many targets, this is possible. It is just that no one has done the work to make this happen. So that we’re on the same page, I believe you’re talking about getting rid of tables like this from lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp: // FIXME: These can be generated by TableGen from the existing register // encoding values! static const unsigned CRRegs[] = { PPC::CR0, PPC::CR1, PPC::CR2, PPC::CR3, PPC::CR4, PPC::CR5, PPC::CR6, PPC::CR7 }; static const unsigned CRBITRegs[] = { PPC::CR0LT, PPC::CR0GT, PPC::CR0EQ, PPC::CR0UN, PPC::CR1LT, PPC::CR1GT, PPC::CR1EQ, PPC::CR1UN, PPC::CR2LT, PPC::CR2GT, PPC::CR2EQ, PPC::CR2UN, PPC::CR3LT, PPC::CR3GT, PPC::CR3EQ, PPC::CR3UN, PPC::CR4LT, PPC::CR4GT, PPC::CR4EQ, PPC::CR4UN, PPC::CR5LT, PPC::CR5GT, PPC::CR5EQ, PPC::CR5UN, PPC::CR6LT, PPC::CR6GT, PPC::CR6EQ, PPC::CR6UN, PPC::CR7LT, PPC::CR7GT, PPC::CR7EQ, PPC::CR7UN }; … I’m against using implicit ordered in the TableGen file to generate the hardware encoding numbers. This would lead to unnecessary gymnastics assuming it were usable at all. Many targets adjust the allocation order anyway for a variety of reasons, so I’d tend to view this as an orthogonal concern. Moving the code to the disassembler generator makes sense, although at least on PowerPC, we have equivalent tables in the AsmParser and we should eliminate them in both places. I recommend only generating (or including) the tables upon request. One way to this is to add a flag variable to InstrInfo (declared in include/llvm/Target/Target.td) like decodePositionallyEncodedOperands. Another way is to add a flag to each register class indicating whether or not a table should be generated for that class. Another way is to always generate the tables for each register class for which the HWEncoding uniquely identifies each register, and surround those tables with some ifdef (like we do with GET_INSTRMAP_INFO). -Hal

I suspect that, for many targets, this is possible. It is just that no one has done the work to make this happen.

Ah, ok, just surprising to me then, as I find it more palatable to write the generator than to produce the list.

So that we're on the same page, I believe you're talking about getting rid of tables like this from lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp:

// FIXME: These can be generated by TableGen from the existing register
// encoding values!

static const unsigned CRRegs = {
  PPC::CR0, PPC::CR1, PPC::CR2, PPC::CR3,
  PPC::CR4, PPC::CR5, PPC::CR6, PPC::CR7
};

Yeah, my code does this for free:

extern const uint16_t PPCRegDecodingTable_CRRC = {
  /* [0] = */ PPC::CR0,
  /* [1] = */ PPC::CR1,
  /* [2] = */ PPC::CR2,
  /* [3] = */ PPC::CR3,
  /* [4] = */ PPC::CR4,
  /* [5] = */ PPC::CR5,
  /* [6] = */ PPC::CR6,
  /* [7] = */ PPC::CR7,
};

static const unsigned CRBITRegs = {
  PPC::CR0LT, PPC::CR0GT, PPC::CR0EQ, PPC::CR0UN,
  PPC::CR1LT, PPC::CR1GT, PPC::CR1EQ, PPC::CR1UN,
  PPC::CR2LT, PPC::CR2GT, PPC::CR2EQ, PPC::CR2UN,
  PPC::CR3LT, PPC::CR3GT, PPC::CR3EQ, PPC::CR3UN,
  PPC::CR4LT, PPC::CR4GT, PPC::CR4EQ, PPC::CR4UN,
  PPC::CR5LT, PPC::CR5GT, PPC::CR5EQ, PPC::CR5UN,
  PPC::CR6LT, PPC::CR6GT, PPC::CR6EQ, PPC::CR6UN,
  PPC::CR7LT, PPC::CR7GT, PPC::CR7EQ, PPC::CR7UN
};

And this:

extern const uint16_t PPCRegDecodingTable_CRBITRC = {
  /* [0] = */ PPC::CR0LT,
  /* [1] = */ PPC::CR0GT,
  /* [2] = */ PPC::CR0EQ,
  /* [3] = */ PPC::CR0UN,
  /* [4] = */ PPC::CR1LT,
  /* [5] = */ PPC::CR1GT,
  /* [6] = */ PPC::CR1EQ,
  /* [7] = */ PPC::CR1UN,
  /* [8] = */ PPC::CR2LT,
  /* [9] = */ PPC::CR2GT,
  /* [10] = */ PPC::CR2EQ,
  /* [11] = */ PPC::CR2UN,
  /* [12] = */ PPC::CR3LT,
  /* [13] = */ PPC::CR3GT,
  /* [14] = */ PPC::CR3EQ,
  /* [15] = */ PPC::CR3UN,
  /* [16] = */ PPC::CR4LT,
  /* [17] = */ PPC::CR4GT,
  /* [18] = */ PPC::CR4EQ,
  /* [19] = */ PPC::CR4UN,
  /* [20] = */ PPC::CR5LT,
  /* [21] = */ PPC::CR5GT,
  /* [22] = */ PPC::CR5EQ,
  /* [23] = */ PPC::CR5UN,
  /* [24] = */ PPC::CR6LT,
  /* [25] = */ PPC::CR6GT,
  /* [26] = */ PPC::CR6EQ,
  /* [27] = */ PPC::CR6UN,
  /* [28] = */ PPC::CR7LT,
  /* [29] = */ PPC::CR7GT,
  /* [30] = */ PPC::CR7EQ,
  /* [31] = */ PPC::CR7UN,

I'm against using implicit ordered in the TableGen file to generate the hardware encoding numbers. This would lead to unnecessary gymnastics assuming it were usable at all.

Ports do that today. The problem with doing that is that it is easy for the encode/decode to be mismatched. I already did that and burned time discovering that. Lanai does this for example, as do others. I'd love to see people migrate away from that method to one that always just works as then a new port writter can copy any of the ports and never suffer the it doesn't just work problem I faced.

Moving the code to the disassembler generator makes sense, although at least on PowerPC, we have equivalent tables in the AsmParser and we should eliminate them in both places.

Hum. Curious. I don't feel so bad about generating them then in the RegisterInfo area, as it is just RegisterInfo.

I recommend only generating (or including) the tables upon request.

Done. I did this in a way so that a first time port creator can easily discover the data in the .inc file, and if they want it, it then is obvious how to get it.

I did a quick look through all the ports, and most can benefit from the tables if they want, as most ports have them.

Ok?

gendecode.diffs.txt (2.5 KB)

Hi Mike, I think tablegenning these tables would be a great idea - I
too was recently caught out, getting garbage disassembly because I was
indexing into the register class rather than a decoder table. Your
patch seems sensible, though someone with more tablegen experience can
give a more thorough review. even the backends where you can get away
with relying on the order of registers in the regclass (e.g. XCore),
probably shouldn't.

See http://llvm.org/docs/DeveloperPolicy.html#making-and-submitting-a-patch
for guidelines on submitting a patch for inclusion. It's not a
requirement, but submitting to phabricator can help to get reviewers
(see http://llvm.org/docs/Phabricator.html for guidance). Fee free to
add me as a reviewer (asb). It looks like Hal (hfinkel) has
volunteered(!) and you could also add the TableGen code owner
(stoklund). Having a test case would also help - there's relatively
little testing of the C++ that tablegen generates, but you might look
at test/TableGen/AsmVariant.td for an example.

Once we have generated decoder tables, the common case by far will
involve the exact same implementation of DecodeFooRegisterClass, which
perhaps suggests there's an opportunity to delete more code by relying
on a default or generated implementation unless it needs to be
overridden (e.g. SPARC DecodeIntPairRegisterClass).

Best,

Alex

I'm against using implicit ordered in the TableGen file to generate the hardware encoding numbers. This would lead to unnecessary gymnastics assuming it were usable at all.

Ports do that today. The problem with doing that is that it is easy for the encode/decode to be mismatched. I already did that and burned time discovering that. Lanai does this for example, as do others. I'd love to see people migrate away from that method to one that always just works as then a new port writter can copy any of the ports and never suffer the it doesn't just work problem I faced.

Moving the code to the disassembler generator makes sense, although at least on PowerPC, we have equivalent tables in the AsmParser and we should eliminate them in both places.

Hum. Curious. I don't feel so bad about generating them then in the RegisterInfo area, as it is just RegisterInfo.

I recommend only generating (or including) the tables upon request.

Done. I did this in a way so that a first time port creator can easily discover the data in the .inc file, and if they want it, it then is obvious how to get it.

I did a quick look through all the ports, and most can benefit from the tables if they want, as most ports have them.

Ok?

Hi Mike, I think tablegenning these tables would be a great idea - I
too was recently caught out, getting garbage disassembly because I was
indexing into the register class rather than a decoder table. Your
patch seems sensible, though someone with more tablegen experience can
give a more thorough review. even the backends where you can get away
with relying on the order of registers in the regclass (e.g. XCore),
probably shouldn't.

See http://llvm.org/docs/DeveloperPolicy.html#making-and-submitting-a-patch
for guidelines on submitting a patch for inclusion. It's not a
requirement, but submitting to phabricator can help to get reviewers
(see http://llvm.org/docs/Phabricator.html for guidance). Fee free to
add me as a reviewer (asb). It looks like Hal (hfinkel) has
volunteered(!) and you could also add the TableGen code owner
(stoklund).

Mike, are you interested in follow-up on this? It would indeed be easier to go the code review using reviews.llvm.org (our phabricator code-review system). I'd really like to have this functionality, but we should talk about the details. We probably want to collect the values in a map, for example, instead of having the N^2 loop.

Thanks again,
Hal

Mike, are you interested in follow-up on this?

I posted it to seek inclusion into llvm. That remains the case.

It would indeed be easier to go the code review using reviews.llvm.org (our phabricator code-review system).

No problem, posted it there.

I'd really like to have this functionality, but we should talk about the details.

Followups to the other thread. https://reviews.llvm.org/D30960