[M68k] CCR usage / instruction encoding

Hi,

I've been looking at migrating changes from my proof-of-concept Mega Drive fork
back into mainline LLVM. There are a fair few smaller pieces (a few extra
instruction implementations, some minor bug fixes), but the fork itself made some
large changes in order to get a working demo on the hardware proper.

Additionally, as it was a proof-of-concept (and I lack experience working on
LLVM backends), a lot of the original implementation is unlikely to have been
done in the best way.

I'm hesitant to try and migrate any of the smaller changes before some of the
larger questions are answered because it might result in redundant work.

The two biggest areas of concern I have are:

- CCR usage
  - At the moment, the M68k target doesn't actually support the 68000, because
    it relies on MOVE from CCR to save the CCR, an instruction that wasn't
    introduced until the 68010. (You can save the SR, which includes the CCR,
    but that is a privileged instruction on the 68010 and later.)
  - In my PoC for the Mega Drive, I disabled allocation of the CCR and had the
    code generate the CCR value in the instruction immediately preceding its
    use. This was flaky because the two instructions sometimes got split apart.

- Instruction encoding
  - This is the point of discussion of PR48792: Review options between using
    CodeBeads v.s. TSFlags (https://bugs.llvm.org/show_bug.cgi?id=48792)
  - The code-beads implementation is particularly painful for disassembly;
    even the landed disassembler feels messy
  - There is an instruction definition per set of operand permutations in
    the M68k backend, which makes manipulating the instructions in the
    backend outside of codegen quite difficult
  - I wrote up some thoughts in a previous message:
    https://lists.llvm.org/pipermail/llvm-dev/2021-February/148408.html
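
The CCR flakiness described in the first bullet can be illustrated with a small model. This is only a sketch, not code from the fork or from LLVM: the `Inst` record and `find_ccr_hazards` helper are hypothetical names, and the only hardware fact assumed is that a plain MOVE on the 68000 updates the condition codes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inst:
    name: str
    sets_ccr: bool = False                # instruction overwrites the flags
    reads_ccr_from: Optional[str] = None  # intended flag producer, if any


def find_ccr_hazards(block):
    """Return (reader, actual_writer) pairs where the flags a reader sees
    were produced by a different instruction than the one it expects."""
    hazards = []
    last_writer = None  # name of the most recent instruction that set the CCR
    for inst in block:
        if inst.reads_ccr_from is not None and inst.reads_ccr_from != last_writer:
            hazards.append((inst.name, last_writer))
        if inst.sets_ccr:
            last_writer = inst.name
    return hazards


# Producer and consumer are adjacent: the branch sees the compare's flags.
adjacent = [Inst("cmp", sets_ccr=True), Inst("beq", reads_ccr_from="cmp")]

# A later pass drops a MOVE in between; MOVE also sets the flags on the
# 68000, so the branch now tests the wrong condition codes.
split = [
    Inst("cmp", sets_ccr=True),
    Inst("move", sets_ccr=True),
    Inst("beq", reads_ccr_from="cmp"),
]

hazards_adjacent = find_ccr_hazards(adjacent)
hazards_split = find_ccr_hazards(split)
```

With the CCR not modelled as an allocatable register, nothing stops a pass from creating the second sequence, which matches the failure mode reported above.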

I'm happy to put some effort into either of these problems, though I could
definitely benefit from some guidance.

I'm not entirely sure what the process is from here. Prepare an RFC
for any major changes and start landing incrementally once they achieve
some kind of consensus?

Thanks,
Ricky Taylor

Hi Ricky,

I can’t comment on the m68k stuff directly, but I can give some pointers on the general consensus building framework.

First of all, you seem to have a pretty good idea on what the next steps are, so that’s half of the problem solved. M68k also has a pretty solid sub-community, which makes driving local changes much easier.

On consensus building…

IIUC, most of those questions are related to the target itself and not the generic code generation framework (sel-dag, legalization, MIR, etc).

The only one I remember being more generic is code-beads, which needed table-gen support. But adding a new table-gen back-end is less controversial than changing how sel-dag works, so it shouldn’t be very controversial.

I’m sure there will be dependencies between wider changes and local changes, and it’s up to the community how to divide up the RFCs and the work.

But when bringing RFCs to the attention of the wider community, it’s best if the sub-community is already in agreement on what they want to do and what the valid alternatives are (in case the first option doesn’t work).

So, I’d encourage you to start RFCs for the local changes targeting the sub-community, but using this list, so that other people can participate and give opinions.

Once there is consensus on the changes you can start another RFC to a wider audience, for example, to introduce code-beads into table-gen, with some alternatives.

All of the local changes that don’t need wider consensus can already be implemented in the target itself, approved by the local community, as long as it doesn’t change wider behaviour or break bots, etc.

Hopefully this is helpful in some way…

cheers,
–renato

Hi Ricky,

> - CCR usage
>   - At the moment, the M68k target doesn't actually support the 68000, because
>     it relies on MOVE from CCR to save the CCR, an instruction that wasn't
>     introduced until the 68010. (You can save the SR, which includes the CCR,
>     but that is a privileged instruction on the 68010 and later.)
>   - In my PoC for the Mega Drive, I disabled allocation of the CCR and had the
>     code generate the CCR value in the instruction immediately preceding its
>     use. This was flaky because the two instructions sometimes got split apart.

I haven’t looked into this problem TBH.

> - Instruction encoding
>   - This is the point of discussion of PR48792: Review options between using
>     CodeBeads v.s. TSFlags (https://bugs.llvm.org/show_bug.cgi?id=48792)
>   - The code-beads implementation is particularly painful for disassembly;
>     even the landed disassembler feels messy
>   - There is an instruction definition per set of operand permutations in
>     the M68k backend, which makes manipulating the instructions in the
>     backend outside of codegen quite difficult
>   - I wrote up some thoughts in a previous message:
>     https://lists.llvm.org/pipermail/llvm-dev/2021-February/148408.html

Sorry I missed your previous message on the mailing list. In addition to the TSFlags solution, I’ve recently been studying whether the tablegen-ed MCCodeEmitter (i.e. the -gen-emitter TG backend) can help with this problem. Though it’s mostly used by ISAs that have (sorta) fixed-size instructions, it doesn’t actually put any limit on instruction size, nor does it require a uniform size across all instructions.

What’s more, although M68k has variable-length instructions, their sizes are fairly regular, since they are always a multiple of a word (16 bits). More importantly, as you also mentioned in your February post, in most cases we can determine the addressing mode of an instruction operand by looking only at the first word. In other words, even if we can’t carry all of the encoding info using the aforementioned tablegen solution or TSFlags, we can at least use them to specify the encoding of the first word.
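
To make the "first word is enough" point concrete, here is a rough sketch of decoding an effective-address field. It assumes an instruction with a single EA in the low six bits of the opcode word (as TST has), covers only the 68000 addressing modes, and uses hypothetical helper names (`decode_ea`, `instruction_length`); it is not how the backend does it.

```python
# (mode, reg) -> (syntax, number of 16-bit extension words).
# For modes 0-6 the register field just selects a register, so it is keyed
# as None; for mode 7 the register field selects the addressing mode itself.
EA_MODES = {
    (0, None): ("Dn", 0),
    (1, None): ("An", 0),
    (2, None): ("(An)", 0),
    (3, None): ("(An)+", 0),
    (4, None): ("-(An)", 0),
    (5, None): ("(d16,An)", 1),
    (6, None): ("(d8,An,Xn)", 1),
    (7, 0): ("(xxx).W", 1),
    (7, 1): ("(xxx).L", 2),
    (7, 2): ("(d16,PC)", 1),
    (7, 3): ("(d8,PC,Xn)", 1),
}


def decode_ea(first_word, size_bytes):
    """Return (syntax, extension_words) for the EA in bits 5..0."""
    mode = (first_word >> 3) & 0b111
    reg = first_word & 0b111
    if mode == 7 and reg == 4:
        # Immediate: byte and word data take one extension word, long takes two.
        return "#imm", 2 if size_bytes == 4 else 1
    key = (mode, reg if mode == 7 else None)
    if key not in EA_MODES:
        raise ValueError(f"reserved EA encoding: mode={mode} reg={reg}")
    return EA_MODES[key]


def instruction_length(first_word, size_bytes):
    """Total length in bytes: the opcode word plus its extension words."""
    _, ext_words = decode_ea(first_word, size_bytes)
    return 2 + 2 * ext_words


# 0x4A6D is TST.W with mode=0b101, reg=0b101, i.e. TST.W (d16,A5).
tst_disp = decode_ea(0x4A6D, 2)
# 0x4AB9 is TST.L with mode=0b111, reg=0b001, i.e. TST.L (xxx).L.
tst_abs_long = decode_ea(0x4AB9, 4)
```

Nothing past the first word is inspected here, which is what would let a table keyed on the first word drive both the encoder and the disassembler for cases like this.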

Regarding the permutation problem… I also feel the pain in terms of maintenance, but I can’t really find any other solution based on the current tablegen design. To be fair, most of these permutations have been factored out using multiclass or class, which, IMO, is still readable for now. One really bold idea would be to address this problem in the tablegen language itself by adding some sort of multidimensional table that, for instance, has two dimensions (corresponding to the source and destination operands’ addressing modes) and puts the data (the complete instruction definition) in the third dimension.
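
As a loose illustration of the table idea (in Python rather than TableGen, and with made-up mode names and definition fields), a definition per (source mode, destination mode) cell could be generated from the two axes like this:

```python
from itertools import product

# Hypothetical axis labels; the real backend uses its own naming scheme.
SRC_MODES = ["dreg", "ari", "ari_disp"]
DST_MODES = ["dreg", "ari"]


def build_table(mnemonic, src_modes, dst_modes):
    """Map each (src, dst) cell to a complete (illustrative) definition."""
    return {
        (src, dst): {"name": f"{mnemonic}_{src}_{dst}", "src": src, "dst": dst}
        for src, dst in product(src_modes, dst_modes)
    }


move_table = build_table("MOVE", SRC_MODES, DST_MODES)
cell = move_table[("ari", "dreg")]
```

The two axes index the table and each cell holds the full definition, which is roughly the "third dimension" described above.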

That is indeed a pretty helpful summary, thanks! I'll try and keep to
that rough process.

Ricky,

> Hi Ricky,
> ...
>
> I haven't looked into this problem TBH.

Cool, in that case, let's worry about CCR allocation after at least
thinking through the instruction encoding.

> Regarding the permutation problem...I also feel the pain in terms of maintenance...

I'm still wondering if we can just represent EA operands as a single operand
type (or rather, one per operand size). We only really need one parameter per
operand instance, namely which modes are legal. The EA bits are encoded in
very few ways, so I suspect we could legitimately use some TSFlags bits for
that.

That way, we would only really be supporting a few broad cases of operand types:
- EA operands
- Immediate operands
- Register masks (and other special purpose operand types)
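
A minimal sketch of the "one EA operand type plus a legal-mode parameter" idea follows. The `Mode` flag set and `EAOperand` record are hypothetical names for illustration; the category masks follow the usual 68000 groupings (control, alterable, data), and the two example operands are my own picks rather than definitions from the backend.

```python
from dataclasses import dataclass
from enum import Flag, auto
from functools import reduce
from operator import or_


class Mode(Flag):
    DREG = auto()         # Dn
    AREG = auto()         # An
    ARI = auto()          # (An)
    ARI_POSTINC = auto()  # (An)+
    ARI_PREDEC = auto()   # -(An)
    ARI_DISP = auto()     # (d16,An)
    ARI_INDEX = auto()    # (d8,An,Xn)
    ABS_SHORT = auto()    # (xxx).W
    ABS_LONG = auto()     # (xxx).L
    PC_DISP = auto()      # (d16,PC)
    PC_INDEX = auto()     # (d8,PC,Xn)
    IMM = auto()          # #imm


ALL = reduce(or_, Mode)
CONTROL = (Mode.ARI | Mode.ARI_DISP | Mode.ARI_INDEX | Mode.ABS_SHORT
           | Mode.ABS_LONG | Mode.PC_DISP | Mode.PC_INDEX)
ALTERABLE = ALL ^ (Mode.PC_DISP | Mode.PC_INDEX | Mode.IMM)
DATA = ALL ^ Mode.AREG
DATA_ALTERABLE = DATA & ALTERABLE


@dataclass(frozen=True)
class EAOperand:
    size_bytes: int
    legal: Mode  # the single per-operand parameter: which modes are allowed

    def accepts(self, mode):
        return bool(mode & self.legal)


# Illustrative operands: LEA takes a control-mode source, CLR a
# data-alterable destination.
LEA_SRC = EAOperand(size_bytes=4, legal=CONTROL)
CLR_W_DST = EAOperand(size_bytes=2, legal=DATA_ALTERABLE)
```

For the twelve 68000 modes the mask fits in twelve bits (`ALL.value` is `0xFFF`), which is the kind of thing that could plausibly be packed into TSFlags.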

I don't think having a few permutations per instruction is bad; it's just that
with the EA operand types, we could end up with like 26^2 instruction
instances if an instruction had two EA operands. I'm not sure that's feasible
to solve even with multiclasses.

I've poked around the tablegen emitter code and it looks feasible to do a phased
migration with a small tweak to tablegen. We would then just need to check if we
have code beads defined for an instruction; if we don't, we can follow a new
tablegen-based code path.
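
The phased migration could look roughly like the dispatch below. Everything here is a stand-in: `CODE_BEADS`, `beads_encoder` and `table_encoder` are hypothetical placeholders for the existing code-beads emitter and a new tablegen-generated one, not real LLVM interfaces.

```python
def encode(inst, code_beads, beads_encoder, table_encoder):
    """Use the legacy code-beads path only while beads are still defined."""
    beads = code_beads.get(inst["opcode"])
    if beads:
        return beads_encoder(inst, beads)
    # No beads defined: the instruction has been migrated to the new path.
    return table_encoder(inst)


# Placeholder data and encoders, for illustration only.
CODE_BEADS = {"LEGACY_OP": [0x1, 0x2]}


def beads_encoder(inst, beads):
    return ("beads", inst["opcode"])


def table_encoder(inst):
    return ("tablegen", inst["opcode"])


legacy = encode({"opcode": "LEGACY_OP"}, CODE_BEADS, beads_encoder, table_encoder)
migrated = encode({"opcode": "MIGRATED_OP"}, CODE_BEADS, beads_encoder, table_encoder)
```

Migrating an instruction then amounts to deleting its beads and adding its table-driven encoding, so the two schemes can coexist until the last one is converted.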

I think I'm going to do some further investigation into this as a possible
solution. Give me a shout if there's something I've missed.

Thanks,
Ricky