[X86] Memory folding tables in x86 backend

Hi all,

Memory fold tables: Two way mapping of each register-form instruction to its corresponding memory-form instruction. E.g. Mapping ‘X86::ADD32rr’ to ‘X86::ADD32rm’.

Few months ago I started an effort to auto-generate the X86 memory folding tables currently held under lib/Target/X86/X86InstInfo.cpp as huge manually-maintained static arrays.

Automating the tables would reduce the maintenance burden for developers to repeatedly update them with every new (foldable) instruction added to X86 backend, and would provide us with all the missing entries (mostly AVX512 instructions).

After considering different approaches for generating the tables, it was decided that a TableGen backend would fit our needs the best, using the encoding information of the instructions (from .td files) to apply the matching between pairs.

Our main assumption was that two forms of the same instruction (register & memory forms) must share the same encoding information except for the part responsible for “How are the parameters passed to the instruction”, where one version defines one of its parameters to be a register while the other version defines it to be a memory operand, and all other parameters passed the same way in both versions.

Throughout the development of the backend, some exceptions were discovered. Some instructions has different behavior in their register and memory forms, for example, “bt” instruction (bit-test), which in its register form the relevant bits of the register parameter are the first 4, 5 or 6 bits (depending on mode and register size) while in the memory-folded instruction, the relevant bits of the memory operand are the first 16, 32 or 64 bits.

A TableGen backend was created for this purpose and committed in https://reviews.llvm.org/rL304088, including a small list of exceptions for cases similar to the bit-test instruction above.

After the patch was committed, two more exceptions were found which led to reverting the patch (https://reviews.llvm.org/rL304762).

Looking for safe-alternatives that will assure us correctness of each added entry, we narrowed our options to the following 3:

1- Recommitting TableGen’s backend as “disable by default”:

a- prepare a full instruction list of the exceptions in X86 ISA.

b- Mark these instructions with special flag to indicate illegal folding.

c- Integrate the backend into llorg while disabling it by default.

d- Provide a compile-time validation mechanism to make sure the pass does not break (as it’s not included in LLVM’s build).

e- Run the pass every defined period of time (1 week/2 weeks/1 month/2 months):

· If changes appear (new entries):

  • Make sure the entries are legal.

  • If they are not, add them to the exception list.

· If any changes made, commit them.

2- Recommitting TableGen’s backend while limiting the instructions to AVX512 and earlier.

a- prepare a full instruction list of the exceptions in X86 ISA.

b- Mark these instructions with special flag to indicate illegal folding.

c- Integrate the backend into llorg while limiting the generated instructions to be part of AVX512 extension or earlier.

d- If a new generation of instructions is added, go back to step ‘a’, and after we are sure everything is ok allow these instructions to be tablegened.

3- Give up on the auto-generation idea and manually update the current tables iteratively with new chunks of instructions until full state is achieved.

P.s. The TableGen backend emits more than 5200 entries, while the known exception at this point are ~20.

We prefer options #1/#2 as they remove the manual maintenance of these huge tables where there is high potential of adding new bugs, and reduce it to a one-man once-a-month/once for each new instruction generatiom work.

Please share your opinions with us.

Thanks,

Ayman Musa

Why can't you do a variation of this by tagging the instructions with a
flag? I.e. for the rm variant, add a flag that says "this has an
equivalent register operand version". Given that a lot of the instruction
patterns are created via multi-classes, I would expect that to require a
lot less than 5200 updates.

Joerg

+1 An opt-in ‘GenerateMemoryFolding’ style flag (defaulting to false) seems safer to me than than trying to ensure you’ve correctly flagged all illegal instruction fold.

You could then enable the flag on individual groups of instructions/multi-classes, testing and removing them from the manual tables in a relatively controlled fashion.

Similarly, new instructions can be enabled in a safe, staged approach without needing a ‘regenerate a full exceptions list’ stage.

Thanks for persevering with this Ayman!

Simon.

The counterargument to this is that it adds a lot of cruft to the td files. It should be a lot more rare for an instruction to be “unfoldable” than to be “foldable”.

Validation should be straight-foward for this: generate the table automatically using the new tool and see what the differences are. If the differences are due to incorrect additions, just add the flag to them.

BTW, adding a tblgen backend to automate this is totally awesome. Thank you for pushing on this Ayman!

-Chris

I’m being pessimistic but the previous attempt at this failed because the testing missed so many cases, mainly due to the scale of the multiple problems that the patch was trying to fix in one go. Adding initial tablegen support and then enabling a few related instructions at a time gives us a decent chance of easily bisecting + fixing individual problems. Chandler went into some detail on this in D32684 and rL304762.

Once the majority of instructions are enabled, identifying the remaining problem cases should be more straightforward and inverting the flag to an opt-out ‘DoNotGenerateMemoryFolding’ a much easier task, removing much of the temporary cruft - although if the process stalls part way through we could be stuck with it longer than liked.

I’d add that many of these folding table entries are only tested by the stack-folding tests, and despite our efforts they don’t have close to full coverage (especially the endless AVX512 instructions and variations….)

Sorry this comment is coming over as so very glass half empty!