Musings on the TableGen -emit-dag-isel backend

Your suggestion for two passes is indeed my plan if simply using 3-byte sizes is not acceptable. I don't want to duplicate all the logic in a second length-calculating function, so I would just have special logic for the three matching operators with children and use the existing function for the rest, passing a null output stream. Or I could conditionalize all the output on another function parameter so it isn't done at all.
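To make the "null output stream" idea concrete, here is a rough sketch. It is not the actual DAGISelMatcherEmitter code: `Node`, `vbrSize`, `emitVBR` and `emitNode` are made-up names, and the one-opcode-byte layout is a deliberate simplification. The only point is that a single emit function can serve as both the measuring pass and the writing pass.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical, simplified matcher node: one opcode byte plus optional children.
struct Node {
  uint8_t Opcode;
  std::vector<Node> Children;
};

// Bytes needed for V in a 7-bits-per-byte variable-length encoding.
static unsigned vbrSize(uint64_t V) {
  unsigned N = 1;
  for (; V >= 128; V >>= 7)
    ++N;
  return N;
}

// Writes V in that encoding; does nothing when OS is null.
static void emitVBR(uint64_t V, std::ostream *OS) {
  if (!OS)
    return;
  for (; V >= 128; V >>= 7)
    OS->put(static_cast<char>((V & 127) | 128));
  OS->put(static_cast<char>(V));
}

// Emits N to OS and returns its encoded size in bytes.
// With OS == nullptr nothing is written, so the same function
// doubles as the length-calculating pass.
static unsigned emitNode(const Node &N, std::ostream *OS) {
  if (OS)
    OS->put(static_cast<char>(N.Opcode));
  unsigned Size = 1;
  for (const Node &Child : N.Children) {
    unsigned ChildSize = emitNode(Child, nullptr); // pass 1: measure only
    emitVBR(ChildSize, OS);                        // variable-length size prefix
    if (OS) {
      unsigned Written = emitNode(Child, OS);      // pass 2: real output
      assert(Written == ChildSize && "sizing and emission disagree");
      (void)Written;
    }
    Size += vbrSize(ChildSize) + ChildSize;
  }
  return Size;
}
```

The sketch re-measures each subtree once per enclosing level, which is fine for illustration; a real implementation would probably want to remember the sizes it has already computed.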

I'm not convinced that anyone would notice a 4% increase in the size of the matching table, but I don't think that's my call. I have plenty to do while waiting for more comments. :wink:

It's not necessarily about the total size, but cache pressure is an issue.

That said, if we are seriously thinking about the performance of the byte code, perhaps some of these opcodes should be reconsidered at a higher level anyway.

For example, the overall bytecode always begins with an OPC_SwitchOpcode, which is implemented as a linear list of cases, often hundreds of them (depending on the target). A binary search over a jump table would be *much* better for those.
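As a rough illustration of the difference, here is a sketch that contrasts a linear scan over the cases with a binary search over a table sorted by opcode. The `SwitchCase` record and both function names are hypothetical; the real table is a flat byte array, not a vector of structs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical case record: a node opcode and the table offset of its matcher.
struct SwitchCase {
  uint16_t Opcode;
  uint32_t Offset;
};

static const uint32_t NoCase = UINT32_MAX;

// Linear scan: visits up to every case before finding a match.
static uint32_t findLinear(const std::vector<SwitchCase> &Cases, uint16_t Opc) {
  for (const SwitchCase &C : Cases)
    if (C.Opcode == Opc)
      return C.Offset;
  return NoCase;
}

// Binary search: needs the cases sorted by opcode, then takes O(log n) steps.
static uint32_t findSorted(const std::vector<SwitchCase> &Cases, uint16_t Opc) {
  assert(std::is_sorted(Cases.begin(), Cases.end(),
                        [](const SwitchCase &A, const SwitchCase &B) {
                          return A.Opcode < B.Opcode;
                        }) &&
         "cases must be sorted by opcode");
  auto It = std::lower_bound(
      Cases.begin(), Cases.end(), Opc,
      [](const SwitchCase &C, uint16_t Key) { return C.Opcode < Key; });
  return (It != Cases.end() && It->Opcode == Opc) ? It->Offset : NoCase;
}
```

With a few hundred cases, the sorted lookup needs on the order of nine comparisons, where the linear scan can need hundreds.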

Cheers,
Nicolai

I've only just begun looking at the compiler's use of the matching table. You are correct about the outer SwitchOpcode: the AMDGPU target has 212 cases and the X86 target has 452. However, there is a cache for the case offsets in the SelectionDAGISel class.

Would it make sense for TableGen to generate the outer OPC_SwitchOpcode offset table?

I believe the code in SelectionDAGISel.cpp that consumes the table caches the outer SwitchOpcode in a map.
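For anyone following along, the general shape of such a cache might look like the sketch below. This is not the SelectionDAGISel code: the `OffsetCache` class is hypothetical, and the byte layout (a one-byte case size, a one-byte opcode, then the body, with a zero size ending the list) is a simplification I made up for the example. The idea is just that the linear list is walked once, and later lookups index straight into a table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout (simplified): repeated [CaseSize][Opcode][body...], where
// CaseSize counts the opcode byte plus the body, and a 0 size ends the list.
class OffsetCache {
  std::vector<uint32_t> Offsets; // indexed by opcode; 0 means "no case"
  bool Built = false;

  // Walks the linear case list once and records where each body starts.
  void build(const std::vector<uint8_t> &Table) {
    size_t Idx = 0;
    while (Idx + 1 < Table.size() && Table[Idx] != 0) {
      uint8_t CaseSize = Table[Idx];
      uint8_t Opc = Table[Idx + 1];
      assert(Idx + 1 + CaseSize <= Table.size() && "truncated case");
      if (Opc >= Offsets.size())
        Offsets.resize(static_cast<size_t>(Opc) + 1, 0);
      Offsets[Opc] = static_cast<uint32_t>(Idx + 2); // body follows the opcode
      Idx += 1 + static_cast<size_t>(CaseSize);      // skip to the next case
    }
    Built = true;
  }

public:
  // Returns the body offset for Opc, or 0 if the switch has no such case.
  // The first call pays for the linear walk; later calls are a single index.
  uint32_t lookup(const std::vector<uint8_t> &Table, uint8_t Opc) {
    if (!Built)
      build(Table);
    return Opc < Offsets.size() ? Offsets[Opc] : 0;
  }
};
```

If TableGen emitted the offset table itself, the `build` step would disappear and only the indexed lookup would remain at run time.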