I’ll do a brief reply here and if you want more information we can talk further :).
Hi Quentin, Hi Amara,
I was following your discussion on D75086 regarding declaring types as legal even if they are smaller than the actual register class (e.g. s16 and gpr32). We are working on a backend which only has 32 and 64-bit registers and we recently had a problem regarding exactly this where we had to declare G_UNMERGE_VALUES and G_MERGE_VALUES with a smaller type of <s32 as legal, even though we don't have a register class that matches and instruction selection implicitly sign extends them to s32. The reasoning was that for example in the case of G_UNMERGE_VALUES, the legalizer widens the source type when widening the destination type, which we don't want (as the big type then doesn't fit our registers anymore). So we decided to only make sure that the bigger type matches a register class and simply allowed smaller types <s32.
What I am gathering now from this discussion is that this is basically an incorrect modelling and only works around another problem. Could you expand on this in more detail?
A few years ago we discussed the possibility and/or desire of allowing registers to be larger than the actual legalized type or whether they should be in sync with the legalized type.
For instance, consider
s8 = G_AND s8, s8
If your target supports `s32 = G_AND` then allowing registers to be larger than legalized type would yield that `s8 = G_AND s8, s8` is legal (and really any type smaller than or equal to s32 would be legal for that instruction).
Then the question was how do we represent that: the relevant bits are the low 8 bits, but the container is 32-bit.
s8(32) = G_AND s8(32), s8(32)
We explored that a little bit and found that it was dangerous to carry undefined bits around (the upper 24 bits in that case) and that there was no real use case for this. Tim (Northover) might remember more details (CC’ed).
Therefore, we decided that the legal types should exactly match what the container (the register) will be. So in theory, if you’re not following that model, you’re in violation of that and that’s where problems arise (and thus D75086 to workaround them).
I was thinking about implementing some custom legalization which uses G_EXTRACT or shifts to get what I want, but G_EXTRACT extracts as many bits as the destination type, so I would end up with the same problem again. Shifts are also not optimal, since we have a target instruction available to extract n bits from a given offset. Is it valid to emit target instructions during legalization,
Yes, that’s completely legal.
The whole idea of GISel is that the backends should be able to insert target specific instructions whenever they want. For instance, we use that extensively in the lowering of calls directly in the IRTranslator, i.e., at the very beginning of GISel.
i.e. before instruction selection, or would the best modelling be to just use the shifts and then use some pre-instruction-selection combiner to merge them to the desired target instruction?
That’s a question I cannot answer directly, it really depends on what are the implementation trade-offs on your end.
I have a personal preference for generic shifts as it is possible that other generic optimizations can get rid of them. Now, if those shifts are too difficult to combine away or if you know that they are not optimizable anyway, you may want to go directly with the target specific instruction.
Just keep in mind that whenever you use a target specific instruction, this is just basically an opaque object that the generic optimizations have to deal with.