Do you have specific examples in mind that would be expressible with
something more complicated but can't be handled via an early-clobber?
Not offhand, no. I'm mostly concerned about the readability of .td files.
Perhaps spelling it out more fully with "earlyclobber" rather than
"early" would help?
That's better. Is there any way you could convince TableGen to recognize
that 'constraints = "$success != $src", "$success != $ptr"' is semantically
equivalent to earlyclobber? Maybe check that the common operand in both
constraints is declared to interfere with all other operands and if that's
so, mark it earlyclobber.
It would be possible, yes. I'm concerned that would imply that one could specify a constraint that the output register couldn't overlap with one input register, but could with another, which isn't expressible. Thus my preference for specifying the constraint solely as an attribute of the output register rather than referencing the other operands explicitly.
When I'm writing .td files I really don't want to be concerned with the
nitty-gritty details of how the backend is implemented. I just want to
express the semantics I want. I think that was the motivation for the
switch from isThreeAddress to "$src = $dst."
I agree. We're definitely on the same page about what the goals are.
I have no objection to going with "earlyclobber" initially but we should think
about ways to abstract codegen semantics whenever we can. If we can replace
"earlyclobber" with something clearer later on, all the better.
"earlyclobber" is a really bad name, though. Perhaps spell it "uniquereg"
or something else that gets at what it actually means?
I'm not hugely tied to the name. I chose it because it matches the term the GCC documentation uses for the same concept in inline assembly, and how the concept is expressed elsewhere in the compiler. If I'm not mistaken, due to the GCC nomenclature, the Linux kernel also refers to this sort of thing as an early-clobber.
It seems to me the root problem here is that the instruction has two outputs
and we don't want the output to be allocated to the same register as the
inputs. We have no way to express multiple outputs in TableGen.
Close, but not precisely. The issue is that the values in the input registers may still be needed in the hardware at the instruction stage where the output register needs to be written, or there may be some other timing issue such that, if the registers are the same, the hardware can't guarantee the access ordering needed for correct behavior. From LLVM's perspective, the instruction has only one output (the success value), and also has a side-effect (the store to memory). Another example of this issue is the ARM integer multiply instruction (MUL) on pre-v6 architectures, where if the destination register and the first source register are the same, the behavior is undefined.
The name is trying to capture the idea that the hardware may clobber the value in the destination register early enough in the execution of the instruction that it could conflict with reading the value from the source register if they are the same.
I think you're right that it's best to go ahead with this for now and then if a better solution is arrived at later, we can update things to use that.
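
For illustration, the operand-attribute form discussed above might look roughly like the following in a .td file. This is a sketch only: the Inst class and the STREX_sketch record are hypothetical stand-ins so the fragment parses on its own, the assembly string is illustrative, and the "@earlyclobber $success" spelling is an assumption about how the constraint string would be written, not something settled in this thread.

```tablegen
// Hypothetical stand-in for a target's instruction class, defined here only
// so that this fragment is self-contained. A real backend would use the
// classes from its own .td files instead.
class Inst<string asm> {
  string AsmString = asm;
  string Constraints = "";
}

// Sketch of a store-exclusive style instruction with one output ($success)
// and two inputs ($src, $ptr).
def STREX_sketch : Inst<"strex $success, $src, [$ptr]"> {
  // Assumed spelling: the constraint is an attribute of the output operand
  // alone, so it cannot say "may overlap $src but not $ptr".
  let Constraints = "@earlyclobber $success";

  // The pairwise alternative raised earlier in the thread would instead
  // name each input explicitly (shown as a comment, not a supported form):
  //   "$success != $src", "$success != $ptr"
}
```

Attaching the constraint to the output operand keeps the .td file from implying that per-input overlap rules are expressible when they are not.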