[RFC] Renaming MachineInstr methods returning operand ranges

Currently, we have the following methods returning operand ranges:

Name Returns Brokenness
operands() all operands Fine
explicit_operands() explicit operands Fine
implicit_operands() implicit operands Fine
defs() explicit defs Broken
uses() explicit uses + implicit defs + implicit uses Broken
explicit_uses() explicit uses Fine

defs() and uses() are two misnomers and have high potential for introducing bugs.
Here is an example of such bug.

D150868 tries to improve the situation by renaming defs() to explicit_defs() so that we have:

Name Returns Brokenness
uses() explicit uses + implicit defs + implicit uses Still broken
explicit_defs() explicit defs Fine
explicit_uses() explicit uses Fine

Ideally, we would have something like this for accessing defs/uses:

Name Returns
defs() explicit defs + implicit defs
uses() explicit uses + implicit uses
explicit_defs() explicit defs
explicit_uses() explicit uses

However, silently changing the semantics of defs() and uses() will most certainly be not appreciated by downstream targets. Even for upstream code fixing all uses of uses() is difficult, because it is not always obvious from the context what the replacement should be.

@jayfoad suggested two possible ways of solving the issue (assuming defs() has already been renamed to explicit_defs()):

    • Introduce all_defs() and all_uses() for explicit+implicit defs/uses, respectively.
    • Eliminate uses of uses() in the codebase.
    • Deprecate uses() and remove it in the future.
    • Rename uses()uses_and_implicit_operands().The new name is intentionally ugly to motivate developers to avoid using it.
    • Give downstream targets time to migrate. Use this time to eliminate uses of uses_and_implicit_operands() in the codebase.
    • Re-introduce uses() and defs() with the desired semantics, deprecate uses_and_implicit_operands() and remove it in the future.

The first method is smooth, but eliminating uses of uses() might take indefinite amount of time due to the lack of motivation. Not all developers may like the new all_ prefix.

The second methods will give us (eventually) the usual names defs() and uses(), but its reliability highly depends on how much time we give downstream targets to migrate.

The two methods may be combined:

    • Introduce the new all_* methods, clone uses()uses_and_implicit_operands(), instantly deprecate uses().
    • Take our time to clean up the codebase (eliminate uses of uses_and_implicit_defs()).
    • Remove uses() sometime in the future.

So, this RFC is basically a poll, which method do you prefer? Please share your thoughts.

I have a mild preference for #2

I would love to have iterators that return only the defs and only the uses, but I really don’t mind whether they are called defs/uses or all_defs/all_uses. So no strong preference on #1/#2/#3. If I was doing the work I would probably go for #1 because it seems simplest to implement.

+1 on the general idea

I’d say mostly you should take the path that’s easiest to actually get done.

I worry a bit about how easy it actually is to remove all uses of uses() across all backends, and as long as the “broken” method with a short name is around, it is likely to accidentally get new uses. So if I was to do the work, I would adopt the “rename to something ugly” as part of my strategy, because that can be done relatively quickly and reduces the risk of “regressions” in the transition.

As for whether the final result has all_{defs,uses}() or just {defs,uses}(), I can see advantages to both in the long run. Not having the all_ prefix is nice because it’s shorter, having it is nice because it serves as an implicit call-out to novice developers / developers who are not steeped in backend code that there is more to it than they might think (it’s easy to forget about implicit operands). I don’t have a strong preference either way.

By the way, there are not many uses of uses(). Probably just because of its weird behavior.
There are 64 uses total, of which 13 in CodeGen, 6 in GlobalISel and the rest are in Target (~5 per backend avg). In comparison, there are 520 uses of operands(). (The numbers may not be precise.)

Renaming uses() to something ugly sounds good as long a we provide interfaces to access all uses and all defs, but there are no such interfaces currently. all_defs() and all_uses() would be of a great help here. Another option is to use operands() instead of all_defs() / all_uses() as in most other places, wait for re-introduction of defs() and uses(), and possibly “modernize” some uses of operands() after that. Some existing uses of operands() could also be modernized then. Does that sound good?

If we’re going to implement all_defs/all_uses then I have a slight preference for doing that before touching any of the code that uses these accessors.

It would certainly be a short cut. So the question narrows down to whether we want them or not, and it looks like the poll isn’t getting many voices. My position is neutral with mild preference for defs/uses.

Using ’operands ()’ for the time being could postpone this decision and would go in-line with existing practices.

The GlobalISel use case is simpler because G_* instructions aren’t allowed to have implicit operands

Are you working on an implementation? I put up a proof-of-concept here but I’m happy to abandon it if it is stepping on your toes:

No, I’m not, and actually don’t currently have enough time for that.

Since GlobalISel doesn’t care about implicit operands, I think it would be better if it used defs()/uses() instead of explicit_defs()/explicit_uses(). This is the thing I don’t like about D150868.

I looked at the patches and now I’m confused.
I was thinking about use() / all_uses() returning all uses, not only register ones.
On the other hand, iteration only over register operands seems to be the most common operation on operands in CodeGen.
Now I’m more inclined to omitting the all suffix, or maybe come up with some other names.

I think MachineOperand terminology is pretty clear that “uses” and “defs” mean uses and defs of registers. Other operands, like immediates and regmasks, are neither uses nor defs.

Thanks for addressing this. My preference is for #2.

I’m wondering about regmasks—they are just compacted representations of multiple defs, so having them covered by “defs” seems to make sense, but on the other hand, if “defs” strictly implies “isReg”, then regmasks would not be included. Maybe we could also add “clobbers” to iterate over dead implicit defs and regmasks?

That’s true, but explicit_uses() and uses() return operands of other kinds as well. This discrepancy can be confusing.
(It looks like most uses of explicit_uses() only care about register operands though.)

Good point about explicit_uses. I had not realised that. (I already knew uses was weird.)

Many uses of explicit_uses are immediately followed by an isReg check. Perhaps we could migrate all the uses that are not followed by an isReg check to… something else? And then fold the isReg check into explicit_uses itself.


Perhaps we could migrate all the uses that are not followed by an isReg check to… something else?

explicit_operands() looks appropriate.