Proposal: On re-purposing/reorganizing MIR sigils ('&', '$', '%').

Hi

A few of us have discussed extending MIR to support named vregs. At the moment, names are only supported for physical registers, and numbers are reserved for vregs.

We’ve decided that, to properly implement a syntax for named vregs in MIR, we first need to reorganize the sigils used for physical registers and external symbols. Our proposal is to change the sigil for external symbols from the dollar sign (‘$’) to the ampersand (‘&’) and to re-purpose the dollar sign for physregs, so that physregs have the dollar-sign sigil and vregs have the percent (‘%’) sigil.

Essentially:

BL &__divsi3 …

$eax = …

%123 = …

%vregFooBar = …
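
To make the proposed mapping concrete, here is a minimal sketch of how a token could be classified by its sigil alone. The function name, the table, and the identifier pattern are made up for illustration and are not taken from the MIR parser or the attached patch; every other kind of MIR operand is ignored.

```python
import re

# Hypothetical sigil table for the proposed scheme (illustration only,
# not LLVM's actual MIR lexer).
SIGILS = {
    "&": "external symbol",
    "$": "physical register",
    "%": "virtual register",
}

# Assumed identifier shape; the real MIR grammar may accept other characters.
_NAME = re.compile(r"[A-Za-z_.][A-Za-z0-9_.]*")
_NUMBER = re.compile(r"[0-9]+")


def classify_operand(token):
    """Return (kind, name) for a token such as '$eax', or None if it does not fit."""
    if len(token) < 2 or token[0] not in SIGILS:
        return None
    kind, name = SIGILS[token[0]], token[1:]
    if kind == "virtual register":
        # Only vregs may be numbered; named vregs are the new addition.
        if _NUMBER.fullmatch(name):
            return ("numbered virtual register", name)
        if _NAME.fullmatch(name):
            return ("named virtual register", name)
        return None
    return (kind, name) if _NAME.fullmatch(name) else None


if __name__ == "__main__":
    for tok in ["&__divsi3", "$eax", "%123", "%vregFooBar"]:
        print(tok, "->", classify_operand(tok))
```

The point of the sketch is that the first character is enough to tell the three operand kinds apart, without a table of target register names.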

I have an initial patch attached (it replaces ‘$’ with ‘&’ for external symbols). I'm hoping to open some dialog with the community before doing more implementation work.

Thanks

PL

external_symbols_dollar_to_amp.patch (88.2 KB)

Can we use %% for vregs? Seems slightly easier to remember %/%% than $/%. Also, %eax and $some_symbol are already familiar from typical assembly syntax and we probably don’t want to break that association.

It’s all a bikeshed, but being more consistent with assembly is probably a win.

– Sean Silva

When we discussed this, our line of thought was as follows:

  • LLVM IR already uses %name for SSA values, which are closer to what a vreg is than to what a physreg is. It would be neat to draw that parallel to LLVM IR.
  • We wanted a different sigil for physregs so that people can tell vregs and physregs apart even if they don’t know all the physreg names of a particular architecture.
  • The choice of $ was somewhat arbitrary, because we had few characters left without a meaning in .mir, and I found that the ampersand matches symbol names better than physregs.
  • The ampersand in &symbolname should have familiar semantics for people using C/C++.

So I would describe this as being closer to LLVM IR than to typical assembly syntax, but I think that is appropriate for the LLVM machine intermediate language.

- Matthias

That rationale makes a lot of sense to me. SGTM.

-- Sean Silva

Thanks for the feedback.