Using braces in InstAlias asm string

Hi,

I’m adding MC support for a set of instructions containing braces and I’m hitting an issue with InstAlias. I’ve reproduced it with the ldraa instruction, the first AArch64 instruction I found in AArch64InstrFormats.td that uses InstAlias.

The instruction takes a destination register, a base address and an optional offset. An alias is used to implement the no-offset form:

multiclass AuthLoad<bit M, string asm, Operand opr> {
  def indexed   : BaseAuthLoad<M, 0, (outs GPR64:$Rt),
                      (ins GPR64sp:$Rn, opr:$offset),
                      asm, "\t$Rt, [$Rn, $offset]", "", opr>;
  def writeback : BaseAuthLoad<M, 1, (outs GPR64sp:$wback, GPR64:$Rt),
                      (ins GPR64sp:$Rn, opr:$offset),
                      asm, "\t$Rt, [$Rn, $offset]!",
                      "$Rn = $wback,@earlyclobber $wback", opr>;
  def : InstAlias<asm # "\t$Rt, [$Rn]",
                  (!cast<Instruction>(NAME # "indexed") GPR64:$Rt, GPR64sp:$Rn, 0)>;  ← HERE
  def : InstAlias<asm # "\t$Rt, [$wback]!",
                  (!cast<Instruction>(NAME # "writeback") GPR64sp:$wback, GPR64:$Rt, 0), 0>;
}

Example usage:

$ echo "[0x20,0x04,0x20,0xf8]" | ./bin/llvm-mc -triple=aarch64 -show-encoding -mattr=+v8.3a -disassemble
        .text
        ldraa x0, [x1] // encoding: [0x20,0x04,0x20,0xf8]
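
For anyone who wants to sanity-check the bytes in that example, here is a rough Python sketch that splits the word into fields and prints the no-offset form when the offset is zero, which is what the alias is for. The bit positions are my reading of the LDRAA/LDRAB encoding and should be treated as an assumption to verify against the Arm ARM; `decode_ldraa` and `format_indexed` are names made up for this sketch, not anything in LLVM.

```python
def decode_ldraa(encoding):
    """Split a 4-byte little-endian A64 word into (assumed) LDRAA/LDRAB fields."""
    word = int.from_bytes(bytes(encoding), "little")
    # Assumed layout: M = bit 23, S = bit 22, imm9 = bits 20:12,
    # W = bit 11, Rn = bits 9:5, Rt = bits 4:0.
    s_imm9 = (((word >> 22) & 0x1) << 9) | ((word >> 12) & 0x1FF)
    if s_imm9 & 0x200:  # sign-extend the 10-bit S:imm9 value
        s_imm9 -= 0x400
    return {
        "M": (word >> 23) & 0x1,   # 0 = ldraa, 1 = ldrab
        "W": (word >> 11) & 0x1,   # 1 = writeback form
        "Rn": (word >> 5) & 0x1F,
        "Rt": word & 0x1F,
        "offset": s_imm9 * 8,      # the immediate is scaled by 8
    }


def format_indexed(fields):
    """Print the non-writeback form, dropping a zero offset like the alias does."""
    mnemonic = "ldrab" if fields["M"] else "ldraa"
    rt = "xzr" if fields["Rt"] == 31 else f"x{fields['Rt']}"
    rn = "sp" if fields["Rn"] == 31 else f"x{fields['Rn']}"
    if fields["offset"] == 0:
        return f"{mnemonic} {rt}, [{rn}]"
    return f"{mnemonic} {rt}, [{rn}, #{fields['offset']}]"


print(format_indexed(decode_ldraa([0x20, 0x04, 0x20, 0xF8])))  # ldraa x0, [x1]
```

With the bytes above this gives Rt = 0, Rn = 1 and a zero offset, matching the llvm-mc output.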

To recreate the issue I tried adding braces around the destination register and base address, e.g.

ldraa {x0, [x1]}

The first thing I tried was double braces "{{}}", which seems to work for the asm string in an instruction definition, e.g.

diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 6b23c7c..62bf67d 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -1632,7 +1632,7 @@ class BaseAuthLoad<bit M, bit W, dag oops, dag iops, string asm,
 multiclass AuthLoad<bit M, string asm, Operand opr> {
   def indexed   : BaseAuthLoad<M, 0, (outs GPR64:$Rt),
                       (ins GPR64sp:$Rn, opr:$offset),
-                      asm, "\t$Rt, [$Rn, $offset]", "", opr>;
+                      asm, "\t{{$Rt, [$Rn, $offset]}}", "", opr>;
   def writeback : BaseAuthLoad<M, 1, (outs GPR64sp:$wback, GPR64:$Rt),
                       (ins GPR64sp:$Rn, opr:$offset),
                       asm, "\t$Rt, [$Rn, $offset]!",

but doesn’t work when defining an InstAlias, e.g.

diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 6b23c7c..44fb9a3 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -1638,7 +1638,7 @@ multiclass AuthLoad<bit M, string asm, Operand opr> {
                       asm, "\t$Rt, [$Rn, $offset]!",
                       "$Rn = $wback,@earlyclobber $wback", opr>;
-  def : InstAlias<asm # "\t$Rt, [$Rn]",
+  def : InstAlias<asm # "\t{{$Rt, [$Rn]}}",
                   (!cast<Instruction>(NAME # "indexed") GPR64:$Rt, GPR64sp:$Rn, 0)>;

where I see the following error from TableGen:

[17/124] Building AArch64GenAsmMatcher.inc...
FAILED: cd ./llvm-project/build && ./llvm-project/build/bin/llvm-tblgen -gen-asm-matcher -I ./llvm-project/llvm/lib/Target/AArch64 -I /usr/include/libxml2 -I ./llvm-project/build/include -I ./llvm-project/llvm/include -I ./llvm-project/llvm/lib/Target ./llvm-project/llvm/lib/Target/AArch64/AArch64.td --write-if-changed -o lib/Target/AArch64/AArch64GenAsmMatcher.inc -d lib/Target/AArch64/AArch64GenAsmMatcher.inc.d
Included from ./llvm-project/llvm/lib/Target/AArch64/AArch64.td:431:
Included from ./llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:588:
./llvm-project/llvm/lib/Target/AArch64/AArch64InstrFormats.td:1641:3: error: Instruction 'anonymous_2396' has operand 'Rt' that doesn't appear in asm string!
  def : InstAlias<asm # "\t{{$Rt, [$Rn]}}",
  ^
Included from ./llvm-project/llvm/lib/Target/AArch64/AArch64.td:431:
./llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:932:17: note: instantiated from multiclass
defm LDRAA : AuthLoad<0, "ldraa", simm10Scaled>;
                ^
[17/71] Building AArch64GenDAGISel.inc...
ninja: build stopped: subcommand failed.

which led me to think it’s trying to do some form of substitution for braces, so I tried the following hack using an OR ("|") where both sides are the same in the asm string:

diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 6b23c7c..97ed49e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -1638,7 +1638,7 @@ multiclass AuthLoad<bit M, string asm, Operand opr> {
                       asm, "\t$Rt, [$Rn, $offset]!",
                       "$Rn = $wback,@earlyclobber $wback", opr>;
-  def : InstAlias<asm # "\t$Rt, [$Rn]",
+  def : InstAlias<asm # "\t{{$Rt, [$Rn]}|{$Rt, [$Rn]}}",
                   (!cast<Instruction>(NAME # "indexed") GPR64:$Rt, GPR64sp:$Rn, 0)>;
   def : InstAlias<asm # "\t$Rt, [$wback]!",

Which seems to work fine.

Example usage:

$ echo "[0x20,0x04,0x20,0xf8]" | ./bin/llvm-mc -triple=aarch64 -show-encoding -mattr=+v8.3a -disassemble
        .text
        ldraa {x0, [x1]} // encoding: [0x20,0x04,0x20,0xf8]
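
One reading that would explain both the TableGen error and why the OR hack works is that an unescaped top-level "{...}" in an asm string is treated as a "{variant0|variant1|...}" group, with one alternative picked per assembler syntax variant. The Python below is only a simplified model of that guess, not LLVM's code. `flatten_variants` and the `variant` index are names invented for the sketch, and it assumes the target has more than one syntax variant (which I believe is the case for AArch64, but I have not confirmed it here).

```python
def flatten_variants(asm, variant):
    """Resolve top-level {a|b|...} groups in asm for one syntax variant.

    Simplified model: '{' after '$' or a backslash does not open a group,
    nested braces are tracked only to find the matching '}', and the
    selected text is split naively on '|'.
    """
    out = []
    i = 0
    while i < len(asm):
        opens_group = asm[i] == "{" and (i == 0 or asm[i - 1] not in "$\\")
        if not opens_group:
            out.append(asm[i])
            i += 1
            continue
        # Find the '}' that closes this group.
        depth = 1
        j = i + 1
        while j < len(asm):
            if asm[j] == "}" and asm[j - 1] != "\\":
                depth -= 1
                if depth == 0:
                    break
            elif asm[j] == "{":
                depth += 1
            j += 1
        if depth != 0:
            raise ValueError("unterminated variant group")
        # Drop `variant` alternatives, then keep the next one (may be empty).
        selection = asm[i + 1:j]
        for _ in range(variant):
            selection = selection.partition("|")[2]
        out.append(selection.partition("|")[0])
        i = j + 1
    return "".join(out)


double = "ldraa\t{{$Rt, [$Rn]}}"
hack = "ldraa\t{{$Rt, [$Rn]}|{$Rt, [$Rn]}}"
for v in (0, 1):
    print(v, repr(flatten_variants(double, v)), repr(flatten_variants(hack, v)))
```

Under this model the double-brace string keeps "{$Rt, [$Rn]}" for variant 0 but collapses to just the mnemonic for variant 1, so $Rt would no longer appear in the string for that variant, while the OR form yields the same braced text for both.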

FWIW I also tried double escaping, e.g.

diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 6b23c7c..1d74aaf 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -1638,7 +1638,7 @@ multiclass AuthLoad<bit M, string asm, Operand opr> {
                       asm, "\t$Rt, [$Rn, $offset]!",
                       "$Rn = $wback,@earlyclobber $wback", opr>;
-  def : InstAlias<asm # "\t$Rt, [$Rn]",
+  def : InstAlias<asm # "\t\\{$Rt, [$Rn]\\}",
                   (!cast<Instruction>(NAME # "indexed") GPR64:$Rt, GPR64sp:$Rn, 0)>;

Which actually worked on LLVM 9 but warnings are emitted:

[20/70] Building CXX object lib/Target/AArch64/MCTargetDesc/CMakeFiles/LLVMAArch64Desc.dir/AArch64InstPrinter.cpp.o
In file included from ./llvm-project/llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp:39:
lib/Target/AArch64/AArch64GenAsmWriter.inc:23094:23: warning: use of non-standard escape character '{' [-Wpedantic]
/* 7811 */ "ldraa {$\x01, [$\x02]}\0"
           ^~
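
My understanding of the warning is that the escaped braces end up as backslash-brace sequences inside the generated C++ string literal, and a backslash followed by a brace is not one of the escapes the C++ standard defines. The Python sketch below just scans a literal's body for such sequences. `find_nonstandard_escapes` is a made-up helper, and the sample string is my guess at what the generated literal looked like, not a copy of the real AArch64GenAsmWriter.inc.

```python
# Characters that may follow a backslash in a standard C++ string literal:
# simple escapes, octal digits, and the x/u/U prefixes.
STANDARD_ESCAPE_CHARS = set("'\"?\\abfnrtv01234567xuU")


def find_nonstandard_escapes(literal_body):
    """Return the backslash sequences that are not standard C++ escapes."""
    found = []
    i = 0
    while i < len(literal_body):
        if literal_body[i] == "\\" and i + 1 < len(literal_body):
            if literal_body[i + 1] not in STANDARD_ESCAPE_CHARS:
                found.append(literal_body[i:i + 2])
            i += 2  # skip the backslash and the character it escapes
        else:
            i += 1
    return found


# Assumed shape of the generated literal (raw string, so backslashes are kept).
sample = r"ldraa\t\{$\x01, [$\x02]\}\0"
print(find_nonstandard_escapes(sample))
```

For the assumed sample this reports the two backslash-brace sequences and nothing else, which lines up with a pedantic warning pointing at the brace.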

On upstream master I also hit a bunch of the following asserts. Just to clarify, addv is unrelated and I made no changes there; this is just an example from one of the many failures: