Question about porting LLVM - a single instruction op mnemonic with multiple operand forms

Hello all,

I am at the stage of adding the instruction set while adding support for a new target to LLVM. The target has a single instruction mnemonic with multiple operand forms, for example Add R1, R2 and Add @R1, R2. I found a similar case in the x86 instruction set, such as ADD reg, reg and ADD mem, reg. However, the x86 solution is to add a suffix to the instruction name, turning the one mnemonic into ADDrr and ADDmr. I don't want to translate a single mnemonic with different operand forms into multiple instruction names. Is there another solution to this problem? Which target should I look at?

thanks a lot

yi-hong

I have this same problem in our backend. I solve it by adding a pseudo instruction at instruction selection that transforms @R1 into R1, so only a single pattern is required. I can then propagate the pseudo instruction after instruction selection.

Micah

Hello Villmow,

Is your backend for the EFI Byte Code Virtual Machine? Would you mind giving me an example of the pseudo instruction you add?

thanks a lot

yi-hong

2011/1/19 Villmow, Micah <Micah.Villmow@amd.com>

The backend is for AMDIL

I have a simple pattern that loads all immediate values into registers first, so that no instruction pattern needs duplicate forms for both registers and immediates.
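The idea of loading immediates into registers first can be sketched outside of LLVM. The following is a minimal Python model, not AMDIL backend code: the `Inst`, `Reg`, and `Imm` types and the `LOADCONST` pseudo opcode are hypothetical names chosen for illustration. It shows only that once every immediate is materialized into a temporary register, an instruction such as ADD needs just a register-register form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Imm:
    value: int


@dataclass(frozen=True)
class Inst:
    opcode: str
    dst: Reg
    srcs: tuple  # tuple of Reg or Imm operands


def materialize_immediates(insts):
    """Rewrite instructions so that only LOADCONST carries an immediate.

    LOADCONST is a hypothetical pseudo instruction; every other
    instruction ends up with register-only sources.
    """
    out = []
    next_tmp = 0
    for inst in insts:
        if inst.opcode == "LOADCONST":
            # Already the one form that is allowed to hold an immediate.
            out.append(inst)
            continue
        new_srcs = []
        for src in inst.srcs:
            if isinstance(src, Imm):
                # Load the immediate into a fresh temporary register.
                tmp = Reg(f"%tmp{next_tmp}")
                next_tmp += 1
                out.append(Inst("LOADCONST", tmp, (src,)))
                new_srcs.append(tmp)
            else:
                new_srcs.append(src)
        out.append(Inst(inst.opcode, inst.dst, tuple(new_srcs)))
    return out
```

For example, `ADD r0, r1, 5` becomes a `LOADCONST` into a temporary followed by a register-only `ADD`. The cost is one extra instruction per immediate unless a later step folds it back.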

Micah

Hello,

Having separate instruction patterns, one per encoding, is the correct answer. They are not the same instruction, from LLVM's perspective, even though they share the same mnemonic. X86 is doing things the right way for this.

Consider that the instruction printing, instruction selection (isel pattern), and binary encoding will all need to do things differently depending on the types of the operands. Likewise, the scheduling itinerary will almost certainly be different (memory vs. register). That's best handled by having separate instruction definitions.
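As a rough illustration of the point above, here is a small Python model rather than actual TableGen. The opcode and latency values are made up, and the `InstrDef` type and helper names are hypothetical. It shows two definitions that share one printed mnemonic while differing in internal name, encoding, and scheduling cost, so the assembly output still reads "add" for both forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrDef:
    name: str        # unique internal name, e.g. ADDrr
    mnemonic: str    # text the assembly printer emits
    operands: tuple  # operand kinds: "reg" or "mem"
    opcode: int      # hypothetical binary opcode
    latency: int     # hypothetical scheduling latency in cycles


DEFS = [
    InstrDef("ADDrr", "add", ("reg", "reg"), 0x01, 1),
    InstrDef("ADDmr", "add", ("mem", "reg"), 0x02, 3),
]

# Look definitions up by (mnemonic, operand kinds).
BY_FORM = {(d.mnemonic, d.operands): d for d in DEFS}


def operand_kind(text):
    """Classify an operand: '@R1' is a memory reference, 'R1' a register."""
    return "mem" if text.startswith("@") else "reg"


def select(mnemonic, operands):
    """Pick the definition that matches the operand forms."""
    kinds = tuple(operand_kind(op) for op in operands)
    return BY_FORM[(mnemonic.lower(), kinds)]


def print_asm(defn, operands):
    """Print with the shared mnemonic, not the internal name."""
    return f"{defn.mnemonic} " + ", ".join(operands)
```

With this model, `select("Add", ["R1", "R2"])` and `select("Add", ["@R1", "R2"])` return different definitions, but `print_asm` emits the same mnemonic for both. The suffixes exist only in the internal names.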

-Jim

"Villmow, Micah" <Micah.Villmow@amd.com> writes:

> I have this same problem in our backend. I solve it by adding a pseudo
> instruction at instruction selection that transforms @R1 into R1, so
> only a single pattern is required. I then can propagate the pseudo
> instruction after instruction selection.

What's the rationale behind this approach? It seems a bit clumsy to me.
An instruction with varying addressing modes is not a single
instruction. They have different encodings, for starters. Defining
separate patterns for them is the "clean" LLVM approach. I would think
your approach adds the danger of the pseudo-instruction getting lost at
some point.

If the redundancy in specifying patterns for different addressing modes
is the problem, I have some stuff to submit that helps with that. It
will probably take quite a bit of patch churn before it makes it to
trunk, however.

-Dave

From: David A. Greene [mailto:greened@obbligato.org]
Sent: Friday, January 21, 2011 2:59 PM
To: Villmow, Micah
Cc: Lu Mitnick; llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] Question about porting LLVM - a single
instruction op mnemonic with multiple operand forms

"Villmow, Micah" <Micah.Villmow@amd.com> writes:

> I have this same problem in our backend. I solve it by adding a pseudo
> instruction at instruction selection that transforms @R1 into R1, so
> only a single pattern is required. I then can propagate the pseudo
> instruction after instruction selection.

What's the rationale behind this approach? It seems a bit clumsy to me.
An instruction with varying addressing modes is not a single
instruction. They have different encodings, for starters. Defining
separate patterns for them is the "clean" LLVM approach. I would think
your approach adds the danger of the pseudo-instruction getting lost at
some point.

[Villmow, Micah] It isn't that I have different addressing modes. There is
only a single instruction, but it accepts several different types of input
arguments. The pseudo instructions that I use simplify the patterns: I only
need to write patterns for registers, not for registers, immediates,
addresses, and every other operand type. So, instead of having to write
patterns like ADDrr, ADDri, ADDrs, etc. for 16 different register classes,
I write a single ADD multiclass pattern that takes only registers as
inputs. Fortunately, a separate compiler consumes my output and generates
the binary, and it removes any remaining pseudo instructions.
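The saving described above is simple arithmetic, sketched here in Python. The three operand forms are the ones named in the message (the real list is longer, per the "etc."). The register class names `RC0` to `RC15` and the `ADDrr_RC0` naming scheme are hypothetical placeholders.

```python
from itertools import product

OPERAND_FORMS = ["rr", "ri", "rs"]           # forms named in the message
REG_CLASSES = [f"RC{i}" for i in range(16)]  # placeholder class names


def expand(base, forms, reg_classes):
    """List one definition name per (operand form, register class) pair."""
    return [f"{base}{form}_{rc}" for form, rc in product(forms, reg_classes)]


# Every operand form for every register class.
all_forms = expand("ADD", OPERAND_FORMS, REG_CLASSES)

# Register-only inputs: one form per register class.
reg_only = expand("ADD", ["rr"], REG_CLASSES)
```

Here `all_forms` holds 48 names and `reg_only` holds 16, and the gap widens with each additional operand form.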

Micah