LLVM IR, Instructions, Backend, AsmPrinter

Dear all,

I am new to LLVM and have very little idea how to approach a problem that is my university project.
I am supposed to assume an X86 CPU that supports an instruction, ADDenc, which adds two encrypted operands. The original ADD also exists and should operate on unencrypted operands.

My task is to compile C programs into the new X86 assembly that supports ADDenc, and I am unsure where to start.

The following approaches have been suggested to me:

  1. Add a new instruction, ADDenc, to the current X86 LLVM backend and make the necessary changes there.
  2. Add a new LLVM IR instruction, addenc, that marks the encrypted operands at the IR level instead of using the general LLVM add, then add an instruction to the X86 target so that the LLVM addenc is lowered to the X86 ADDenc.

I have been given an LLVM pass that can run on LLVM IR and decide which are the encrypted operands.

Any help would be appreciated, as I know very little about LLVM.

There is a third option: if you have an assembler that understands the new ADDenc instruction, you can probably add inline assembly code that performs the ADDenc instruction. This will only work if you’re doing ahead-of-time compilation and clang is configured to use your new assembler, but if you’re transforming C code, that is most likely what you’re doing.

That said, if you’re sufficiently confident working with the LLVM code generator, I think you should add support for the ADDenc instruction in the X86 backend. As to whether you should add an intrinsic or modify the backend so that it figures out where to use ADDenc on its own, I’m guessing that adding an intrinsic would be better, but people more familiar with the code generator infrastructure should comment.

Regards,

John Criswell
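As an illustration of the inline assembly option, here is a minimal C sketch of what such a wrapper might look like. It is not taken from the thread: the function name add_enc and the macro ADDENC_MNEMONIC are hypothetical, and because no stock assembler knows ADDenc, the ordinary addl instruction is used as a stand-in. It also assumes that ADDenc would take the same two-operand form as ADD, which the project may define differently.

```c
/* Hypothetical: with an assembler that understands the new instruction,
   this would be the ADDenc mnemonic. "addl" is a stand-in so that the
   sketch builds with an unmodified toolchain. */
#define ADDENC_MNEMONIC "addl"

/* Adds b to a using GCC-style extended inline assembly (AT&T syntax). */
static int add_enc(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    /* "+r"(a): a is both read and written, in a register the compiler picks.
       "r"(b) : b is an input in any general-purpose register.
       "cc"   : the instruction modifies the condition flags. */
    __asm__(ADDENC_MNEMONIC " %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    /* Non-X86 fallback so the sketch still compiles elsewhere. */
    return a + b;
#endif
}
```

With the stand-in mnemonic, add_enc(2, 3) behaves like a plain addition. Swapping the macro for the real mnemonic would only work once clang is pointed at an assembler that accepts it.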

Dear John,

Since the pass that I have can transform the IR, I was thinking of inserting inline assembly code using module asm in LLVM.

That is, replacing the LLVM IR instruction %tmp4 = add nsw i32 %tmp2, %tmp3 with something like module asm "ADDenc dst,src1,src2" (an instruction in the new X86 architecture).

But here I am not sure how registers would be allocated to tmp4, tmp2 and tmp3. Can you comment?

Regards,

Pratik

I believe what you want to do is first create an InlineAsm object in your Module that represents the inline assembly code that you want to execute. You then create a CallInst that “calls” the inline assembly code with the desired inputs. See .

That said, I think “module asm” is for inline assembly code that does not belong to a function, so what you want to create is “regular” inline assembly code.

Yes, this is (more or less) what you want to do. I believe you first create an InlineAsm object, which acts like an LLVM function: you specify a function type (which describes its inputs and return value) and an inline assembly constraint string that describes whether each operand needs to be a pointer to memory, a register, etc. You then create a CallInst in which the called value is the InlineAsm object and the arguments are the LLVM values that you pass into the inline assembly code (in this case, %tmp2 and %tmp3). The CallInst object itself will be the result of the inline assembly code.

LLVM’s inline assembly closely follows GCC’s and uses the same constraint codes. First learn how to use GCC inline assembly (if you don’t know it already), and then understanding LLVM’s will be straightforward.

Regards,

John Criswell
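On the register allocation question, the constraint string is what lets the compiler choose registers, so none are named by hand. The C sketch below shows GCC-style constraints for a register operand and a memory operand. The names add_enc_reg, add_enc_mem and ADDENC_MNEMONIC are hypothetical, addl again stands in for ADDenc, and the LLVM IR in the comment is only an approximation of what clang would produce.

```c
/* Stand-in mnemonic: ADDenc is hypothetical and unknown to stock assemblers. */
#define ADDENC_MNEMONIC "addl"

/* Register form: result = src1 + src2.
     "=r"(result): output in a register chosen by the compiler
     "0"(src1)   : input tied to the same register as operand 0
     "r"(src2)   : input in any general-purpose register
   In LLVM IR this corresponds roughly to a call of an inline asm value:
     %tmp4 = call i32 asm "addl $2, $0", "=r,0,r,~{cc}"(i32 %tmp2, i32 %tmp3)
   (clang may append further target-specific clobbers). */
static int add_enc_reg(int src1, int src2)
{
    int result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__(ADDENC_MNEMONIC " %2, %0"
            : "=r"(result)
            : "0"(src1), "r"(src2)
            : "cc");
#else
    result = src1 + src2; /* non-X86 fallback */
#endif
    return result;
}

/* Memory form: the second operand is read directly from memory ("m"). */
static int add_enc_mem(int src1, const int *src2)
{
    int result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__(ADDENC_MNEMONIC " %2, %0"
            : "=r"(result)
            : "0"(src1), "m"(*src2)
            : "cc");
#else
    result = src1 + *src2; /* non-X86 fallback */
#endif
    return result;
}
```

In both functions the template only refers to operands by number. The compiler substitutes the registers or memory references it allocated, which is the same role the constraint string plays for an InlineAsm call in LLVM IR.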