Bind an LLVM variable to a CPU register

I have a constant parameter in an LLVM function. Is there a way to
reserve a CPU register so that it holds the value of the parameter
in LLVM's x86 codegen?

Thanks

Xin

Put the "register" keyword on that variable.

E.g.,
register int i;
...

Whether it will be register allocated is totally up to the compiler, with respect to the rest of the program.

Chuck

Not really, no.


If you really, really, wanted to do it, you could:

1) Hack the code generator to not use that register. It might be as simple as modifying the TableGen file to not know that the register exists.
2) Use inline asm to put the constant into that register and fetch it from that register.

The real question is: what larger goal are you trying to accomplish? Holding a constant value in a register might not be the best way to do what you're doing.

-- John T.

> E.g.,
> register int i;
> ...
>
> Whether it will be register allocated is totally up to the compiler,
> with respect to the rest of the program.

Yeah, e.g. clang just ignores the "register" keyword :)

> Not really, no.
>
> If you really, really, wanted to do it, you could:
>
> 1) Hack the code generator to not use that register. It might be as simple
> as modifying the TableGen file to not know that the register exists.
> 2) Use inline asm to put the constant into that register and fetch it from
> that register.
>
> The real question is: what larger goal are you trying to accomplish?
> Holding a constant value in a register might not be the best way to do what
> you're doing.

I am using LLVM as the code generator for my system emulator. I need
one register to hold the address of the CPUState struct. Most of the
emulation code is generated by LLVM; a small amount is not (it is
typically binary patched). I must know which register holds the
CPUState pointer when I am patching code.

Xin

> Not really, no.
>
> If you really, really, wanted to do it, you could:
>
> 1) Hack the code generator to not use that register. It might be as simple
> as modifying the TableGen file to not know that the register exists.

In fact this is what I am thinking at this point, but I want to keep
the modifications to LLVM as small as possible. There is no existing
API to disable the use of a machine register, is there?


Or is there a way I can attach a register dependence (a
pre-instruction and/or post-instruction dependence) to an LLVM IR node
before I send it to the code generator?

Is there just one CPUState struct for the entire simulator, or are there multiple CPUState structures (e.g., one per simulated CPU) and you want to keep them handy because you use them all the time?

If it's the former, why not just use a global variable?

If it's the latter, there's probably a few options:

1) Use a thread-local global variable. Each simulated process is a thread.

2) Store the CPUState struct at the bottom or top of the stack. To get a pointer to it, mask some bits off the stack pointer using inline asm code. The Linux kernel uses this hack to find the process structure of the user-space process running on the current CPU.

3) Change the code generator. I don't work with the code generator, so I don't know how involved the change is, but I suspect it's pretty small.
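
For what it's worth, option 2 can be sketched in plain C. This is only an illustration under assumptions: the names CPUState, STACK_SIZE and cpu_from_address are made up, the region is heap-allocated with aligned_alloc rather than being a real thread stack, and an ordinary interior pointer stands in for the stack pointer that real code would read with inline asm.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical size of each per-CPU stack region; must be a power of two. */
#define STACK_SIZE ((uintptr_t)16384)

/* Hypothetical per-CPU state; a real emulator's struct would be larger. */
typedef struct CPUState {
    int cpu_index;
    uint64_t regs[16];
} CPUState;

/* Allocate a STACK_SIZE-aligned region and place the CPUState at its base. */
CPUState *stack_region_create(int cpu_index)
{
    CPUState *cpu = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (cpu != NULL)
        cpu->cpu_index = cpu_index;
    return cpu;
}

/* Recover the CPUState from any address inside the region by clearing
 * the low bits of that address. */
CPUState *cpu_from_address(const void *addr)
{
    return (CPUState *)((uintptr_t)addr & ~(STACK_SIZE - 1));
}

/* Returns 0 if a lookup from an interior address finds the right struct. */
int stack_mask_demo(void)
{
    CPUState *cpu = stack_region_create(3);
    if (cpu == NULL)
        return -1;

    /* An address near the top of the region stands in for the stack pointer. */
    char *interior = (char *)cpu + STACK_SIZE - 64;
    CPUState *found = cpu_from_address(interior);

    int ok = (found == cpu) && (found->cpu_index == 3);
    free(cpu);
    return ok ? 0 : 1;
}
```

The lookup costs one AND on a value that is already in a register, so no dedicated register is needed, but every stack then has to be allocated at a STACK_SIZE-aligned address.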

-- John T.


> Is there just one CPUState struct for the entire simulator, or are there
> multiple CPUState structures (e.g., one per simulated CPU) and you want to
> keep them handy because you use them all the time?
>
> If it's the former, why not just use a global variable?
>
> If it's the latter, there's probably a few options:

There is more than one emulated CPU.

> 1) Use a thread-local global variable. Each simulated process is a thread.
>
> 2) Store the CPUState struct at the bottom or top of the stack. To get a
> pointer to it, mask some bits off the stack pointer using inline asm code.
> The Linux kernel uses this hack to find the process structure of the
> user-space process running on the current CPU.

These two are definitely possible. But the CPU pointer is used
extensively throughout the JITed emulation code, and every use might
need a mov into a register on x86.

> 3) Change the code generator. I don't work with the code generator, so I
> don't know how involved the change is, but I suspect it's pretty small.

I think I will go this way. I agree with you that the changes might
be small.

> > 3) Change the code generator. I don't work with the code generator, so I
> > don't know how involved the change is, but I suspect it's pretty small.
>
> I think i will go this way. i would agree with you that the changes
> might be small.

You can take a look at X86RegisterInfo::getReservedRegs
(lib/Target/X86/X86RegisterInfo.cpp). ;)
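
To illustrate the idea only (this is not LLVM's code): the register allocator consults a set of reserved registers and never hands those out. The toy model below uses hypothetical names (RegMask, reserved_regs, allocate_reg) and an arbitrary choice of R15 for the CPUState pointer. The real getReservedRegs returns a BitVector and also has to account for sub-registers.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file, loosely modelled on the x86-64 general-purpose set. */
enum {
    REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
    REG_R8,  REG_R9,  REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15,
    NUM_REGS
};

typedef uint32_t RegMask; /* one bit per register */

/* Build the reserved set. The stack pointer is always reserved; R15 is
 * reserved only when the caller wants it pinned for the CPUState pointer. */
RegMask reserved_regs(int reserve_r15)
{
    RegMask reserved = 1u << REG_RSP;
    if (reserve_r15)
        reserved |= 1u << REG_R15;
    return reserved;
}

/* Pick the lowest-numbered register that is neither reserved nor in use.
 * Returns the register number, or -1 if none is left. */
int allocate_reg(RegMask reserved, RegMask *in_use)
{
    for (int r = 0; r < NUM_REGS; ++r) {
        RegMask bit = 1u << r;
        if (!(reserved & bit) && !(*in_use & bit)) {
            *in_use |= bit;
            return r;
        }
    }
    return -1;
}

/* Count how many registers the allocator can hand out in total. */
int count_allocatable(RegMask reserved)
{
    RegMask in_use = 0;
    int n = 0;
    while (allocate_reg(reserved, &in_use) >= 0)
        ++n;
    return n;
}

/* Return 1 if exhaustive allocation ever yields `reg`, otherwise 0. */
int ever_allocates(RegMask reserved, int reg)
{
    RegMask in_use = 0;
    int r;
    while ((r = allocate_reg(reserved, &in_use)) >= 0)
        if (r == reg)
            return 1;
    return 0;
}
```

In this model, reserving R15 simply removes it from the pool, so generated code never clobbers it and the patched code can rely on it. The cost is one fewer register for everything else.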

Regards,
chenwj

John Criswell wrote:

> The real question is: what larger goal are you trying to accomplish?
> Holding a constant value in a register might not be the best way to do
> what you're doing.

Not meaning to hijack the thread, but I'd encountered this issue when
considering how to implement a two-stack Forth-like language
interpreter with LLVM.

It has been common for an old-school Forth implementation to keep the
data stack pointer in a register; and some implementations additionally
kept the top-of-stack value in its own register.

As for the TOS-in-register optimization, I've no doubt it would be far
preferable to allow the LLVM optimizer to make such decisions --
at least assuming we're building a "subroutine threaded" implementation
where compiled definitions consist of actual IR instructions for LLVM
to optimize, rather than lists of addresses for the interpreter to
fetch and call.

But regarding the data stack pointer itself, it does feel a bit weird
to be passing the pointer around as an 'argument' to every primitive.
(And putting the data stack pointer in a global would impose a single-
threaded model on the interpreter.)

That said, I suspect that if one were to design an interpreter for
such a language with LLVM's strengths in mind -- such as ability to
inline primitives combined with IR optimization passes -- that having
to pass the data stack pointer as an 'argument' to all primitives
wouldn't amount to an actual issue of concern.
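
A minimal sketch of that design, written in C rather than LLVM IR and using made-up names (cell, prim_add, word_square_sum): each primitive takes the data stack pointer as an argument and returns the updated pointer, and a compiled definition is just a sequence of calls that an optimizer is free to inline. The upward-growing stack and the lack of bounds checks are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef long cell; /* one data stack slot */

/* Convention assumed here: sp points one past the top-of-stack cell,
 * and the stack grows upward. */

/* Push a literal. */
static inline cell *prim_lit(cell *sp, cell v) { sp[0] = v; return sp + 1; }

/* ( a b -- a+b ) */
static inline cell *prim_add(cell *sp) { sp[-2] = sp[-2] + sp[-1]; return sp - 1; }

/* ( a -- a a ) */
static inline cell *prim_dup(cell *sp) { sp[0] = sp[-1]; return sp + 1; }

/* ( a b -- a*b ) */
static inline cell *prim_mul(cell *sp) { sp[-2] = sp[-2] * sp[-1]; return sp - 1; }

/* Compiled form of the Forth definition  : SQUARE-SUM  + DUP * ;
 * The stack pointer is threaded through every primitive. */
cell *word_square_sum(cell *sp)
{
    sp = prim_add(sp);
    sp = prim_dup(sp);
    sp = prim_mul(sp);
    return sp;
}

/* Driver: push two values, run the word, and return the result on top. */
cell run_square_sum(cell a, cell b)
{
    cell stack[8];
    cell *sp = stack;
    sp = prim_lit(sp, a);
    sp = prim_lit(sp, b);
    sp = word_square_sum(sp);
    return sp[-1];
}
```

Once the primitives are inlined, the pointer arithmetic is visible to the optimizer within a single function, which is the situation described above where passing the pointer around should cost little.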

However, if one were attempting to implement a more traditional
Forth-style interpreter such as a direct threaded VM, where LLVM would
be unable to optimize the IR beyond the level of each individual
non-inlineable primitive, then I figure not being able to dedicate a
register to the data stack pointer would likely be significant.

(Although why someone would want to implement the latter in LLVM
apart from as an exercise, I don't know.)

I still hope to get around to trying the former someday: inlineable
primitives, optimization, and passing the data stack as an argument,
and see what the optimized IR looks like.

Regards,

Bill