Run-time optimization using profile data

Hello:

I am new to LLVM. I am looking for an example, or a
walkthrough/guide, on how to do runtime optimization using LLVM. Ideally, I
would like to:

1. Compile the program from C to LLVM or native with LLVM information
embedded in the binary.
2. Run the binary under LLVM's interpreter, and profile the data. I hope
LLVM has support for all this, and I don't have to insert my own
instructions for profiling.
3. A callback that gets called when a function or a basic block gets hot.
Ideally, I would like to transform this basic block and connected ones in
the graph. So ideally, I would like my function to get called if a trace or
a collection of basic blocks gets hot.
4. I would like to do some transformations in the IR, i.e. LLVM->LLVM
transforms on the aforementioned hot regions/blocks.
5. I would then like to have control over the JIT as well. In the above
LLVM->LLVM transform, I would have placed some "special" instructions (like
maybe illegal opcodes, or something like that), and once the JIT is about to
translate those, my routine should get called. I will then transform those
instructions into "special" native instructions.
6. I then want to execute the newly written binary and remove profile
instrumentation, but leave my special native instructions intact.

I would like to know if there is such an example in the LLVM package. If
not, where in the cpp files should I begin hacking to do each of these
steps.

Regards,

Hello:

I am new to LLVM. I am looking for an example, or a
walkthrough/guide, on how to do runtime optimization using LLVM. Ideally, I
would like to:

1. Compile the program from C to LLVM or native with LLVM information
embedded in the binary.
2. Run the binary under LLVM's interpreter, and profile the data. I hope
LLVM has support for all this, and I don't have to insert my own
instructions for profiling.

That isn't too hard. It doesn't see much use, but there is some
code for this in the LLVM tree. See
llvm/lib/Transforms/Instrumentation, and the related runtime support
in llvm/runtime/libprofile.
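
For readers unfamiliar with counter-based profiling, here is a minimal sketch of what block-level instrumentation amounts to. It is plain C++ and does not use the LLVM instrumentation passes or libprofile; the names BlockCounters, hit, and sumTo are hypothetical, and the calls to hit() are written by hand where a pass would insert them automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the counter table an instrumentation pass
// would allocate: one slot per basic block.
struct BlockCounters {
    std::vector<std::uint64_t> counts;

    explicit BlockCounters(std::size_t numBlocks) : counts(numBlocks, 0) {}

    // What an inserted "increment counter N" instruction would do.
    void hit(std::size_t blockId) {
        assert(blockId < counts.size());
        ++counts[blockId];
    }
};

// A hand-instrumented function with three blocks:
// 0 = entry, 1 = loop body, 2 = exit.
int sumTo(int n, BlockCounters &prof) {
    prof.hit(0);
    int sum = 0;
    for (int i = 1; i <= n; ++i) {
        prof.hit(1);
        sum += i;
    }
    prof.hit(2);
    return sum;
}
```

After a run, the counts vector is the profile: the loop-body slot grows with the trip count, while the entry and exit slots grow with the number of calls.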

3. A callback that gets called when a function or a basic block gets hot.
Ideally, I would like to transform this basic block and connected ones in
the graph. So ideally, I would like my function to get called if a trace or
a collection of basic blocks gets hot.

4. I would like to do some transformations in the IR. i.e. from LLVM->LLVM
transforms on the aforementioned hot regions/blocks.

There isn't really any existing code for this, but there isn't
anything fundamentally tricky about it; it's just a matter of writing
code to analyze the profile data and a transformation pass that uses
the data.
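
As a rough illustration of the analysis half, the sketch below picks a hot region out of per-block counts: starting from a seed block, it collects every block reachable through blocks whose count meets a threshold. ToyCFG and hotRegion are hypothetical names on a toy control-flow graph, not LLVM classes; a real pass would walk the function's basic blocks and feed the resulting region to the transformation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Toy control-flow graph: successors[b] lists the blocks that block b
// can branch to, and counts[b] is the profiled execution count of b.
struct ToyCFG {
    std::vector<std::vector<std::size_t>> successors;
    std::vector<std::uint64_t> counts;
};

// Breadth-first walk from `seed` that only follows edges into blocks
// whose count is at least `threshold`. Returns the blocks in visit
// order, or an empty vector if the seed itself is not hot.
std::vector<std::size_t> hotRegion(const ToyCFG &cfg, std::size_t seed,
                                   std::uint64_t threshold) {
    assert(cfg.successors.size() == cfg.counts.size());
    std::vector<std::size_t> region;
    if (seed >= cfg.counts.size() || cfg.counts[seed] < threshold)
        return region;

    std::vector<bool> seen(cfg.counts.size(), false);
    std::queue<std::size_t> work;
    work.push(seed);
    seen[seed] = true;

    while (!work.empty()) {
        std::size_t block = work.front();
        work.pop();
        region.push_back(block);
        for (std::size_t succ : cfg.successors[block]) {
            if (succ < cfg.counts.size() && !seen[succ] &&
                cfg.counts[succ] >= threshold) {
                seen[succ] = true;
                work.push(succ);
            }
        }
    }
    return region;
}
```

The fixed threshold is only one possible policy; a fraction of the total count, or a trace-style selection along the most frequent successor, would fit in the same place.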

5. I would then like to have control over the JIT as well. In the above
LLVM->LLVM transform, I would have placed some "special" instructions (like
maybe illegal opcodes, or something like that), and once the JIT is about to
translate those, my routine should get called. I will then transform those
instructions into "special" native instructions.

You want to have the JIT recompile the code as it's running? The JIT
doesn't currently support that, and it's relatively tricky to write.

Messing with code generation to add platform-specific intrinsics isn't
too hard; you'd need to provide more details to say anything specific,
though.

6. I then want to execute the newly written binary and remove profile
instrumentation, but leave my special native instructions intact.

Stripping out the instrumentation should be a very simple transformation pass.
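
A minimal sketch of the idea, on a toy instruction list rather than real LLVM IR: the instrumentation step marks the instructions it inserts, and the stripping step erases exactly those, leaving everything else (including any special instructions) untouched. ToyInst, isProfiling, and stripProfiling are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: a textual form plus a marker that the (hypothetical)
// instrumentation step sets on every instruction it inserts.
struct ToyInst {
    std::string text;
    bool isProfiling;
};

// Erase every instruction flagged as profiling and return how many
// were removed. The relative order of the remaining ones is preserved.
std::size_t stripProfiling(std::vector<ToyInst> &block) {
    auto firstRemoved = std::remove_if(
        block.begin(), block.end(),
        [](const ToyInst &inst) { return inst.isProfiling; });
    std::size_t removed =
        static_cast<std::size_t>(block.end() - firstRemoved);
    block.erase(firstRemoved, block.end());
    return removed;
}
```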

-Eli

5. I would then like to have control over the JIT as well. In the above
LLVM->LLVM transform, I would have placed some "special" instructions (like
maybe illegal opcodes, or something like that), and once the JIT is about to
translate those, my routine should get called. I will then transform those
instructions into "special" native instructions.

You want to have the JIT recompile the code as it's running? The JIT
doesn't currently support that, and it's relatively tricky to write.

You are not the first JIT user who wants something like this. There is definite interest in having an alternative callback mechanism. I think there are two issues:

1. Replace the default lazy compilation callback, e.g. see X86JITInfo.cpp:
static TargetJITInfo::JITCompilerFn JITCompilerFunction

with a custom one that can run dynamic optimization passes which make use of the profile data.

2. Add custom logic to determine when a call instruction is replaced with a call to a stub which would in turn invoke the compilation callback.
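
The two points above can be sketched together in plain C++, without any JIT: a stub sits in front of a function, counts calls, and once a threshold is reached invokes a user-supplied callback that returns the replacement to call from then on. This is an illustration of the control flow only, under the assumption that both versions already exist as compiled functions; HotStub, Recompile, slowDouble, and fastDouble are hypothetical names, and nothing here is the TargetJITInfo interface.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

using Fn = int (*)(int);

// Hypothetical dispatcher standing in for a JIT stub.
class HotStub {
public:
    // Callback invoked once when the function becomes hot. It receives
    // the current target and returns the one to use afterwards.
    using Recompile = std::function<Fn(Fn)>;

    HotStub(Fn initial, std::uint64_t threshold, Recompile onHot)
        : target_(initial), threshold_(threshold), onHot_(std::move(onHot)) {
        assert(initial != nullptr);
    }

    int call(int arg) {
        // Custom "is it hot yet?" logic: a plain call counter.
        if (!promoted_ && ++calls_ >= threshold_) {
            target_ = onHot_(target_);
            promoted_ = true;
        }
        return target_(arg);
    }

    bool promoted() const { return promoted_; }

private:
    Fn target_;
    std::uint64_t threshold_;
    Recompile onHot_;
    std::uint64_t calls_ = 0;
    bool promoted_ = false;
};

// Two versions of the same function: an unoptimized one and the one
// the callback would produce after running optimization passes.
inline int slowDouble(int x) {
    int result = 0;
    for (int i = 0; i < 2; ++i) result += x;
    return result;
}
inline int fastDouble(int x) { return x * 2; }
```

In this sketch the callback fires on the call that reaches the threshold, before the target runs, so that call already uses the replacement. In a real JIT the callback would be where the dynamic optimization passes run over the function's IR.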

Evan