Dear LLVM community:
We would like to bring up a discussion about JITing and LLVM's cl::opt
mechanism.
LLVM has global configuration options (cl::opt) that affect how the pass pipelines are configured by default. Reading the code, the likely assumption was that an option is set once and remains constant throughout compilation.
This assumption presumably comes from the setting where the compiler is used as a driver, for a single target backend.
As the field of compilation evolves, we increasingly see JIT use cases, where compilation is embedded in a long-running process.
Example projects that use LLVM in this way include Numba, PyTorch, Julia, Taichi, TVM, and possibly more that I cannot list here. The assumption that cl::opt
is configured globally likely still holds for a default setting.
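For context, here is roughly how such an option is declared inside LLVM (paraphrased from LoopUnrollPass.cpp; the exact description string may differ). The key point is that the storage is a file-scope static, so there is exactly one value per process:

  #include "llvm/Support/CommandLine.h"
  using namespace llvm;

  // Paraphrased declaration: a file-scope static cl::opt, so every pass
  // pipeline built in this process reads the same single value.
  static cl::opt<unsigned> UnrollMaxCount(
      "unroll-max-count", cl::Hidden,
      cl::desc("Set the max unroll count for partial and runtime unrolling"));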
But things become more interesting as we compose things for heterogeneous environments, e.g. GPUs. Compilation for GPU usually involves two pipelines: one for the host (e.g. x86) code that drives the computation, and one for the device (e.g. nvptx) code.
Sometimes both paths reuse the same passes (e.g. loop unrolling), and the end results then need to be stitched together, all in the same process. This becomes problematic if we decide that the GPU path and the host path should use different global configurations, e.g. unroll-max-count
should be 100
for the GPU but remain 0
for the host.
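A minimal sketch of how the conflict shows up in a long-running process (the helper function name is made up for illustration): once the flag is set for the GPU path, every pipeline built afterwards in the same process sees the new value, because there is only one underlying cl::opt.

  #include "llvm/Support/CommandLine.h"

  // Hypothetical helper, for illustration only.
  void ConfigureForGPU() {
    // Parse a synthetic command line to set the global flag for the GPU path.
    const char *Args[] = {"jit", "-unroll-max-count=100"};
    llvm::cl::ParseCommandLineOptions(2, Args);
    // From here on, the host (x86) pipeline built in this same process also
    // sees unroll-max-count == 100 -- there is no per-pipeline copy.
  }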
We brought up this discussion in the TVM community.
A0: One possible solution would be to not use LLVM as a JIT engine at all, but use it as a one-time CLI and reload LLVM for each compilation. This way the global cl::opt
values get reset per compilation invocation, but it of course defeats the purpose of using LLVM as a JIT library.
A1: The ideal solution is likely an API that can quickly pull out a default pipeline while still being able to reconfigure certain passes independently of the global cl::opt
. I have limited understanding of the internals, but my quick reading of the code suggests the two are a bit intertwined (I am not an expert here and would love to see suggestions).
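To make A1 concrete, here is a purely hypothetical sketch of what such an API could look like. PassBuilder and buildPerModuleDefaultPipeline exist today, but the PipelineOverrides type and the extra argument are made up for illustration:

  #include "llvm/Passes/PassBuilder.h"
  #include <map>
  #include <string>

  // Hypothetical per-pipeline override set; nothing like this exists in LLVM
  // today. The idea is that the default pipeline builder would consult these
  // values instead of the process-global cl::opt storage.
  struct PipelineOverrides {
    std::map<std::string, int> IntKnobs;
    void SetInt(const std::string &Name, int Value) { IntKnobs[Name] = Value; }
  };

  void BuildPipelines(llvm::PassBuilder &PB) {
    PipelineOverrides GPUOverrides;
    GPUOverrides.SetInt("unroll-max-count", 100);  // GPU-only setting
    // Imagined extension (not a real overload today):
    //   PB.buildPerModuleDefaultPipeline(llvm::OptimizationLevel::O2, GPUOverrides);
    llvm::ModulePassManager HostMPM =
        PB.buildPerModuleDefaultPipeline(llvm::OptimizationLevel::O2);
    (void)HostMPM;  // the host pipeline stays on the defaults
  }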
A2: Our last solution, a workaround for this particular problem, is to record the cl::opt
values when entering an RAII scope and restore them when exiting the scope, so different compilation pipelines can have different defaults for the CPU and GPU paths.
It works as follows:
void MyCodegen() {
  {
    With<LLVMCLOption<int>> scope("unroll-max-count", 2);
    // unroll-max-count set to 2 here, pipeline 1
    {
      With<LLVMCLOption<int>> scope("unroll-max-count", 3);
      // unroll-max-count set to 3 here
    }
    // unroll-max-count restored to 2 here
  }
  // global option restored to its default.
}
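For completeness, here is a minimal sketch (ours, not the exact TVM implementation) of how such a guard can be built on top of cl::getRegisteredOptions(), assuming the option in question was registered as a cl::opt<int>:

  #include <cassert>
  #include "llvm/Support/CommandLine.h"

  // Minimal RAII guard sketch: look the option up by name, save the old
  // value, set the new one, and restore the old value on scope exit.
  class ScopedLLVMCLOption {
   public:
    ScopedLLVMCLOption(const char *Name, int NewValue) {
      auto &Opts = llvm::cl::getRegisteredOptions();
      auto It = Opts.find(Name);
      assert(It != Opts.end() && "unknown cl::opt name");
      Opt_ = static_cast<llvm::cl::opt<int> *>(It->second);
      OldValue_ = Opt_->getValue();
      Opt_->setValue(NewValue);
    }
    ~ScopedLLVMCLOption() { Opt_->setValue(OldValue_); }

   private:
    llvm::cl::opt<int> *Opt_ = nullptr;
    int OldValue_ = 0;
  };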
We would like to bring this discussion to the LLVM community, as JITing for heterogeneous computation is only going to become more popular. This issue is also likely to be faced by other packages when they attempt to configure target-specific pipelines differently.
It would be great to get the LLVM community's thoughts on this matter.
Thanks