Optimization - Converting Globals to Constants

I'm working on the implementation of a high-performance financial trading simulation system. The simulations are CPU bound so faster is always better. I'm trying to determine the optimal architecture.

So far, I'm very pleased with LLVM. I've been able to get our Basic-flavor scripting language for defining the simulation rules to perform on par with equivalent code written in C++. So that means that we're going to be using LLVM for the product. Most of the simulation system will be written in C++ and compiled with Clang, and the result will then be linked with the user-written simulation scripts.

Simulations are generally many different tests of the same rule sets with slightly different parameters. You might run thousands, tens of thousands, or even millions of different combinations of parameters. A typical test might run different combinations of three or four parameters varying over a range of values. For example, you might have parameter A vary from 1 to 50, parameter B vary from 10 to 100 in steps of 10, and parameter C vary from 15 to 20, for a total of 3,000 different tests (50 x 10 x 6). For large tests you can use different types of optimization to narrow down the parameter space (genetic algorithms etc.) but no matter what there will be a lot of tests.
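
To make the arithmetic concrete, here is a minimal sketch of how the size of a test series follows from the swept ranges. The ParamRange struct and both function names are hypothetical, and it assumes integer-valued, inclusive ranges like the ones in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one swept parameter: an inclusive integer range.
struct ParamRange {
  long first;
  long last;
  long step;
};

// Number of values the parameter takes, counting both endpoints.
std::size_t countValues(const ParamRange &r) {
  assert(r.step > 0 && r.last >= r.first);
  return static_cast<std::size_t>((r.last - r.first) / r.step) + 1;
}

// Size of the full grid: the product of the per-parameter counts.
std::size_t totalTests(const std::vector<ParamRange> &grid) {
  std::size_t total = 1;
  for (const ParamRange &r : grid)
    total *= countValues(r);
  return total;
}
```

With A = {1, 50, 1}, B = {10, 100, 10} and C = {15, 20, 1}, the counts are 50, 10 and 6, so totalTests gives the 3,000 tests mentioned above.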

Essentially all of the variables in the simulation can be varied by the user, but typically only a very small subset actually are for any given test series. There might be 50 different variables where 3 or 4 are varied during a test series. Which means that for a given series of tests, most of the variables will have a fixed value, i.e. they will really be constants. Most of them will be floating point constants, in fact. The problem is that the system won't know until runtime which of the 50-odd variables will have a fixed value.

Right now, the parameters that change are defined in the simulation system C++ code as global variables. One example might be a per-share or per-trade commission charge. Typically only one or the other is chosen and then a fixed value is used. So a given test might use $10 per trade, while another might use $0.01 per share. For the duration of a typical test series, that value would be fixed and the same for all the different tests, i.e. if you were running 3,000 tests they would all typically use the same value for the commission charge.

It strikes me that with LLVM it ought to be possible to convert the variables to constants as part of an optimization pass that is run just before starting a simulation, or perhaps with a custom pass that I'll have to write myself. After the variables are converted to constants in the IR, it seems like further optimization could result in considerable speed increases for certain kinds of expression evaluation.

Is there some relatively easy way to do the conversion of global floating point variables to floating point constants after loading the bitcode? Or will I have to write a custom pass to do this? It seems like the Global Variable Optimizer pass should do what I want if I set things up properly. Is that right? If so, what do I need to do to give the variables an initial value and make sure that the Global Variable Optimizer can recognize that these variables can indeed be optimized into constants?

- Curtis

You'll essentially have to do it yourself, I think, but it's not hard:
just call GlobalVariable::setInitializer and
GlobalVariable::setConstant on the globals in question once the
bitcode is loaded, then run instcombine to propagate the constants. I
don't think globalopt has enough information in your situation to do
the optimization in question.


Since you know they are constant, setting them yourself is probably
the easiest route. LLVM should do the conversion on its own if your
program has a main, you run internalize, you write to the globals
before any reads, and you never pass them by address. But since you
are compiling the script yourself, marking them constant at that
point is more dependable. In either case, the less you take the
address of globals, the better the optimizer deals with them.
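
For illustration, this is roughly the source shape those conditions describe. It is only a sketch: the names are made up, runSeries stands in for main, and whether a given LLVM version and pass pipeline actually folds the value is not guaranteed.

```cpp
#include <cassert>

// Internal linkage, which is what internalize would give an ordinary global:
// nothing outside this translation unit can read or write it.
static double commissionPerTrade = 0.0;

// The global is only ever read by value; its address is never taken.
static double netProceeds(double gross, int trades) {
  assert(trades >= 0);
  return gross - commissionPerTrade * trades;
}

// Stands in for main(): one write of a literal, before any read.
double runSeries() {
  commissionPerTrade = 10.0;
  double total = 0.0;
  for (int test = 0; test < 3000; ++test)
    total += netProceeds(1000.0, 2);
  return total;
}
```

Passing &commissionPerTrade to some other function, or writing it a second time partway through the run, would break the pattern and leave the optimizer with much less to work with.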