Preexecution

Coming from Wizer (Question: Why does Wizer work? (sincerely) · Issue #60 · bytecodealliance/wizer · GitHub), I’d like to raise the following question to the LLVM community:

Has preexecution of binaries been considered for the LLVM project, and what is the current state of the art?

As shown by Wizer, it’s possible to get real-world benefits (in this case, e.g. regex matching) for programs compiled with LLVM (in this case: Rust). This surprised me, because LLVM already performs constant folding and dead code elimination, so it is counterintuitive that a tool like Wizer (or wasm-opt) can produce such performance (or size) benefits.
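To make the surprise concrete, here is a minimal Rust sketch (the pattern and the use of the regex and once_cell crates are my illustration, not taken from Wizer itself):

```rust
use once_cell::sync::Lazy; // assumed dependencies: once_cell and regex
use regex::Regex;

// The pattern is a string literal, but compiling it runs at program
// startup: LLVM sees opaque calls into the regex crate plus heap
// allocations, so it cannot fold the compiled automaton into a constant.
// A preexecution tool like Wizer runs this once and snapshots the result.
static RE: Lazy<Regex> = Lazy::new(|| Regex::new(r"^[a-z]+[0-9]{3}$").unwrap());

fn main() {
    assert!(RE.is_match("abc123"));
}
```

LLVM can fold neither the parsing nor the allocations behind the Lazy, so this work is repeated at every startup unless a preexecutor snapshots it.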

For the following considerations, I’d like to restrict my arguments to statically linked programs compiled with all source code available at once.

The main obstacle to more advanced constant folding seems to be that memory allocation is not considered “constant”. If the allocator is known at compile time (say, mimalloc or whatever), it could be possible to perform the allocations at compile time and bake the resulting memory into the binary, like Wizer does.
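A rough sketch of what “baking in” an allocation could mean (the table and its shape are my own illustration): the first function builds data through the allocator at runtime and is opaque to the constant folder, while the second expresses the same data as a constant, which stable Rust can only do here because no heap allocation is involved:

```rust
// Built at runtime through the allocator; opaque to constant folding.
fn build_table() -> Vec<u32> {
    (0u32..256).map(|i| i * i).collect()
}

// What a preexecutor would conceptually emit instead: the same bytes,
// placed in the binary's data segment as a constant.
static BAKED_TABLE: [u32; 256] = {
    let mut t = [0u32; 256];
    let mut i = 0;
    while i < 256 {
        t[i] = (i as u32) * (i as u32);
        i += 1;
    }
    t
};

fn main() {
    assert_eq!(build_table()[16], BAKED_TABLE[16]);
}
```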

Another benefit besides performance is simplicity for languages targeting LLVM. At the moment, each language basically has to implement both an LLVM backend and an interpreter, where the interpreter is used for advanced constant folding/preexecution (Miri in Rust’s case; C++ has constexpr/consteval support) and LLVM produces the final output.
Since LLVM tries to abstract functionality common to many languages, it could (maybe should?) fill this gap by doing preexecution itself.
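Rust makes this split visible on a small example (my illustration): its const evaluator happily folds allocation-free code, but rejects exactly the heap allocations that preexecution would handle:

```rust
// Folded by rustc's const evaluator (an interpreter, not LLVM):
const SUM_1_TO_10: u32 = {
    let mut s = 0;
    let mut i = 1;
    while i <= 10 {
        s += i;
        i += 1;
    }
    s
};

// Rejected today: heap allocation is not a const operation, which is
// exactly the gap a preexecution pass would close.
// const V: Vec<u32> = vec![1, 2, 3]; // error: allocations are not
//                                    // allowed in constants

fn main() {
    assert_eq!(SUM_1_TO_10, 55);
}
```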

Another possible benefit (although this is a bit handwavy) is that more dynamic languages like Python or Ruby could potentially benefit from such an approach, because LLVM would effectively do some of the tree-shaking/initialisation work. (Roughly related: Bytecode Alliance, and also Removing support for Emacs unexec from Glibc [LWN.net], as mentioned by Nick Fitzgerald.)

One might argue that doing memory allocations at compile time could increase binary size considerably, but I’d argue that is basically the same trade-off as -Os vs -O3.
Another consideration is that preexecution might not terminate, but LLVM probably already has many very slow (or even exponential) optimisations that are given budgets/bounds determining “how hard the compiler tries”; preexecution could be bounded the same way.
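To make the budget idea concrete, a hypothetical sketch (this function is my illustration, not an existing LLVM API): the preexecutor interprets an initializer step by step under a fuel counter and falls back to ordinary runtime initialization when the budget runs out:

```rust
// Hypothetical sketch, not an LLVM API: run an initializer under a fuel
// budget, the same way const evaluators bound their interpreters.
fn preexecute<S>(mut state: S, mut step: impl FnMut(&mut S) -> bool, mut fuel: u64) -> Option<S> {
    while fuel > 0 {
        if step(&mut state) {
            return Some(state); // finished: snapshot this state into the binary
        }
        fuel -= 1;
    }
    None // budget exhausted: keep the initializer as runtime code
}

fn main() {
    // Toy initializer: counts to 1000, one unit of fuel per step.
    let done = preexecute(0u32, |n| { *n += 1; *n == 1000 }, 10_000);
    assert_eq!(done, Some(1000));
}
```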

I’d be very interested in the current state-of-the-art research in this area and in any considerations that have been made.

I wouldn’t think that constant folding has much to do with it. The main benefits would come from miscellaneous startup initialization (running all the global ctors and whatnot), which, yes, means a lot of memory allocations. Like in the LWN.net thread about Emacs, where it’s all about a hacky shortcut to set up the initial memory allocator state.

I remember doing this with a monster Pascal program back in the 80s, on an OS where you could (a) start a program, (b) it would stop itself, (c) you could save the stopped process state as an executable file, which (d) would just continue from the stop point when you ran the saved executable file. Multiple seconds of initialization time just disappeared. Modern OSes don’t like to have useful features like that.

Making programs be restartable/continuable like that, in a way that’s both generic and portable, would be hard. I don’t keep up on language research so maybe someone has done it, but I would guess not.

Sure, I think the term “constant folding” is quite stretched when applied to this issue; dynamic memory allocation contrasts strongly with constant folding. However, one can argue that memory allocated at compile time is a constant at runtime and therefore still a constant. That is effectively what Wizer does as well: turning dynamic allocations into constants.

> (d) would just continue from the stop point when you ran the saved executable file

this sounds very interesting indeed :smiley:

Some more data points showing that languages (using LLVM) are exploring these directions:
Zig: use case: comptime allocator · Issue #1291 · ziglang/zig · GitHub
Rust: Heap allocations in constants · Issue #20 · rust-lang/const-eval · GitHub
Circle: circle/meta4.cxx at 9cf42a1c3ab58a2f1fc46775fe2495c29a4f846f · seanbaxter/circle · GitHub
Jai: Demo: Base language, compile-time execution - YouTube (this one is a real stretch, as it executes code that requires user input during compilation)

All of this is different from build systems executing code (though it replaces the need for them). The difference is that with a build system executing code during compilation, the resulting artefacts have to be put into the main program somehow, whereas in the Circle example the JSON input is embedded directly into the C++ file while being converted to C++. In a traditional build step this would involve generating a separate .cpp file and including that.
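As a Rust analogue of that difference (config.json and the extract_timeout helper are hypothetical): the traditional route generates a source file in a build step and includes it, while circle-style embedding runs the conversion during compilation itself; stable Rust today can only embed the raw bytes:

```rust
// Traditional build-step route, sketched as a hypothetical build.rs:
//
//   // build.rs
//   fn main() {
//       let json = std::fs::read_to_string("config.json").unwrap();
//       let timeout = extract_timeout(&json); // hypothetical helper
//       let out_dir = std::env::var("OUT_DIR").unwrap();
//       std::fs::write(
//           format!("{out_dir}/config.rs"),
//           format!("pub const TIMEOUT_MS: u64 = {timeout};"),
//       )
//       .unwrap();
//   }
//
//   // main.rs
//   include!(concat!(env!("OUT_DIR"), "/config.rs"));

// Circle-style embedding runs the conversion during compilation instead.
// The closest stable Rust gets is embedding the raw bytes of the file:
const RAW_CONFIG: &str = include_str!("config.json"); // assumes the file exists

fn main() {
    println!("embedded {} bytes of config", RAW_CONFIG.len());
}
```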

Reading the blog post This year in LLVM (2022), I came across LLVM’s support for memory effect modelling and [RFC] Attributes for Allocator Functions in LLVM IR. The allocator support sounds, at least roughly, like it could enable some of the optimisations that Wizer does.
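As a taste of what modelled allocator semantics already enable (my example; whether it actually folds depends on the compiler version): LLVM may remove a heap allocation whose result never escapes, and then fold the surrounding computation:

```rust
// Because rustc annotates its allocator functions, LLVM is allowed to
// treat this Vec as a removable allocation: it never escapes, so the
// whole function can (depending on compiler version) fold to a constant.
pub fn sum_of_squares() -> u64 {
    let v: Vec<u64> = (1..=10).map(|i| i * i).collect();
    v.iter().sum()
}

fn main() {
    assert_eq!(sum_of_squares(), 385);
}
```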