BOLT Open Projects

Below is the list of BOLT project ideas with a brief description of each. Once a project is picked for active development, expect to start a new topic and file an RFC when suitable. Comment on this thread to add new ideas to the list.

  • CFG Disassembler.
    BOLT symbolizes disassembly output and reconstructs control flow for detected functions, including that for indirect branches corresponding to jump tables. Such functionality by itself is useful for analyzing binary code. BOLT outputs the control flow graph under the “-print-cfg” option, but a dedicated command-line tool will provide a better user experience. Alternatively, we can integrate the CFG output into llvm-objdump.

  • CFG Visualization.
    Expand on the CFG disassembler. Use a GUI to display the graph.

  • Memory Instrumentation for Sanitizers.
    Sanitizers (asan, memsan, etc.) primarily rely on the compiler for instrumentation, limiting their visibility into assembly and pre-compiled third-party code. Loads and stores missed during instrumentation can lead to false positives and false negatives in the tool output. BOLT can add missing instrumentation and provide a better experience running sanitizers.

  • MCPlus Serialization.
    MCPlus, the internal representation used by BOLT, is built on top of the MC/MCInst layer. Adding a text serialization form for MCPlus can provide several benefits. First, the compiler can emit MCPlus directly, eliminating the need to disassemble and reconstruct the CFG in BOLT. Second, BOLT can save and reload the IR, opening an opportunity to edit pre-compiled binaries using an assembly-like language.

  • Static Data Layout Optimization.
    Similar to how BOLT modifies code layout based on profile data, it can optimize the layout of static data. Read-only data will be easier to reorder without requiring extra information from the linker if the original data is preserved. With enough info from the compiler/linker, the reordering can be extended to all static data.

  • Raising IR.
    Raising IR to MachineInstruction or even LLVM IR level can provide further opportunities for application optimization. It’s a nontrivial task and may not always be done in a performance-efficient way. However, having a subset of the functions raised to a higher-level IR can still benefit performance. Look into related projects such as McSema and llvm-mctoll.

  • Profile-driven Register Reallocation.
    This one is related to the raising IR project but can also be approached independently. The higher-quality profile available to BOLT may open opportunities for better register allocation.

  • Optimizing Linux Kernel.
    This is a work in progress: LPC 2021 - Toolchains and Kernel MC - YouTube

  • Code Prefetching.
    Software code prefetching is described in Chapter 5 of “AsmDB: Understanding and Mitigating Front-End Stalls in Warehouse-Scale Computers” (Google Research). It relies on the presence of a “code prefetch” instruction. X86 lacks such an instruction, but it’s possible to experiment with L2 data prefetch.

  • Reduce Binary Overhead.
    BOLT creates a new segment where it places optimized code (unless run with the “-use-old-text” option), which results in a binary size increase. While this size bloat does not cause any performance regressions, it may become an undesirable effect of binary rewriting. BOLT can “compress” unoptimized code by removing the gaps left behind when optimized functions are moved away, and expand the existing code segment.

  • Support Stripped Binaries with Split Functions.
    BOLT will need to correctly process stripped binaries with split functions to optimize pre-built binaries from a typical Linux distro.
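
To make the CFG Disassembler idea above a bit more concrete, here is a minimal sketch of the classic "leader" approach: split a linear instruction list into basic blocks and connect them with successor edges. This is a toy in Python, not BOLT code. The `Insn` tuple, its `kind` values, and `build_cfg` are hypothetical names invented for illustration, and the sketch only handles direct branches (no jump tables or other indirect branches, which is where the real work is).

```python
from collections import namedtuple

# Hypothetical toy instruction: address, kind ("jmp", "jcc", "ret", or "other"),
# and a direct branch target address (None when there is no direct target).
Insn = namedtuple("Insn", "addr kind target")

def build_cfg(insns):
    """Return {block_start: (instruction_addrs, sorted_successor_starts)}."""
    if not insns:
        return {}
    addrs = {i.addr for i in insns}
    # Leaders: the first instruction, every in-range direct branch target,
    # and every instruction that follows a branch or a return.
    leaders = {insns[0].addr}
    for idx, insn in enumerate(insns):
        if insn.kind in ("jmp", "jcc", "ret"):
            if insn.target in addrs:
                leaders.add(insn.target)
            if idx + 1 < len(insns):
                leaders.add(insns[idx + 1].addr)
    # Cut the instruction stream at each leader.
    blocks, current = [], []
    for insn in insns:
        if insn.addr in leaders and current:
            blocks.append(current)
            current = []
        current.append(insn)
    blocks.append(current)
    # Edges: taken target for jmp/jcc, fall-through for jcc and plain code.
    cfg = {}
    for n, block in enumerate(blocks):
        last = block[-1]
        succs = set()
        if last.kind in ("jmp", "jcc") and last.target in addrs:
            succs.add(last.target)
        if last.kind in ("jcc", "other") and n + 1 < len(blocks):
            succs.add(blocks[n + 1][0].addr)
        cfg[block[0].addr] = ([i.addr for i in block], sorted(succs))
    return cfg

# A made-up six-instruction function with one conditional branch.
example = [
    Insn(0x00, "other", None),
    Insn(0x04, "jcc", 0x10),
    Insn(0x08, "other", None),
    Insn(0x0C, "jmp", 0x14),
    Insn(0x10, "other", None),
    Insn(0x14, "ret", None),
]
cfg = build_cfg(example)
for start, (body, succs) in cfg.items():
    print(hex(start), [hex(a) for a in body], "->", [hex(s) for s in succs])
```

A dedicated tool would of course work from BOLT's own disassembly and symbolization rather than a hand-written list like `example`.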

Hyped to see Linux kernel mentioned there!

This one reminds me a bit of Dagger, which as far as I understand uses the tablegen patterns to automate the raising:

I’m a fan of this work!

I have a POC change (~100 lines) that I can share with anyone who wants to pick up this project.

Cool! I am very interested in adding missing instrumentation for sanitizers with BOLT.
Can you share the POC code? I want to try it. :wink:

A tiny prototype is here: D129225 [BOLT][HACK] Add memory unpoisoning

I will set up “office hours” for BOLT. Feel free to join the discussion.

Are there any plans to support the instrumentation method on AArch64? I am happy to help with it.

I believe @yota9 or @yavtuk might have more info. Indeed, that’s an interesting project.

We are one step away from upstreaming the patch. We need to complete a few internal procedures and hope to finish soon.

Reduce Binary Overhead.

I am developing a patch that rewrites the entire file: it reassigns an address to every section from scratch, rewrites .text and .eh_frame, and does not create additional segments. It already works fine on x86 (tested on redis, clang, and lld), but since I threw away the old mapping code, I now have to port back the regular logic of not touching the original allocatable sections, as well as non-relocation mode. Once it works in both rewriting and regular mode and we receive internal approval, we’ll open an RFC. I also think -use-old-text and probably -use-gnu-stack should be deprecated in this patch, because they are workarounds for the limited ability to change the binary layout; when we rewrite all sections, we can put program headers and sections anywhere we want.
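
For readers unfamiliar with what "reassigning an address to every section from scratch" involves at its simplest, here is a rough sketch in Python. It is not taken from the patch, and the section names, sizes, alignments, base address, and the `assign_addresses` helper are all made up for illustration. It only walks the sections in order and rounds each start address up to the section's alignment. A real rewriter also has to deal with segments, page alignment, permissions, file offsets, and fixing up every reference to the moved sections.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (alignment >= 1)."""
    return (value + alignment - 1) // alignment * alignment

def assign_addresses(sections, base):
    """Lay out (name, size, alignment) tuples back to back, starting at base.

    Returns ({name: address}, end_address). Purely illustrative.
    """
    layout = {}
    cursor = base
    for name, size, alignment in sections:
        cursor = align_up(cursor, alignment)  # respect the section alignment
        layout[name] = cursor
        cursor += size                        # next section starts after this one
    return layout, cursor

# Hypothetical sections: (name, size in bytes, required alignment).
sections = [
    (".text", 0x1234, 16),
    (".eh_frame", 0x200, 8),
    (".rodata", 0x80, 32),
]
layout, end = assign_addresses(sections, base=0x400000)
for name, addr in layout.items():
    print(f"{name:10s} {addr:#x}")
print(f"end        {end:#x}")
```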

That’s fantastic! Looking forward to this RFC and the patch.