How to benchmark a new compiler?

I have been writing my own optimization algorithm, which targets some special cases that are not very common.

But I think I still need to benchmark my optimizer to make sure that other cases are not slowed down (e.g. due to the phase-ordering problem).

Will the llvm-test-suite project fit my needs?

How can I compare the two optimizers?

I have two folders: old-llvm-project, the compiler without my algorithm, and new-llvm-project, which contains my algorithm.

Is it correct to build the test-suite twice, once with each llvm-project, to get two test-suite builds and benchmark them?


You can use the scripts in llvm-test-suite/utils/ to compare the results.

I usually build the test suite using LNT.

By default, results will be output in a JSON file located in the test result directory.


You can run it on two of those JSON files to diff the results.

Some of the benchmarks can be somewhat noisy, so you’ll want to account for that when you measure (e.g. run multiple times and average or something.)
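To make the run-multiple-times-and-average idea concrete, here is a minimal shell sketch; run_benchmark is a hypothetical stand-in for whatever command you actually time:

```shell
#!/bin/sh
# Hypothetical sketch: average wall-clock time over several runs to smooth
# out noise. run_benchmark is a placeholder for the real benchmark command.
run_benchmark() {
    sleep 0.05
}

runs=5
avg_line=$(
    for i in $(seq "$runs"); do
        start=$(date +%s.%N)   # GNU date: seconds.nanoseconds
        run_benchmark
        end=$(date +%s.%N)
        echo "$start $end"
    done | awk -v runs="$runs" \
        '{ sum += $2 - $1 } END { printf "average over %d runs: %.3fs", runs, sum / runs }'
)
echo "$avg_line"
```

A median (sort the samples and take the middle one) is often even more robust against one-off outliers than the mean.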

So when I rebuild my own optimizer (e.g. after rewriting some code), I also need to rebuild the test-suite, right?

If so, that will take too much time:(

And to compare the two optimizers, I need to run:

/build/bin/llvm-lit -v -j 1 -o new.json test-suite-build-new/.
/build/bin/llvm-lit -v -j 1 -o original.json test-suite-build-old/.

to get two JSON files, and then run

test-suite/utils/ original.json new.json

to compare the two optimizers.

Am I right?

I will do some research on LNT. But I'm afraid it needs sudo to install, and I can't use sudo on my school's computer.

lnt can be installed in a virtual environment which does not require any use of sudo as far as I know.
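For reference, a sketch of what a sudo-free install might look like; it assumes the LNT sources have already been checked out to ~/lnt, and paths are placeholders to adjust:

```shell
# Hypothetical sketch: install LNT into a per-user virtual environment,
# so nothing touches system directories and no sudo is needed.
python3 -m venv ~/mysandbox           # venv lives entirely under $HOME
~/mysandbox/bin/pip install ~/lnt     # install LNT from a local checkout
~/mysandbox/bin/lnt --help            # sanity check
```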

The way I do it is I have a shell script which runs LNT, and then some functions that allow me to specify a specific test suite (e.g. the LLVM test suite, CTMark, etc).

It looks something like this:

run_lnt() {
    local build_dir=$1; shift
    local test_name=$1; shift
    local flags=$1; shift
    local extra_flags=$1
    local test_sandbox=${SANDBOX}/${test_name}

    # --cmake-cache can be used for targets, optimization flags, etc.
    # --build-threads 10: build with a lot of threads
    # --threads 1: only use one thread to execute the tests
    # TEST_SUITE_REMOTE_HOST: run the tests on a separate device
    # (comments cannot follow a line-continuation backslash, so they live here)
    $SANDBOX/bin/lnt runtest test-suite \
        --sandbox ${test_sandbox} \
        --cc ${build_dir}/bin/clang \
        --cxx ${build_dir}/bin/clang++ \
        --test-suite /path/to/testsuite \
        --cflags "${flags}" \
        --cxxflags "${flags}" \
        --cmake-cache /path/to/cmake/cache/if/you/want \
        --build-threads 10 \
        --threads 1 \
        --cmake-define TEST_SUITE_REMOTE_HOST=${DEVICE} \
        --benchmarking-only \
        ${extra_flags}
}

run_test_suite() {
    local compiler=$1       # Path to compiler to use
    local test_suffix=$2    # Name for directory in the sandbox, e.g. test-suite-O3
    local lnt_flags=$3      # Allow for changes in LNT settings
    local compiler_flags=$4 # C/C++ flags
    run_lnt ${compiler} test-suite-${test_suffix} "${compiler_flags}" "${lnt_flags}"
}

run_test_suite ${PATH_TO_BUILD} Os "" "-Os"
run_test_suite ${PATH_TO_BUILD} O3 "" "-O3"

This will put results in sandbox/test-suite-Os and sandbox/test-suite-O3 respectively. Results are put in a subdirectory with a timestamp.

The timestamp subdirectory will contain a JSON file named something like “output.json”. It will contain data like this for every benchmark:

    {
      "code": "PASS",
      "elapsed": 2.7198410034179688,
      "metrics": {
        "compile_time": 143.48569999999998,
        "exec_time": 2.0744,
        "hash": "df833c8a74e6d0db9420d6f475c0d933",
        "link_time": 0.1073,
        "size": 412304,
        "size.__bss": 608,
        "size.__common": 8,
        "size.__const": 3040,
        "size.__cstring": 17388,
        "size.__data": 2107,
        "size.__got": 488,
        "size.__stubs": 672,
        "size.__text": 282480,
        "size.__unwind_info": 2260
      },
      "name": "test-suite :: WhateverTheTestWas"
    }

The script can be used to compare any of the metrics in the file. E.g. exec_time, size.__text, etc.
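If you ever want a quick look without the helper script, the lit JSON can also be picked apart with standard tools. A crude, self-contained sketch: the inlined files below are hypothetical stand-ins for your real original.json and new.json, and it assumes both files list the tests in the same order.

```shell
#!/bin/sh
# Crude sketch: compare "exec_time" between two lit result files.
# The sample files created here are stand-ins for real results.
cat > original.json <<'EOF'
{ "tests": [
  { "name": "test-suite :: A", "metrics": { "exec_time": 2.0 } },
  { "name": "test-suite :: B", "metrics": { "exec_time": 1.0 } }
] }
EOF
cat > new.json <<'EOF'
{ "tests": [
  { "name": "test-suite :: A", "metrics": { "exec_time": 1.5 } },
  { "name": "test-suite :: B", "metrics": { "exec_time": 1.1 } }
] }
EOF

# Pull out exec_time values in file order, then print old, new, and ratio.
extract() { grep -o '"exec_time": [0-9.]*' "$1" | awk '{ print $2 }'; }
extract original.json > old.txt
extract new.json > new.txt
paste old.txt new.txt | awk '{ printf "old=%s new=%s ratio=%.2f\n", $1, $2, $2 / $1 }'
```

This is only for eyeballing results; the real comparison script handles test names, missing results, and statistics properly.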

Thanks for your reply! I studied this tool these days.

I’m using the UI provided by LNT to compare the two compilers:
lnt runserver /myperf

I compared the two compilers but only got two results under Performance Improvements – execution_time.

Does that mean the execution times of the other benchmarks (about 300 cases) are tied?