Testing utility for building and updating CFG

Hi folks,

I’m working on adding an API for incremental updates to DominatorTree and I noticed that there isn’t a simple way to quickly build and update CFG (just for testing) – one has to either build CFG programmatically, or write IR by hand and manually map pointers to basic blocks. The downside is that it tends to be pretty verbose and not easy to update (e.g. adding a new edge often involves changing the type of terminator instruction).

My idea is to create a simple format for building arbitrary CFG’s and new utilities for updating it.

It can look something like this:
b ModuleName FunctionName // Build module ‘ModuleName’ with a single

// function ‘FunctionName’.

a entry if // Add new basic blocks (entry and if) and

// connect them with an edge.

a if if.then // Add new basic block (if.then) and create

// an edge from if to if.then

a if if.else

a if.then foo

a if.else foo

a bar // Add a new block bar, but don’t create

// any edges.

e // Finish constructing initial CFG

i foo bar // Insert and edge from foo to bar.

d if if.then // Delete the edge from if to if.then.

i if bar // Insert an edge from if to bar.

Under the hood, the utility would build IR with just switch instructions.

Then you could assign it to a string and use in a unit test:

CFGBuilder B(MyString);

Function *F = B.getFunction();

while (!B.finished()) {

Update U = B.applyNextUpdate(); // B changes the CFG.

// U stores type of update (insertion/deletion) and a pair of BB’s

// (From and To).

doSomethingWithTheUpdate(U);

}

Other CFG passes (e.g. LoopInfo, NewGVN) could also use it for testing. It would be also possible to hook it up to a fuzzer at some point in the future.

What do you think about having such a utility in llvm?

~Kuba

Hi folks,

I’m working on adding an API for incremental updates to DominatorTree and
I
noticed that there isn’t a simple way to quickly build and update CFG
(just
for testing) -- one has to either build CFG programmatically, or write IR
by hand and manually map pointers to basic blocks. The downside is that
it
tends to be pretty verbose and not easy to update (e.g. adding a new edge
often involves changing the type of terminator instruction).

Hi Jakub,

I really like the idea of automating the testing and to even prepare for
extensive fussing.

My idea is to create a simple format for building arbitrary CFG’s and new
utilities for updating it.

It can look something like this:
b ModuleName FunctionName // Build module ‘ModuleName’ with a single

                         // function ‘FunctionName’.

a entry if // Add new basic blocks (entry and if) and

                         // connect them with an edge.

a if if.then // Add new basic block (if.then) and create

                         // an edge from if to if.then

a if if.else

a if.then foo

a if.else foo

a bar // Add a new block bar, but don’t create

                         // any edges.

e // Finish constructing initial CFG

i foo bar // Insert and edge from foo to bar.

d if if.then // Delete the edge from if to if.then.

i if bar // Insert an edge from if to bar.

Under the hood, the utility would build IR with just switch instructions.

Then you could assign it to a string and use in a unit test:

CFGBuilder B(MyString);

Function *F = B.getFunction();

while (!B.finished()) {

   Update U = B.applyNextUpdate(); // B changes the CFG.

// U stores type of update (insertion/deletion) and a pair of BB’s

// (From and To).

   doSomethingWithTheUpdate(U);

}

Other CFG passes (e.g. LoopInfo, NewGVN) could also use it for testing.
It
would be also possible to hook it up to a fuzzer at some point in the
future.

What do you think about having such a utility in llvm?

I like the idea and if it helps to have extensive unit tests in LLVM,
that's great. I am slightly worried about a new DSL (which always has a
maintenance cost). However, this seems well contained and simple, so I
would expect it to remain unchanged and up-to-date for a while.

Some thoughts (which I mentioned earlier).

# lit compatibility and maybe even opt compatability

I believe that LLVM lit tests are the easiest tests to write and debug.
If there is a chance we can write such unit tests as lit tests,
I would really appreciate this. It might even make sense to allow these
update-sets to be read through an opt pass, such that
dominators, loop passes, ... can be tested as part of the pass pipeline.

Off the cuff, I’d probably err on the side of improving/simplifying the API for building/querying the CFG if it’s possible to go far enough along that route to address the issues (fuzzing could still be done there - by taking a byte array from libFuzzer and using the bytes to drive API calls to build a CFG, for example (is this byte less than → make an edge, etc)). But I realize there might be good reasons it’s not practical - it’d be worth understanding/documenting those reasons before starting on a new tool/format/etc, I think.

Tobias, David,

Thanks for sharing your opinions and concerns.

After some prototyping and an offline discussion with David, I think that it would be better to take a few steps back of the initial proposal. At this point it is not clear if and how such a tool would be useful for testing things other than dominators. It would certainly be a burden to commit and later maintain even a very small and simplistic DSL, especially if it was to be exposed by other tools (e.g. opt).

I believe that a better way forward would be to create a small and simple library-only solutions used only by dominator tests – at least for now. I created a prototype which you can find here: https://reviews.llvm.org/D34798.

With this design it shouldn’t be expensive to extend it later, when it gains actual users (like the mentioned LoopInfo and NewGVN), or to even create a think wrapper for handling a custom DSL.

Please let me know what you think.

Thanks,

~Kuba

Hi,

I'm fine with the library approach you're ultimately decided to go
with, but I was looking forward to having an LLVM tool that I could
use to easily translate terse graph specs into IR (which I could then
modify by hand to create test cases etc.).

-- Sanjoy